coco_pipe.dim_reduction.core
============================

.. py:module:: coco_pipe.dim_reduction.core

.. autoapi-nested-parse::

   Dimensionality Reduction Core
   =============================

   Execution manager for one dimensionality reduction method.

   `DimReduction` is intentionally narrow. It owns reducer instantiation, input-shape validation for execution, fit/transform operations, and cached evaluation/interpretation state for one reducer instance. Plotting, trajectory reshaping, reporting, and multi-method comparison live in dedicated modules.

   Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)


Classes
-------

.. autoapisummary::

   coco_pipe.dim_reduction.core.DimReduction


Module Contents
---------------

.. py:class:: DimReduction(method: Union[str, coco_pipe.dim_reduction.config.BaseReducerConfig], n_components: int = 2, params: Optional[Dict[str, Any]] = None, **kwargs)

   Manage one dimensionality reduction workflow.

   :param method: Canonical public reducer name or a typed configuration object. Method names are exact and must match the registry, for example ``"PCA"``, ``"Isomap"``, ``"Pacmap"``, or ``"TopologicalAE"``.
   :type method: str or BaseReducerConfig
   :param n_components: Target dimensionality when ``method`` is a string.
   :type n_components: int, default=2
   :param params: Additional reducer keyword arguments merged into the constructor arguments when ``method`` is a string.
   :type params: dict, optional
   :param \*\*kwargs: Runtime reducer keyword overrides. These are merged after ``params``.
   :type \*\*kwargs: dict

   .. attribute:: method

      Canonical reducer name.

      :type: str

   .. attribute:: n_components

      Target dimensionality used for the reducer instance.

      :type: int

   .. attribute:: reducer

      Instantiated reducer backend.

      :type: BaseReducer

   .. attribute:: metrics_

      Cached scalar evaluation summaries from the latest ``score()`` call.

      :type: dict

   .. attribute:: quality_metadata_

      Cached scalar reducer metadata exposed through the reducer contract.

      :type: dict

   .. attribute:: diagnostics_

      Cached non-scalar diagnostic artifacts exposed through the reducer contract or the evaluation layer.

      :type: dict

   .. attribute:: metric_records_

      Cached tidy metric observations produced by the evaluator.

      :type: list of dict

   .. attribute:: interpretation_

      Cached feature interpretation payloads from the latest ``interpret()`` call.

      :type: dict

   .. attribute:: interpretation_records_

      Cached tidy feature-interpretation observations.

      :type: list of dict

   .. seealso::

      :obj:`coco_pipe.dim_reduction.analysis.interpret_features`
          Pure interpretation backend used by ``interpret()``.

      :obj:`coco_pipe.dim_reduction.evaluation.core.evaluate_embedding`
          Pure evaluator used by ``score()``.

      :obj:`coco_pipe.dim_reduction.evaluation.core.MethodSelector`
          Post-hoc comparison and ranking over already-scored reducers.

      :obj:`coco_pipe.viz.dim_reduction`
          Plotting utilities for embeddings, metrics, and diagnostics.

   .. rubric:: Examples

   >>> reducer = DimReduction("UMAP", n_components=2, n_neighbors=15)
   >>> embedding = reducer.fit_transform(X)
   >>> scores = reducer.score(embedding, X=X)
   >>> "trustworthiness" in scores["metrics"]
   True
   >>> interpretation = reducer.interpret(
   ...     X,
   ...     X_emb=embedding,
   ...     analyses=["correlation"],
   ...     feature_names=feature_names,
   ... )
   >>> "correlation" in interpretation["analysis"]
   True

   .. py:attribute:: reducer_kwargs

   .. py:attribute:: reducer
      :type: coco_pipe.dim_reduction.reducers.base.BaseReducer

   .. py:attribute:: metrics_
      :type: Dict[str, Any]

   .. py:attribute:: quality_metadata_
      :type: Dict[str, Any]

   .. py:attribute:: diagnostics_
      :type: Dict[str, Any]

   .. py:attribute:: metric_records_
      :type: List[Dict[str, Any]]
      :value: []

   .. py:attribute:: interpretation_
      :type: Dict[str, Any]

   .. py:attribute:: interpretation_records_
      :type: List[Dict[str, Any]]
      :value: []

   .. py:property:: random_state
      :type: Optional[int]

      Return the random seed from parameters if any.

   .. py:property:: capabilities
      :type: Dict[str, Any]

      Return reducer capability metadata through the manager interface.

   .. py:method:: _reset_cached_outputs() -> None

      Clear cached evaluation outputs.

   .. py:method:: _validate_input(X: Any) -> numpy.ndarray

      Validate reducer input shape and coerce to a NumPy array.

      :param X: Input data accepted by the reducer. Objects exposing ``get_data()`` are unwrapped before validation.
      :type X: array-like or MNE object

      :returns: **X** -- Validated reducer input.
      :rtype: np.ndarray

      :raises ValueError: If the input dimensionality does not match the reducer contract.

   .. py:method:: fit(X: Any, y: Optional[Any] = None) -> DimReduction

      Fit the reducer on the provided data.

      :param X: Input data in the reducer's native layout.
      :type X: array-like or MNE object
      :param y: Optional supervision forwarded to the reducer.
      :type y: array-like, optional

      :returns: **self** -- The fitted reducer.
      :rtype: DimReduction

   .. py:method:: transform(X: Any) -> numpy.ndarray

      Transform new data with a fitted reducer.

      :param X: Input data in the reducer's native layout.
      :type X: array-like or MNE object

      :returns: **X_emb** -- Reduced representation returned by the reducer.
      :rtype: np.ndarray

   .. py:method:: fit_transform(X: Any, y: Optional[Any] = None) -> numpy.ndarray

      Fit the reducer and return the reduced representation.

      :param X: Input data in the reducer's native layout.
      :type X: array-like or MNE object
      :param y: Optional supervision forwarded to the reducer.
      :type y: array-like, optional

      :returns: **X_emb** -- Reduced representation returned by the reducer.
      :rtype: np.ndarray

   .. py:method:: get_components() -> numpy.ndarray

      Return reducer-defined component-like outputs.

      :returns: **components** -- Component-like array exposed by the reducer.
      :rtype: np.ndarray

      :raises ValueError: If the reducer does not expose public components.

   .. py:method:: score(X_emb: numpy.ndarray, X: Any = None, n_neighbors: int = 5, metrics: Optional[List[str]] = None, k_values: Optional[List[int]] = None, labels: Optional[numpy.ndarray] = None, groups: Optional[numpy.ndarray] = None, times: Optional[numpy.ndarray] = None, separation_method: str = 'centroid') -> Dict[str, Dict[str, Any]]

      Evaluate an explicit embedding against the original data.

      :param X_emb: Embedded data to evaluate.
      :type X_emb: array-like
      :param X: Original high-dimensional data in evaluation-ready layout. This is required for standard 2D metrics and optional for native 3D trajectory metrics.
      :type X: array-like, optional
      :param n_neighbors: K-nearest neighbors size for metric computation.
      :type n_neighbors: int, default=5
      :param metrics: Metric selectors to compute. ``None`` evaluates all metric families available for the embedding shape.
      :type metrics: list of str, optional
      :param k_values: Neighborhood sizes used for multi-scale standard metric evaluation.
      :type k_values: list of int, optional
      :param labels: Optional labels aligned with the embedding. Used for trajectory separation when ``X_emb`` is 3D and for explicit supervised 2D metrics when requested.
      :type labels: np.ndarray, optional
      :param groups: Optional grouping variable aligned with the embedding. Required by grouped supervised evaluation metrics such as ``separation_logreg_balanced_accuracy``.
      :type groups: np.ndarray, optional
      :param times: Optional trajectory time coordinates aligned with the trajectory length axis.
      :type times: np.ndarray, optional
      :param separation_method: Separation definition passed to trajectory evaluation when labels are available for native 3D trajectory embeddings.
      :type separation_method: str, default="centroid"

      :returns: **scores** -- Dictionary with keys ``"metrics"``, ``"metadata"``, and ``"diagnostics"``.
      :rtype: dict

      .. rubric:: Notes

      ``score()`` does not infer or cache embeddings. Callers must pass ``X_emb`` explicitly. ``X`` is only required when the requested evaluation path needs the original high-dimensional samples.

   .. py:method:: interpret(X: numpy.ndarray, *, X_emb: numpy.ndarray, analyses: Optional[List[str]] = None, feature_names: Optional[List[str]] = None, n_repeats: int = 5, random_state: Optional[int] = None) -> Dict[str, Any]

      Run feature interpretation analyses for an explicit embedding.

      :param X: Original input data.
      :type X: np.ndarray
      :param X_emb: Explicit embedding aligned with ``X``.
      :type X_emb: np.ndarray
      :param analyses: Interpretation analyses to compute. ``None`` defaults to ``["correlation"]``.
      :type analyses: list of {"correlation", "perturbation", "gradient"}, optional
      :param feature_names: Feature names aligned with the columns of ``X`` when the requested interpretation returns feature-keyed outputs.
      :type feature_names: list of str, optional
      :param n_repeats: Number of shuffles per feature for perturbation importance.
      :type n_repeats: int, default=5
      :param random_state: Random seed for perturbation importance.
      :type random_state: int, optional

      :returns: Dictionary with keys ``"analysis"`` and ``"records"``.
      :rtype: dict

      .. rubric:: Notes

      ``interpret()`` does not fit the reducer or compute embeddings. Callers must pass both ``X`` and ``X_emb`` explicitly.

      .. seealso::

         :obj:`coco_pipe.dim_reduction.analysis.interpret_features`
             Pure interpretation backend used by this manager method.

         :obj:`score`
             Evaluate structure-preservation metrics for an explicit embedding.

      .. rubric:: Examples

      >>> reducer = DimReduction("PCA", n_components=2)
      >>> embedding = reducer.fit_transform(X)
      >>> result = reducer.interpret(
      ...     X,
      ...     X_emb=embedding,
      ...     analyses=["correlation"],
      ...     feature_names=feature_names,
      ... )
      >>> sorted(result)
      ['analysis', 'records']

   .. py:method:: get_diagnostics() -> Dict[str, Any]

      Return cached diagnostics merged with reducer diagnostics.

      :returns: **diagnostics** -- Diagnostic artifacts declared by the reducer contract and the evaluation layer.
      :rtype: dict

   .. py:method:: get_quality_metadata() -> Dict[str, Any]

      Return cached scalar metadata merged with reducer metadata.

      :returns: **metadata** -- Scalar metadata declared by the reducer contract and the evaluation layer.
      :rtype: dict

   .. py:method:: get_metrics() -> Dict[str, Any]

      Return cached scalar metrics from the latest ``score()`` call.

   .. py:method:: get_summary() -> Dict[str, Any]

      Return a normalized summary payload for report and export paths.

      :returns: Plain dictionary containing method identity, cached scalar summaries, reducer metadata, diagnostics, tidy metric records, and capability flags, plus cached feature interpretation payloads.
      :rtype: dict

      .. rubric:: Notes

      The summary does not include an embedding payload. Embeddings are handled explicitly outside the manager and must be passed directly to plotting or reporting utilities that need them.

   .. py:method:: save(path: Union[str, pathlib.Path])

      Save the underlying reducer to disk.

      :param path: Output path for reducer persistence.
      :type path: str or Path

      .. rubric:: Notes

      Only the reducer model is persisted. Cached manager state such as metrics and diagnostics is not included.

   .. py:method:: load(path: Union[str, pathlib.Path], method: str) -> DimReduction
      :classmethod:

      Load a persisted reducer and wrap it in a fresh manager.

      :param path: Path to a serialized reducer saved with ``save()``.
      :type path: str or Path
      :param method: Canonical public reducer name used to reconstruct the manager.
      :type method: str

      :returns: Fresh manager wrapping the loaded reducer model.
      :rtype: DimReduction

      .. rubric:: Notes

      This restores the reducer model only. Cached manager state such as scores, diagnostics, and metric records is not persisted.
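
The ``save()`` and ``load()`` notes above describe a round trip in which only the reducer model survives and the manager's caches start empty. The sketch below illustrates that contract with a toy stand-in. ``ToyManager``, its ``reducer_state`` dictionary, and the JSON file format are hypothetical and chosen only to keep the example self-contained; they are not the ``coco_pipe`` implementation, whose serialization format is not specified here.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


class ToyManager:
    """Hypothetical manager mirroring the documented save/load contract."""

    def __init__(self, method: str, reducer_state: Dict[str, Any]) -> None:
        self.method = method
        self.reducer_state = reducer_state  # stand-in for the reducer model
        self.metrics_: Dict[str, Any] = {}  # cached state, never persisted

    def save(self, path: Union[str, Path]) -> None:
        # Write only the reducer stand-in; the metrics_ cache is left out.
        Path(path).write_text(json.dumps(self.reducer_state))

    @classmethod
    def load(cls, path: Union[str, Path], method: str) -> "ToyManager":
        # Wrap the restored reducer stand-in in a fresh manager with empty caches.
        return cls(method, json.loads(Path(path).read_text()))


manager = ToyManager("PCA", {"n_components": 2, "fitted": True})
manager.metrics_["trustworthiness"] = 0.9  # made-up value for illustration

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "reducer.json"
    manager.save(target)
    restored = ToyManager.load(target, method="PCA")

print(restored.reducer_state)  # the reducer stand-in survives the round trip
print(restored.metrics_)       # the cache is empty on the fresh manager
```

Under this reading of the contract, a caller who needs metrics or diagnostics after ``load()`` would call ``score()`` again on an explicit embedding, since nothing from the earlier session's caches is restored.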