coco_pipe.dim_reduction.core

Execution manager for one dimensionality reduction method.

DimReduction is intentionally narrow. It owns reducer instantiation, input-shape validation for execution, fit/transform operations, and cached evaluation/interpretation state for one reducer instance. Plotting, trajectory reshaping, reporting, and multi-method comparison live in dedicated modules.

Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)

Classes

DimReduction

Manage one dimensionality reduction workflow.

Module Contents

class coco_pipe.dim_reduction.core.DimReduction(method: str | coco_pipe.dim_reduction.config.BaseReducerConfig, n_components: int = 2, params: Dict[str, Any] | None = None, **kwargs)

Manage one dimensionality reduction workflow.

Parameters:
  • method (str or BaseReducerConfig) – Canonical public reducer name or a typed configuration object. Method names are exact and must match the registry, for example "PCA", "Isomap", "Pacmap", or "TopologicalAE".

  • n_components (int, default=2) – Target dimensionality when method is a string.

  • params (dict, optional) – Additional reducer keyword arguments merged into the constructor arguments when method is a string.

  • **kwargs (dict) – Runtime reducer keyword overrides. These are merged after params.
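The merge order described in the parameter list (params first, then **kwargs) can be pictured with plain dictionaries. This is a minimal sketch, not the library's code: merge_reducer_kwargs is a hypothetical helper, and placing n_components in the same dictionary as the other arguments is an assumption.

```python
from typing import Any, Dict, Optional


def merge_reducer_kwargs(
    n_components: int = 2,
    params: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: later sources win on key conflicts."""
    merged: Dict[str, Any] = {"n_components": n_components}
    merged.update(params or {})  # `params` is applied first
    merged.update(kwargs)        # runtime overrides are applied last
    return merged


merged = merge_reducer_kwargs(
    n_components=2,
    params={"n_neighbors": 30, "min_dist": 0.1},
    n_neighbors=15,  # overrides the value supplied through `params`
)
print(merged)
```

Under this assumed ordering, a keyword passed directly at construction time takes precedence over the same key in params.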

method

Canonical reducer name.

Type:

str

n_components

Target dimensionality used for the reducer instance.

Type:

int

reducer

Instantiated reducer backend.

Type:

BaseReducer

metrics_

Cached scalar evaluation summaries from the latest score() call.

Type:

dict

quality_metadata_

Cached scalar reducer metadata exposed through the reducer contract.

Type:

dict

diagnostics_

Cached non-scalar diagnostic artifacts exposed through the reducer contract or the evaluation layer.

Type:

dict

metric_records_

Cached tidy metric observations produced by the evaluator.

Type:

list of dict

interpretation_

Cached feature interpretation payloads from the latest interpret() call.

Type:

dict

interpretation_records_

Cached tidy feature-interpretation observations.

Type:

list of dict

See also

coco_pipe.dim_reduction.analysis.interpret_features

Pure interpretation backend used by interpret().

coco_pipe.dim_reduction.evaluation.core.evaluate_embedding

Pure evaluator used by score().

coco_pipe.dim_reduction.evaluation.core.MethodSelector

Post-hoc comparison and ranking over already-scored reducers.

coco_pipe.viz.dim_reduction

Plotting utilities for embeddings, metrics, and diagnostics.

Examples

>>> reducer = DimReduction("UMAP", n_components=2, n_neighbors=15)
>>> embedding = reducer.fit_transform(X)
>>> scores = reducer.score(embedding, X=X)
>>> "trustworthiness" in scores["metrics"]
True
>>> interpretation = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> "correlation" in interpretation["analysis"]
True

reducer_kwargs
reducer: coco_pipe.dim_reduction.reducers.base.BaseReducer
metrics_: Dict[str, Any]
quality_metadata_: Dict[str, Any]
diagnostics_: Dict[str, Any]
metric_records_: List[Dict[str, Any]] = []
interpretation_: Dict[str, Any]
interpretation_records_: List[Dict[str, Any]] = []
property random_state: int | None

Return the random seed from the reducer parameters, or None if no seed was set.

property capabilities: Dict[str, Any]

Return reducer capability metadata through the manager interface.

_reset_cached_outputs() -> None

Clear cached evaluation outputs.

_validate_input(X: Any) -> numpy.ndarray

Validate reducer input shape and coerce to a NumPy array.

Parameters:

X (array-like or MNE object) – Input data accepted by the reducer. Objects exposing get_data() are unwrapped before validation.

Returns:

X – Validated reducer input.

Return type:

np.ndarray

Raises:

ValueError – If the input dimensionality does not match the reducer contract.
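The unwrap-then-validate behaviour described for _validate_input can be sketched without any dependencies. This is an illustration under stated assumptions, not the library's implementation: validate_input, nested_ndim, FakeContainer, and the expected_ndim parameter are all hypothetical, nested lists stand in for arrays, and the real method additionally coerces the data to a NumPy array.

```python
from typing import Any


def nested_ndim(data: Any) -> int:
    """Nesting depth of a regular nested list/tuple (stand-in for ndarray.ndim)."""
    depth = 0
    while isinstance(data, (list, tuple)):
        depth += 1
        if not data:
            break
        data = data[0]
    return depth


def validate_input(X: Any, expected_ndim: int = 2) -> Any:
    """Hypothetical validator: unwrap get_data(), then check dimensionality."""
    if hasattr(X, "get_data"):
        X = X.get_data()  # unwrap MNE-style containers first
    ndim = nested_ndim(X)
    if ndim != expected_ndim:
        raise ValueError(f"expected {expected_ndim}D input, got {ndim}D")
    return X


class FakeContainer:
    """Stand-in for any object exposing get_data()."""

    def __init__(self, data: Any) -> None:
        self._data = data

    def get_data(self) -> Any:
        return self._data


validated = validate_input(FakeContainer([[1.0, 2.0], [3.0, 4.0]]))

try:
    validate_input([1.0, 2.0, 3.0])  # 1D input violates the assumed 2D contract
    rejected = False
except ValueError:
    rejected = True
```

The expected dimensionality is fixed at 2 here only for illustration; in the documented class it depends on the reducer contract.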

fit(X: Any, y: Any | None = None) -> DimReduction

Fit the reducer on the provided data.

Parameters:
  • X (array-like or MNE object) – Input data in the reducer’s native layout.

  • y (array-like, optional) – Optional supervision forwarded to the reducer.

Returns:

self – This manager instance, with its reducer fitted.

Return type:

DimReduction

transform(X: Any) -> numpy.ndarray

Transform new data with a fitted reducer.

Parameters:

X (array-like or MNE object) – Input data in the reducer’s native layout.

Returns:

X_emb – Reduced representation returned by the reducer.

Return type:

np.ndarray

fit_transform(X: Any, y: Any | None = None) -> numpy.ndarray

Fit the reducer and return the reduced representation.

Parameters:
  • X (array-like or MNE object) – Input data in the reducer’s native layout.

  • y (array-like, optional) – Optional supervision forwarded to the reducer.

Returns:

X_emb – Reduced representation returned by the reducer.

Return type:

np.ndarray

get_components() -> numpy.ndarray

Return reducer-defined component-like outputs.

Returns:

components – Component-like array exposed by the reducer.

Return type:

np.ndarray

Raises:

ValueError – If the reducer does not expose public components.

score(X_emb: numpy.ndarray, X: Any = None, n_neighbors: int = 5, metrics: List[str] | None = None, k_values: List[int] | None = None, labels: numpy.ndarray | None = None, groups: numpy.ndarray | None = None, times: numpy.ndarray | None = None, separation_method: str = 'centroid') -> Dict[str, Dict[str, Any]]

Evaluate an explicit embedding against the original data.

Parameters:
  • X_emb (array-like) – Embedded data to evaluate.

  • X (array-like, optional) – Original high-dimensional data in evaluation-ready layout. This is required for standard 2D metrics and optional for native 3D trajectory metrics.

  • n_neighbors (int, default=5) – Number of nearest neighbors (k) used by neighborhood-based metrics.

  • metrics (list of str, optional) – Metric selectors to compute. None evaluates all metric families available for the embedding shape.

  • k_values (list of int, optional) – Neighborhood sizes used for multi-scale standard metric evaluation.

  • labels (np.ndarray, optional) – Optional labels aligned with the embedding. Used for trajectory separation when X_emb is 3D and for explicit supervised 2D metrics when requested.

  • groups (np.ndarray, optional) – Optional grouping variable aligned with the embedding. Required by grouped supervised evaluation metrics such as separation_logreg_balanced_accuracy.

  • times (np.ndarray, optional) – Optional trajectory time coordinates aligned with the trajectory length axis.

  • separation_method (str, default="centroid") – Separation definition passed to trajectory evaluation when labels are available for native 3D trajectory embeddings.

Returns:

scores – Dictionary with keys "metrics", "metadata", and "diagnostics".

Return type:

dict

Notes

score() does not infer or cache embeddings. Callers must pass X_emb explicitly. X is only required when the requested evaluation path needs the original high-dimensional samples.
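To make the role of n_neighbors and of the explicit X / X_emb pair concrete, the sketch below computes a simple k-nearest-neighbor overlap between the original space and the embedding. It is an illustrative stand-in, not one of the metrics the evaluator actually reports (it is not trustworthiness, for instance); knn_indices and neighborhood_overlap are hypothetical names.

```python
import math
from typing import List, Sequence, Set


def knn_indices(points: Sequence[Sequence[float]], k: int) -> List[Set[int]]:
    """Return, for each point, the index set of its k nearest neighbors."""
    neighbors: List[Set[int]] = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighborhood_overlap(
    X: Sequence[Sequence[float]],
    X_emb: Sequence[Sequence[float]],
    n_neighbors: int = 5,
) -> float:
    """Mean fraction of each point's k-NN set shared between X and X_emb."""
    if len(X) != len(X_emb):
        raise ValueError("X and X_emb must have the same number of samples")
    high = knn_indices(X, n_neighbors)
    low = knn_indices(X_emb, n_neighbors)
    shared = sum(len(h & l) / n_neighbors for h, l in zip(high, low))
    return shared / len(X)


# Five points on a line in 3D, and two candidate 1D embeddings.
X = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
faithful = [(0,), (1,), (2,), (3,), (4,)]   # keeps the ordering
scrambled = [(0,), (4,), (1,), (3,), (2,)]  # breaks neighborhoods

faithful_score = neighborhood_overlap(X, faithful, n_neighbors=2)
scrambled_score = neighborhood_overlap(X, scrambled, n_neighbors=2)
```

Because the comparison needs neighborhoods in both spaces, a metric of this kind cannot be computed from X_emb alone, which is why X is required on the standard 2D evaluation path.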

interpret(X: numpy.ndarray, *, X_emb: numpy.ndarray, analyses: List[str] | None = None, feature_names: List[str] | None = None, n_repeats: int = 5, random_state: int | None = None) -> Dict[str, Any]

Run feature interpretation analyses for an explicit embedding.

Parameters:
  • X (np.ndarray) – Original input data.

  • X_emb (np.ndarray) – Explicit embedding aligned with X.

  • analyses (list of {"correlation", "perturbation", "gradient"}, optional) – Interpretation analyses to compute. None defaults to ["correlation"].

  • feature_names (list of str, optional) – Feature names aligned with the columns of X when the requested interpretation returns feature-keyed outputs.

  • n_repeats (int, default=5) – Number of shuffles per feature for perturbation importance.

  • random_state (int, optional) – Random seed for perturbation importance.

Returns:

Dictionary with keys "analysis" and "records".

Return type:

dict

Notes

interpret() does not fit the reducer or compute embeddings. Callers must pass both X and X_emb explicitly.
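As a rough picture of what a "correlation" analysis over an explicit (X, X_emb) pair might compute, the sketch below correlates each input feature with each embedding dimension. The exact statistic and output layout used by the library are not stated in this reference, so Pearson correlation and the returned dictionary shape are assumptions, and pearson and feature_correlations are hypothetical helpers.

```python
import math
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def feature_correlations(
    X: Sequence[Sequence[float]],
    X_emb: Sequence[Sequence[float]],
    feature_names: Sequence[str],
) -> Dict[str, List[float]]:
    """Map each feature name to its correlation with every embedding dimension."""
    if len(X) != len(X_emb):
        raise ValueError("X and X_emb must be aligned sample by sample")
    result: Dict[str, List[float]] = {}
    for f, name in enumerate(feature_names):
        column = [row[f] for row in X]
        result[name] = [
            pearson(column, [row[d] for row in X_emb])
            for d in range(len(X_emb[0]))
        ]
    return result


X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0]]
X_emb = [[2.0], [4.0], [6.0], [8.0]]  # embedding equals 2 * first feature
corr = feature_correlations(X, X_emb, ["f0", "f1"])
```

Nothing here fits a model or produces an embedding; both arrays are supplied by the caller, mirroring the contract described in the note.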

See also

coco_pipe.dim_reduction.analysis.interpret_features

Pure interpretation backend used by this manager method.

score

Evaluate structure-preservation metrics for an explicit embedding.

Examples

>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> result = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> sorted(result)
['analysis', 'records']

get_diagnostics() -> Dict[str, Any]

Return cached diagnostics merged with reducer diagnostics.

Returns:

diagnostics – Diagnostic artifacts declared by the reducer contract and the evaluation layer.

Return type:

dict

get_quality_metadata() -> Dict[str, Any]

Return cached scalar metadata merged with reducer metadata.

Returns:

metadata – Scalar metadata declared by the reducer contract and the evaluation layer.

Return type:

dict

get_metrics() -> Dict[str, Any]

Return cached scalar metrics from the latest score() call.

get_summary() -> Dict[str, Any]

Return a normalized summary payload for report and export paths.

Returns:

Plain dictionary containing the method identity, cached scalar summaries, reducer metadata, diagnostics, tidy metric records, capability flags, and cached feature-interpretation payloads.

Return type:

dict

Notes

The summary does not include an embedding payload. Embeddings are handled explicitly outside the manager and must be passed directly to plotting or reporting utilities that need them.

save(path: str | pathlib.Path)

Save the underlying reducer to disk.

Parameters:

path (str or Path) – Output path for reducer persistence.

Notes

Only the reducer model is persisted. Cached manager state such as metrics and diagnostics is not included.

classmethod load(path: str | pathlib.Path, method: str) -> DimReduction

Load a persisted reducer and wrap it in a fresh manager.

Parameters:
  • path (str or Path) – Path to a serialized reducer saved with save().

  • method (str) – Canonical public reducer name used to reconstruct the manager.

Returns:

Fresh manager wrapping the loaded reducer model.

Return type:

DimReduction

Notes

This restores the reducer model only. Cached manager state such as scores, diagnostics, and metric records is not persisted.
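The "model only" persistence contract can be illustrated with a small stand-in class. This is a sketch under assumptions: TinyManager is hypothetical, pickle is used only for illustration (the serialization format used by the library is not stated here), and the method argument that the real load() requires is omitted.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


class TinyManager:
    """Hypothetical stand-in for a manager that wraps a model and caches metrics."""

    def __init__(self, model: Any) -> None:
        self.model = model
        self.metrics_: Dict[str, Any] = {}

    def save(self, path: Path) -> None:
        # Only the wrapped model is written; the cache stays in memory.
        with open(path, "wb") as handle:
            pickle.dump(self.model, handle)

    @classmethod
    def load(cls, path: Path) -> "TinyManager":
        # A fresh manager is built around the restored model.
        with open(path, "rb") as handle:
            return cls(pickle.load(handle))


manager = TinyManager(model={"weights": [0.5, 0.25]})
manager.metrics_["trustworthiness"] = 0.9  # illustrative value, not a real result

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    manager.save(path)
    restored = TinyManager.load(path)
```

After a round trip the restored manager holds the same model but an empty cache, so any scores or diagnostics that are needed again have to be recomputed.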