coco_pipe.dim_reduction.core

Execution manager for one dimensionality reduction method.

DimReduction is intentionally narrow. It owns reducer instantiation, input-shape validation for execution, fit/transform operations, and cached evaluation/interpretation state for one reducer instance. Plotting, trajectory reshaping, reporting, and multi-method comparison live in dedicated modules.

Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)

Classes

DimReduction

Manage one dimensionality reduction workflow.

Module Contents

class coco_pipe.dim_reduction.core.DimReduction(method: str | coco_pipe.dim_reduction.config.BaseReducerConfig, n_components: int = 2, params: Dict[str, Any] | None = None, **kwargs)

Manage one dimensionality reduction workflow.

Parameters:

method (str or BaseReducerConfig) – Canonical public reducer name or a typed configuration object. A name must match its registry entry exactly, for example "PCA", "Isomap", "Pacmap", or "TopologicalAE".
n_components (int, default=2) – Target dimensionality when method is a string.
params (dict, optional) – Additional reducer keyword arguments merged into the constructor arguments when method is a string.
**kwargs (dict) – Runtime reducer keyword overrides. These are merged after params, so they take precedence when the same key appears in both.
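
The merge order of n_components, params, and **kwargs can be sketched with plain dictionaries. This is a minimal illustration, not the library's code: `build_reducer_kwargs` is a hypothetical helper, and it assumes that a later source simply replaces an earlier one on conflicting keys (including n_components, which the documentation does not state).

```python
from typing import Any, Dict, Optional


def build_reducer_kwargs(
    n_components: int = 2,
    params: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: merge constructor arguments in the documented order."""
    merged: Dict[str, Any] = {"n_components": n_components}
    merged.update(params or {})  # params are merged first
    merged.update(kwargs)        # runtime overrides are merged last and win
    return merged


merged = build_reducer_kwargs(
    n_components=2,
    params={"n_neighbors": 30, "min_dist": 0.1},
    n_neighbors=15,  # overrides the value supplied through params
)
print(merged)
```

Here n_neighbors ends up as 15 because the keyword override is applied after params.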

method

Canonical reducer name.

Type: str

n_components

Target dimensionality used for the reducer instance.

Type: int

reducer

Instantiated reducer backend.

Type: BaseReducer

metrics_

Cached scalar evaluation summaries from the latest score() call.

Type: dict

quality_metadata_

Cached scalar reducer metadata exposed through the reducer contract.

Type: dict

diagnostics_

Cached non-scalar diagnostic artifacts exposed through the reducer contract or the evaluation layer.

Type: dict

metric_records_

Cached tidy metric observations produced by the evaluator.

Type: list of dict

interpretation_

Cached feature interpretation payloads from the latest interpret() call.

Type: dict

interpretation_records_

Cached tidy feature-interpretation observations.

Type: list of dict

See also

coco_pipe.dim_reduction.analysis.interpret_features: Pure interpretation backend used by interpret().
coco_pipe.dim_reduction.evaluation.core.evaluate_embedding: Pure evaluator used by score().
coco_pipe.dim_reduction.evaluation.core.MethodSelector: Post-hoc comparison and ranking over already-scored reducers.
coco_pipe.viz.dim_reduction: Plotting utilities for embeddings, metrics, and diagnostics.

Examples

>>> reducer = DimReduction("UMAP", n_components=2, n_neighbors=15)
>>> embedding = reducer.fit_transform(X)
>>> scores = reducer.score(embedding, X=X)
>>> "trustworthiness" in scores["metrics"]
True
>>> interpretation = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> "correlation" in interpretation["analysis"]
True

reducer_kwargs

reducer: coco_pipe.dim_reduction.reducers.base.BaseReducer

metrics_: Dict[str, Any]

quality_metadata_: Dict[str, Any]

diagnostics_: Dict[str, Any]

metric_records_: List[Dict[str, Any]] = []

interpretation_: Dict[str, Any]

interpretation_records_: List[Dict[str, Any]] = []

property random_state: int | None – Return the random seed from the parameters, if any.

property capabilities: Dict[str, Any] – Return reducer capability metadata through the manager interface.

_reset_cached_outputs() → None – Clear cached evaluation outputs.

_validate_input(X: Any) → numpy.ndarray

Validate reducer input shape and coerce to a NumPy array.

Parameters: X (array-like or MNE object) – Input data accepted by the reducer. Objects exposing get_data() are unwrapped before validation.
Returns: X – Validated reducer input.
Return type: np.ndarray
Raises: ValueError – If the input dimensionality does not match the reducer contract.
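
The unwrap, coerce, and check sequence described above might look roughly like the following. This is a sketch under stated assumptions: `validate_input`, its `expected_ndim` argument, and the `FakeEpochs` stand-in are hypothetical names, and the real reducer contract may accept more than one layout.

```python
import numpy as np


class FakeEpochs:
    """Hypothetical stand-in for an MNE-like container exposing get_data()."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def get_data(self):
        return self._data


def validate_input(X, expected_ndim=2):
    """Hypothetical validator: unwrap, coerce to ndarray, check dimensionality."""
    if hasattr(X, "get_data"):
        X = X.get_data()  # unwrap container objects before validation
    X = np.asarray(X)
    if X.ndim != expected_ndim:
        raise ValueError(
            f"Expected {expected_ndim}D input, got {X.ndim}D with shape {X.shape}."
        )
    return X


flat = validate_input([[1.0, 2.0], [3.0, 4.0]])
epochs = validate_input(FakeEpochs(np.zeros((4, 3, 10))), expected_ndim=3)
print(flat.shape, epochs.shape)
```

Passing a 1D array to `validate_input` with the default `expected_ndim=2` raises ValueError, which mirrors the documented failure mode.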

fit(X: Any, y: Any | None = None) → DimReduction

Fit the reducer on the provided data.

Parameters:

X (array-like or MNE object) – Input data in the reducer’s native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.

Returns:

self – The manager instance wrapping the fitted reducer.

Return type:

DimReduction

transform(X: Any) → numpy.ndarray

Transform new data with a fitted reducer.

Parameters: X (array-like or MNE object) – Input data in the reducer’s native layout.
Returns: X_emb – Reduced representation returned by the reducer.
Return type: np.ndarray

fit_transform(X: Any, y: Any | None = None) → numpy.ndarray

Fit the reducer and return the reduced representation.

Parameters:

X (array-like or MNE object) – Input data in the reducer’s native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.

Returns:

X_emb – Reduced representation returned by the reducer.

Return type:

np.ndarray
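
The fit, transform, and fit_transform call pattern can be illustrated with a toy reducer. `ToyPCA` below is a hypothetical NumPy stand-in written for this sketch; it is not a coco_pipe reducer and only mirrors the calling convention (fit returns self, transform reuses the fitted state on new data).

```python
import numpy as np


class ToyPCA:
    """Hypothetical PCA stand-in mirroring the fit/transform call pattern."""

    def __init__(self, n_components=2):
        self.n_components = n_components

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Rows of vt are principal axes, ordered by decreasing singular value.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self  # returning self allows chained calls

    def transform(self, X):
        # Project new data with the mean and axes learned during fit().
        return (np.asarray(X, dtype=float) - self.mean_) @ self.components_.T

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 6))
X_new = rng.normal(size=(20, 6))

pca = ToyPCA(n_components=2)
emb_train = pca.fit_transform(X_train)  # fit and embed the training data
emb_new = pca.transform(X_new)          # embed unseen data with the fitted state
print(emb_train.shape, emb_new.shape)
```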

get_components() → numpy.ndarray

Return reducer-defined component-like outputs.

Returns: components – Component-like array exposed by the reducer.
Return type: np.ndarray
Raises: ValueError – If the reducer does not expose public components.

score(X_emb: numpy.ndarray, X: Any = None, n_neighbors: int = 5, metrics: List[str] | None = None, k_values: List[int] | None = None, labels: numpy.ndarray | None = None, groups: numpy.ndarray | None = None, times: numpy.ndarray | None = None, separation_method: str = 'centroid') → Dict[str, Dict[str, Any]]

Evaluate an explicit embedding against the original data.

Parameters:

X_emb (array-like) – Embedded data to evaluate.
X (array-like, optional) – Original high-dimensional data in evaluation-ready layout. This is required for standard 2D metrics and optional for native 3D trajectory metrics.
n_neighbors (int, default=5) – Number of nearest neighbors (k) used by neighborhood-based metrics.
metrics (list of str, optional) – Metric selectors to compute. None evaluates all metric families available for the embedding shape.
k_values (list of int, optional) – Neighborhood sizes used for multi-scale standard metric evaluation.
labels (np.ndarray, optional) – Optional labels aligned with the embedding. Used for trajectory separation when X_emb is 3D and for explicit supervised 2D metrics when requested.
groups (np.ndarray, optional) – Optional grouping variable aligned with the embedding. Required by grouped supervised evaluation metrics such as separation_logreg_balanced_accuracy.
times (np.ndarray, optional) – Optional trajectory time coordinates aligned with the trajectory length axis.
separation_method (str, default="centroid") – Separation definition passed to trajectory evaluation when labels are available for native 3D trajectory embeddings.

Returns:

scores – Dictionary with keys "metrics", "metadata", and "diagnostics".

Return type:

dict

Notes

score() does not infer or cache embeddings. Callers must pass X_emb explicitly. X is only required when the requested evaluation path needs the original high-dimensional samples.
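
To make concrete why a neighborhood metric needs both X_emb and X, here is a NumPy sketch of trustworthiness, one structure-preservation metric that appears in the class example. It follows the standard definition, T(k) = 1 - 2 / (n k (2n - 3k - 1)) times the summed rank excess of embedding neighbors that are not among the k nearest in the original space. The function is an illustration written for this page and is not claimed to be the evaluator's implementation.

```python
import numpy as np


def trustworthiness(X, X_emb, n_neighbors=5):
    """Sketch of the standard trustworthiness score (not coco_pipe's code)."""
    X = np.asarray(X, dtype=float)
    X_emb = np.asarray(X_emb, dtype=float)
    n = X.shape[0]
    if X_emb.shape[0] != n:
        raise ValueError("X and X_emb must have the same number of samples.")
    if not 0 < n_neighbors < n / 2:
        raise ValueError("n_neighbors must be positive and smaller than n / 2.")

    def pairwise(A):
        diff = A[:, None, :] - A[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
        return dist

    rows = np.arange(n)[:, None]
    # rank_high[i, j] is the 1-based rank of j among i's neighbors in X.
    order_high = np.argsort(pairwise(X), axis=1, kind="stable")
    rank_high = np.empty((n, n), dtype=int)
    rank_high[rows, order_high] = np.arange(1, n + 1)[None, :]
    # k nearest neighbors of each point in the embedding.
    nn_low = np.argsort(pairwise(X_emb), axis=1, kind="stable")[:, :n_neighbors]
    # Penalize embedding neighbors that rank beyond k in the original space.
    excess = np.clip(rank_high[rows, nn_low] - n_neighbors, 0, None).sum()
    scale = 2.0 / (n * n_neighbors * (2 * n - 3 * n_neighbors - 1))
    return 1.0 - scale * excess


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
perfect = trustworthiness(X, X.copy(), n_neighbors=5)   # identical neighborhoods
projected = trustworthiness(X, X[:, :2], n_neighbors=5)  # lossy toy embedding
multi_scale = {k: trustworthiness(X, X[:, :2], k) for k in (3, 5, 10)}
print(perfect, projected, multi_scale)
```

The final dictionary mimics the idea behind k_values: the same metric is evaluated at several neighborhood sizes.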

interpret(X: numpy.ndarray, *, X_emb: numpy.ndarray, analyses: List[str] | None = None, feature_names: List[str] | None = None, n_repeats: int = 5, random_state: int | None = None) → Dict[str, Any]

Run feature interpretation analyses for an explicit embedding.

Parameters:

X (np.ndarray) – Original input data.
X_emb (np.ndarray) – Explicit embedding aligned with X.
analyses (list of {"correlation", "perturbation", "gradient"}, optional) – Interpretation analyses to compute. None defaults to ["correlation"].
feature_names (list of str, optional) – Feature names aligned with the columns of X when the requested interpretation returns feature-keyed outputs.
n_repeats (int, default=5) – Number of shuffles per feature for perturbation importance.
random_state (int, optional) – Random seed for perturbation importance.

Returns:

Dictionary with keys "analysis" and "records".

Return type:

dict

Notes

interpret() does not fit the reducer or compute embeddings. Callers must pass both X and X_emb explicitly.

See also

coco_pipe.dim_reduction.analysis.interpret_features: Pure interpretation backend used by this manager method.
score: Evaluate structure-preservation metrics for an explicit embedding.

Examples

>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> result = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> sorted(result)
['analysis', 'records']
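
As a rough idea of what a "correlation" analysis between features and an explicit embedding can compute, the sketch below returns the Pearson correlation of every feature with every embedding dimension. `correlate_features` and its return layout are hypothetical; the actual payload produced by interpret_features may be structured differently.

```python
import numpy as np


def correlate_features(X, X_emb, feature_names=None):
    """Hypothetical sketch: Pearson correlation of features with embedding axes."""
    X = np.asarray(X, dtype=float)
    X_emb = np.asarray(X_emb, dtype=float)
    if X.shape[0] != X_emb.shape[0]:
        raise ValueError("X and X_emb must have the same number of samples.")
    if feature_names is None:
        feature_names = [f"feature_{i}" for i in range(X.shape[1])]
    if len(feature_names) != X.shape[1]:
        raise ValueError("feature_names must align with the columns of X.")

    Xc = X - X.mean(axis=0)
    Ec = X_emb - X_emb.mean(axis=0)
    denom = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Ec, axis=0))
    corr = np.zeros_like(denom)
    # Constant columns have zero norm; their correlation is left at 0.
    np.divide(Xc.T @ Ec, denom, out=corr, where=denom > 0)
    return {name: corr[i] for i, name in enumerate(feature_names)}


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
# Toy embedding: an affine copy of the first two features.
X_emb = X[:, :2] * 2.0 + 1.0
result = correlate_features(X, X_emb, feature_names=["a", "b", "c"])
print({name: values.round(2) for name, values in result.items()})
```

Because the toy embedding copies the first two columns, feature "a" correlates perfectly with the first embedding dimension and "b" with the second.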

get_diagnostics() → Dict[str, Any]

Return cached diagnostics merged with reducer diagnostics.

Returns: diagnostics – Diagnostic artifacts declared by the reducer contract and the evaluation layer.
Return type: dict

get_quality_metadata() → Dict[str, Any]

Return cached scalar metadata merged with reducer metadata.

Returns: metadata – Scalar metadata declared by the reducer contract and the evaluation layer.
Return type: dict

get_metrics() → Dict[str, Any] – Return cached scalar metrics from the latest score() call.

get_summary() → Dict[str, Any]

Return a normalized summary payload for report and export paths.

Returns: Plain dictionary containing method identity, cached scalar summaries, reducer metadata, diagnostics, tidy metric records, capability flags, and cached feature interpretation payloads.
Return type: dict

Notes

The summary does not include an embedding payload. Embeddings are handled explicitly outside the manager and must be passed directly to plotting or reporting utilities that need them.

save(path: str | pathlib.Path)

Save the underlying reducer to disk.

Parameters: path (str or Path) – Output path for reducer persistence.

Notes

Only the reducer model is persisted. Cached manager state such as metrics and diagnostics is not included.

classmethod load(path: str | pathlib.Path, method: str) → DimReduction

Load a persisted reducer and wrap it in a fresh manager.

Parameters:

path (str or Path) – Path to a serialized reducer saved with save().
method (str) – Canonical public reducer name used to reconstruct the manager.

Returns:

Fresh manager wrapping the loaded reducer model.

Return type:

DimReduction

Notes

This restores the reducer model only. Cached manager state such as scores, diagnostics, and metric records is not persisted.
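
The persistence boundary described in the save() and load() notes can be sketched with a toy manager. Everything here is hypothetical: `ToyManager` is not the coco_pipe class, JSON is used only as a stand-in because the real serialization format is not stated, and the cached metric value is a placeholder.

```python
import json
import tempfile
from pathlib import Path


class ToyManager:
    """Hypothetical manager: a persisted model plus in-memory caches."""

    def __init__(self, method, model_state=None):
        self.method = method
        self.model_state = dict(model_state or {})
        self.metrics_ = {}  # cached evaluation state, never written to disk

    def save(self, path):
        # Only the model state is serialized.
        Path(path).write_text(json.dumps(self.model_state))

    @classmethod
    def load(cls, path, method):
        # A fresh manager starts with empty caches.
        return cls(method, json.loads(Path(path).read_text()))


manager = ToyManager("PCA", {"n_components": 2, "mean": [0.5, 1.5]})
manager.metrics_["trustworthiness"] = 0.93  # placeholder cached score

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reducer.json"
    manager.save(path)
    restored = ToyManager.load(path, method="PCA")

print(restored.model_state, restored.metrics_)
```

After the round trip the model state is intact, while the cached score has to be recomputed by scoring again.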