coco_pipe.dim_reduction.core
Execution manager for one dimensionality reduction method.
DimReduction is intentionally narrow. It owns reducer instantiation, input-shape validation for execution, fit/transform operations, and cached evaluation/interpretation state for one reducer instance. Plotting, trajectory reshaping, reporting, and multi-method comparison live in dedicated modules.
Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)
Classes
DimReduction | Manage one dimensionality reduction workflow.
Module Contents
- class coco_pipe.dim_reduction.core.DimReduction(method: str | coco_pipe.dim_reduction.config.BaseReducerConfig, n_components: int = 2, params: Dict[str, Any] | None = None, **kwargs)
Manage one dimensionality reduction workflow.
- Parameters:
method (str or BaseReducerConfig) – Canonical public reducer name or a typed configuration object. Method names are exact and must match the registry, for example "PCA", "Isomap", "Pacmap", or "TopologicalAE".
n_components (int, default=2) – Target dimensionality when method is a string.
params (dict, optional) – Additional reducer keyword arguments merged into the constructor arguments when method is a string.
**kwargs (dict) – Runtime reducer keyword overrides. These are merged after params.
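The merge order described in the parameters above can be illustrated with plain dictionaries. This is a minimal sketch, not the library's implementation: `resolve_reducer_kwargs` is a hypothetical helper, and treating `n_components` as an ordinary constructor argument that sits beneath `params` and `**kwargs` is an assumption made for illustration.

```python
from typing import Any, Dict, Optional


def resolve_reducer_kwargs(
    n_components: int = 2,
    params: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: layer constructor arguments in the documented order."""
    # Assumption: n_components is the base layer of the constructor arguments.
    resolved: Dict[str, Any] = {"n_components": n_components}
    # `params` is merged into the constructor arguments first ...
    resolved.update(params or {})
    # ... and runtime keyword overrides are merged after `params`, so they win.
    resolved.update(kwargs)
    return resolved


merged = resolve_reducer_kwargs(
    n_components=2,
    params={"n_neighbors": 15, "min_dist": 0.1},
    n_neighbors=30,
)
print(merged)
```

In this sketch the keyword override `n_neighbors=30` replaces the value supplied through `params`, while `min_dist` passes through untouched.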
- method
Canonical reducer name.
- Type:
str
- n_components
Target dimensionality used for the reducer instance.
- Type:
int
- reducer
Instantiated reducer backend.
- Type:
- metrics_
Cached scalar evaluation summaries from the latest score() call.
- Type:
dict
- quality_metadata_
Cached scalar reducer metadata exposed through the reducer contract.
- Type:
dict
- diagnostics_
Cached non-scalar diagnostic artifacts exposed through the reducer contract or the evaluation layer.
- Type:
dict
- metric_records_
Cached tidy metric observations produced by the evaluator.
- Type:
list of dict
- interpretation_
Cached feature interpretation payloads from the latest interpret() call.
- Type:
dict
- interpretation_records_
Cached tidy feature-interpretation observations.
- Type:
list of dict
See also
coco_pipe.dim_reduction.analysis.interpret_features – Pure interpretation backend used by interpret().
coco_pipe.dim_reduction.evaluation.core.evaluate_embedding – Pure evaluator used by score().
coco_pipe.dim_reduction.evaluation.core.MethodSelector – Post-hoc comparison and ranking over already-scored reducers.
coco_pipe.viz.dim_reduction – Plotting utilities for embeddings, metrics, and diagnostics.
Examples
>>> reducer = DimReduction("UMAP", n_components=2, n_neighbors=15)
>>> embedding = reducer.fit_transform(X)
>>> scores = reducer.score(embedding, X=X)
>>> "trustworthiness" in scores["metrics"]
True
>>> interpretation = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> "correlation" in interpretation["analysis"]
True
- reducer_kwargs
- metrics_: Dict[str, Any]
- quality_metadata_: Dict[str, Any]
- diagnostics_: Dict[str, Any]
- metric_records_: List[Dict[str, Any]] = []
- interpretation_: Dict[str, Any]
- interpretation_records_: List[Dict[str, Any]] = []
- property random_state: int | None
Return the random seed from parameters if any.
- property capabilities: Dict[str, Any]
Return reducer capability metadata through the manager interface.
- _validate_input(X: Any) → numpy.ndarray
Validate reducer input shape and coerce to a NumPy array.
- Parameters:
X (array-like or MNE object) – Input data accepted by the reducer. Objects exposing get_data() are unwrapped before validation.
- Returns:
X – Validated reducer input.
- Return type:
np.ndarray
- Raises:
ValueError – If the input dimensionality does not match the reducer contract.
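The validation flow described above (unwrap `get_data()`, coerce to NumPy, check dimensionality) might look roughly like the following. This is a sketch under stated assumptions rather than the method's actual code: `validate_input_sketch`, its `expected_ndim` parameter, and the `FakeEpochs` class are hypothetical, and expressing the reducer contract as a single required number of dimensions is a simplification.

```python
from typing import Any

import numpy as np


def validate_input_sketch(X: Any, expected_ndim: int = 2) -> np.ndarray:
    """Hypothetical stand-in for the validation step described above."""
    # Objects exposing get_data() (for example MNE containers) are unwrapped first.
    if hasattr(X, "get_data"):
        X = X.get_data()
    X = np.asarray(X)
    # Assumption: the reducer contract is a required number of dimensions.
    if X.ndim != expected_ndim:
        raise ValueError(
            f"Expected {expected_ndim}D input, got {X.ndim}D with shape {X.shape}."
        )
    return X


class FakeEpochs:
    """Toy object that mimics the get_data() protocol."""

    def get_data(self) -> np.ndarray:
        return np.zeros((4, 3))


validated = validate_input_sketch(FakeEpochs())
print(validated.shape)
```

Passing a 3D array to this sketch with the default `expected_ndim=2` raises `ValueError`, mirroring the documented failure mode.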
- fit(X: Any, y: Any | None = None) → DimReduction
Fit the reducer on the provided data.
- Parameters:
X (array-like or MNE object) – Input data in the reducer’s native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.
- Returns:
self – The fitted reducer.
- Return type:
DimReduction
- transform(X: Any) → numpy.ndarray
Transform new data with a fitted reducer.
- Parameters:
X (array-like or MNE object) – Input data in the reducer’s native layout.
- Returns:
X_emb – Reduced representation returned by the reducer.
- Return type:
np.ndarray
- fit_transform(X: Any, y: Any | None = None) → numpy.ndarray
Fit the reducer and return the reduced representation.
- Parameters:
X (array-like or MNE object) – Input data in the reducer’s native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.
- Returns:
X_emb – Reduced representation returned by the reducer.
- Return type:
np.ndarray
- get_components() → numpy.ndarray
Return reducer-defined component-like outputs.
- Returns:
components – Component-like array exposed by the reducer.
- Return type:
np.ndarray
- Raises:
ValueError – If the reducer does not expose public components.
- score(X_emb: numpy.ndarray, X: Any = None, n_neighbors: int = 5, metrics: List[str] | None = None, k_values: List[int] | None = None, labels: numpy.ndarray | None = None, groups: numpy.ndarray | None = None, times: numpy.ndarray | None = None, separation_method: str = 'centroid') → Dict[str, Dict[str, Any]]
Evaluate an explicit embedding against the original data.
- Parameters:
X_emb (array-like) – Embedded data to evaluate.
X (array-like, optional) – Original high-dimensional data in evaluation-ready layout. This is required for standard 2D metrics and optional for native 3D trajectory metrics.
n_neighbors (int, default=5) – Number of nearest neighbors used for metric computation.
metrics (list of str, optional) – Metric selectors to compute. None evaluates all metric families available for the embedding shape.
k_values (list of int, optional) – Neighborhood sizes used for multi-scale standard metric evaluation.
labels (np.ndarray, optional) – Optional labels aligned with the embedding. Used for trajectory separation when X_emb is 3D and for explicit supervised 2D metrics when requested.
groups (np.ndarray, optional) – Optional grouping variable aligned with the embedding. Required by grouped supervised evaluation metrics such as separation_logreg_balanced_accuracy.
times (np.ndarray, optional) – Optional trajectory time coordinates aligned with the trajectory length axis.
separation_method (str, default="centroid") – Separation definition passed to trajectory evaluation when labels are available for native 3D trajectory embeddings.
- Returns:
scores – Dictionary with keys "metrics", "metadata", and "diagnostics".
- Return type:
dict
Notes
score() does not infer or cache embeddings. Callers must pass X_emb explicitly. X is only required when the requested evaluation path needs the original high-dimensional samples.
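To make the explicit-embedding contract and the three-key return layout concrete, here is a small self-contained sketch. It is not the library's evaluator: `score_sketch` and `knn_indices` are hypothetical, the k-nearest-neighbor overlap metric is a simple stand-in chosen for brevity (it is not trustworthiness or any metric the package is known to ship), and the contents placed under `"metadata"` and `"diagnostics"` are illustrative only.

```python
from typing import Any, Dict

import numpy as np


def knn_indices(data: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k nearest neighbors of every row (self excluded)."""
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return np.argsort(dists, axis=1)[:, :k]


def score_sketch(
    X_emb: np.ndarray, X: np.ndarray, n_neighbors: int = 5
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical scorer returning the documented three-key layout."""
    high = knn_indices(X, n_neighbors)
    low = knn_indices(X_emb, n_neighbors)
    # Fraction of each point's original neighbors that survive in the embedding.
    overlap = [len(set(h) & set(l)) / n_neighbors for h, l in zip(high, low)]
    return {
        "metrics": {"knn_overlap": float(np.mean(overlap))},
        "metadata": {"n_neighbors": n_neighbors},
        "diagnostics": {"per_sample_overlap": np.asarray(overlap)},
    }


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
# The embedding is always passed in explicitly; nothing is cached or inferred.
scores = score_sketch(X_emb=X[:, :2], X=X, n_neighbors=5)
identity_scores = score_sketch(X_emb=X.copy(), X=X, n_neighbors=5)
print(sorted(scores), identity_scores["metrics"]["knn_overlap"])
```

Scoring an embedding that is an exact copy of `X` gives an overlap of 1.0 in this sketch, which is a convenient sanity check for any neighborhood-preservation metric.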
- interpret(X: numpy.ndarray, *, X_emb: numpy.ndarray, analyses: List[str] | None = None, feature_names: List[str] | None = None, n_repeats: int = 5, random_state: int | None = None) → Dict[str, Any]
Run feature interpretation analyses for an explicit embedding.
- Parameters:
X (np.ndarray) – Original input data.
X_emb (np.ndarray) – Explicit embedding aligned with X.
analyses (list of {"correlation", "perturbation", "gradient"}, optional) – Interpretation analyses to compute. None defaults to ["correlation"].
feature_names (list of str, optional) – Feature names aligned with the columns of X when the requested interpretation returns feature-keyed outputs.
n_repeats (int, default=5) – Number of shuffles per feature for perturbation importance.
random_state (int, optional) – Random seed for perturbation importance.
- Returns:
Dictionary with keys "analysis" and "records".
- Return type:
dict
Notes
interpret() does not fit the reducer or compute embeddings. Callers must pass both X and X_emb explicitly.
See also
coco_pipe.dim_reduction.analysis.interpret_features – Pure interpretation backend used by this manager method.
score – Evaluate structure-preservation metrics for an explicit embedding.
Examples
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> result = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> sorted(result)
['analysis', 'records']
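For intuition about what a "correlation" analysis of an explicit embedding can compute, the following sketch correlates every input feature with every embedding axis. It is an assumption-laden illustration, not the `interpret_features` backend: `correlation_sketch` is a hypothetical helper, Pearson correlation is assumed, and the feature-keyed return shape is chosen for readability rather than taken from the library.

```python
from typing import Dict, List

import numpy as np


def correlation_sketch(
    X: np.ndarray, X_emb: np.ndarray, feature_names: List[str]
) -> Dict[str, List[float]]:
    """Hypothetical helper: Pearson correlation of each feature with each embedding axis."""
    n_features = X.shape[1]
    # With rowvar=False, columns are variables; the top-right block of the joint
    # correlation matrix holds the feature-vs-embedding correlations.
    joint = np.corrcoef(X, X_emb, rowvar=False)
    block = joint[:n_features, n_features:]
    return {name: block[i].tolist() for i, name in enumerate(feature_names)}


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
# Toy embedding: axis 0 copies feature "a", axis 1 is the negated feature "b".
X_emb = np.column_stack([X[:, 0], -X[:, 1]])
result = correlation_sketch(X, X_emb, feature_names=["a", "b", "c"])
print(result["a"], result["b"])
```

Both `X` and `X_emb` are supplied by the caller here, matching the note that interpretation neither fits the reducer nor computes embeddings.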
- get_diagnostics() → Dict[str, Any]
Return cached diagnostics merged with reducer diagnostics.
- Returns:
diagnostics – Diagnostic artifacts declared by the reducer contract and the evaluation layer.
- Return type:
dict
- get_quality_metadata() → Dict[str, Any]
Return cached scalar metadata merged with reducer metadata.
- Returns:
metadata – Scalar metadata declared by the reducer contract and the evaluation layer.
- Return type:
dict
- get_summary() → Dict[str, Any]
Return a normalized summary payload for report and export paths.
- Returns:
Plain dictionary containing method identity, cached scalar summaries, reducer metadata, diagnostics, tidy metric records, and capability flags, plus cached feature interpretation payloads.
- Return type:
dict
Notes
The summary does not include an embedding payload. Embeddings are handled explicitly outside the manager and must be passed directly to plotting or reporting utilities that need them.
- save(path: str | pathlib.Path)
Save the underlying reducer to disk.
- Parameters:
path (str or Path) – Output path for reducer persistence.
Notes
Only the reducer model is persisted. Cached manager state such as metrics and diagnostics is not included.
- classmethod load(path: str | pathlib.Path, method: str) → DimReduction
Load a persisted reducer and wrap it in a fresh manager.
- Parameters:
path (str or Path) – Path to a serialized reducer saved with save().
method (str) – Canonical public reducer name used to reconstruct the manager.
- Returns:
Fresh manager wrapping the loaded reducer model.
- Return type:
DimReduction
Notes
This restores the reducer model only. Cached manager state such as scores, diagnostics, and metric records is not persisted.
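The persistence contract in the notes for save() and load() (model only, caches dropped) can be demonstrated with a toy manager. Everything here is hypothetical: `ManagerSketch` is not the real class, `pickle` is assumed as the serialization format purely for the demo, and the cached metric value is a placeholder.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


class ManagerSketch:
    """Hypothetical manager that persists only its wrapped model."""

    def __init__(self, method: str, reducer: Any) -> None:
        self.method = method
        self.reducer = reducer
        self.metrics_: Dict[str, Any] = {}

    def save(self, path: Path) -> None:
        # Only the reducer is written; cached state such as metrics_ is left out.
        with open(path, "wb") as handle:
            pickle.dump(self.reducer, handle)

    @classmethod
    def load(cls, path: Path, method: str) -> "ManagerSketch":
        with open(path, "rb") as handle:
            reducer = pickle.load(handle)
        # A fresh manager starts with empty caches.
        return cls(method, reducer)


manager = ManagerSketch("PCA", reducer={"components": [[1.0, 0.0]]})
manager.metrics_["trustworthiness"] = 0.9  # placeholder value for the demo

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reducer.pkl"
    manager.save(path)
    restored = ManagerSketch.load(path, method="PCA")

print(restored.reducer, restored.metrics_)
```

The practical consequence is that callers who need scores or diagnostics after a reload must recompute them against an explicit embedding.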