coco_pipe.dim_reduction.evaluation.core
Pure evaluation orchestration for dimensionality-reduction workflows.
This module contains the two public evaluation interfaces used by the dim-reduction stack:
- evaluate_embedding(...) evaluates an explicit embedding and returns scalar metrics, scalar metadata, diagnostics, and tidy metric records.
- MethodSelector compares and ranks multiple already-scored DimReduction objects without refitting or recomputing embeddings.
The module is intentionally evaluation-only. It does not fit reducers,
transform data, reconstruct 3D trajectory tensors from flat embeddings, or
provide plotting methods. Reduction execution belongs to
coco_pipe.dim_reduction.core.DimReduction and plotting belongs to
coco_pipe.viz.dim_reduction.
Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)
Classes
MethodSelector – Compare and rank already-scored dimensionality reduction methods.
Functions
evaluate_embedding – Evaluate an already computed embedding.
Module Contents
- coco_pipe.dim_reduction.evaluation.core.evaluate_embedding(X_emb: numpy.ndarray, X: numpy.ndarray | None = None, method_name: str = 'embedding', metrics: Sequence[str] | None = None, labels: numpy.ndarray | None = None, groups: numpy.ndarray | None = None, times: numpy.ndarray | None = None, quality_metadata: Dict[str, Any] | None = None, diagnostics: Dict[str, Any] | None = None, random_state: int | None = None, n_neighbors: int = 5, k_values: Sequence[int] | None = None, separation_method: str = 'centroid') -> Dict[str, Any]
Evaluate an already computed embedding.
- Parameters:
X_emb (np.ndarray) – Embedded data to evaluate. Shape (n_samples, n_components) triggers standard co-ranking and Shepard-style metrics. Shape (n_trajectories, n_times, n_dims) triggers trajectory metrics.
X (np.ndarray, optional) – Original data with shape (n_samples, n_features). Required when standard 2D metrics are requested.
method_name (str, default="embedding") – Display name attached to tidy metric records.
metrics (sequence of str, optional) – Metric selectors to compute. None computes all metrics available for the provided inputs.
labels (np.ndarray, optional) – Labels aligned with the embedding. Used by trajectory_separation for native 3D embeddings and by explicit supervised 2D metrics such as separation_logreg_balanced_accuracy when requested.
groups (np.ndarray, optional) – Grouping variable aligned with X_emb. Required by separation_logreg_balanced_accuracy.
times (np.ndarray, optional) – Trajectory time coordinates used for separation AUC integration when trajectory metrics are evaluated.
quality_metadata (dict, optional) – Scalar quality metadata to attach to the evaluation payload.
diagnostics (dict, optional) – Precomputed diagnostics to carry through the evaluation payload.
random_state (int, optional) – Random state used for sampled Shepard distances.
n_neighbors (int, default=5) – Neighborhood size for single-score standard metrics.
k_values (sequence of int, optional) – Neighborhood sizes for benchmark sweeps.
separation_method (str, default="centroid") – Separation definition passed to trajectory_separation when trajectory labels are available.
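To make the role of times in "separation AUC integration" more concrete, here is a minimal sketch of one way such a quantity could be computed. It is not taken from the library: the function name centroid_separation_auc, the two-class restriction, the Euclidean centroid distance, and the unit-spacing fallback are all assumptions made for illustration.

```python
import numpy as np


def centroid_separation_auc(traj, labels, times=None):
    """Hypothetical sketch: area under a two-class centroid-distance curve."""
    traj = np.asarray(traj, dtype=float)  # (n_trajectories, n_times, n_dims)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    if times is None:
        # Assumption: unit spacing when no time coordinates are given.
        times = np.arange(traj.shape[1], dtype=float)
    times = np.asarray(times, dtype=float)

    # Mean trajectory per class, each of shape (n_times, n_dims).
    centroid_a = traj[labels == classes[0]].mean(axis=0)
    centroid_b = traj[labels == classes[1]].mean(axis=0)

    # Euclidean distance between the two centroids at every time point.
    separation = np.linalg.norm(centroid_a - centroid_b, axis=1)

    # Trapezoidal rule over the supplied time coordinates.
    widths = np.diff(times)
    return float(np.sum(0.5 * (separation[1:] + separation[:-1]) * widths))


traj = np.zeros((4, 3, 2))
traj[2:, :, 0] = 2.0  # class "B" sits 2 units away from class "A"
labels = np.array(["A", "A", "B", "B"])
print(centroid_separation_auc(traj, labels, times=[0.0, 0.5, 1.0]))
```

With a constant separation of 2 over a time span of 1, the integral is 2. The same data without times would integrate over sample indices instead, which is why passing real time coordinates matters when sampling is uneven or not unit-spaced.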
- Returns:
Dictionary with these keys:
  - embedding: the evaluated embedding
  - metrics: scalar metric summaries
  - metadata: scalar descriptive metadata
  - diagnostics: array-like or structured diagnostics
  - records: tidy long-form metric records as list[dict]
  - artifacts: copy of the diagnostics payload
- Return type:
dict
- Raises:
TypeError – If quality_metadata or diagnostics is not a dictionary.
ValueError – If X_emb is not 2D or 3D, or if standard 2D evaluation is requested without a compatible X.
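The shape-based dispatch and the ValueError conditions described above can be illustrated with a small sketch. The helper evaluation_mode is hypothetical and is not part of coco_pipe. It assumes that standard metrics are requested for 2D input and that a "compatible" X means a 2D array with the same number of samples as the embedding.

```python
from typing import Optional, Tuple


def evaluation_mode(
    emb_shape: Tuple[int, ...],
    x_shape: Optional[Tuple[int, ...]] = None,
) -> str:
    """Hypothetical helper: pick the metric family from array shapes alone."""
    if len(emb_shape) == 3:
        # (n_trajectories, n_times, n_dims): trajectory metrics, X is not needed.
        return "trajectory"
    if len(emb_shape) == 2:
        # (n_samples, n_components): standard metrics compare against X.
        if x_shape is None:
            raise ValueError("standard 2D evaluation requires X")
        # Assumption: "compatible" means 2D with matching n_samples.
        if len(x_shape) != 2 or x_shape[0] != emb_shape[0]:
            raise ValueError("X must have shape (n_samples, n_features)")
        return "standard"
    raise ValueError(f"X_emb must be 2D or 3D, got {len(emb_shape)}D")


print(evaluation_mode((20, 2), (20, 5)))  # standard
print(evaluation_mode((4, 10, 2)))        # trajectory
```

Checking shapes before any metric is computed keeps the failure early and cheap, which fits an evaluator that leaves input preparation to the caller.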
Notes
This function is intentionally pure. It does not fit reducers, transform data, or inspect reducer internals. Callers are responsible for preparing X_emb and any optional metadata such as trajectory labels or times.
See also
coco_pipe.dim_reduction.core.DimReduction.score – Manager-level wrapper that prepares inputs and stores the returned evaluation payload on a fitted DimReduction object.
MethodSelector – Post-hoc comparison and ranking across multiple scored reductions.
Examples
Evaluate a standard 2D embedding:
>>> import numpy as np
>>> from coco_pipe.dim_reduction.evaluation.core import evaluate_embedding
>>> X = np.random.RandomState(0).randn(20, 5)
>>> X_emb = X[:, :2]
>>> result = evaluate_embedding(X_emb, X=X, method_name="demo")
>>> "metrics" in result and "records" in result
True
Evaluate a native trajectory embedding:
>>> traj = np.random.RandomState(0).randn(4, 10, 2)
>>> labels = np.array(["A", "A", "B", "B"])
>>> result = evaluate_embedding(
...     traj,
...     method_name="traj",
...     metrics=["trajectory_speed", "trajectory_separation"],
...     labels=labels,
... )
>>> "trajectory_speed_mean" in result["metrics"]
True
- class coco_pipe.dim_reduction.evaluation.core.MethodSelector(reducers: Dict[str, coco_pipe.dim_reduction.core.DimReduction] | List[coco_pipe.dim_reduction.core.DimReduction])
Compare and rank already-scored dimensionality reduction methods.
MethodSelector is intentionally post-hoc. It does not fit reducers or compute embeddings. Each reducer must already be a scored DimReduction instance with cached metric_records_.
- Parameters:
reducers (dict or list of DimReduction) – Scored DimReduction objects to compare. Lists are converted to a method-keyed mapping using reducer.method.
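The list-to-mapping conversion mentioned for reducers can be pictured with the sketch below. FakeReduction and as_method_mapping are hypothetical stand-ins written for this illustration. The only behaviour taken from the description is that a list is keyed by each reducer's method attribute. How the real class treats duplicate method names is not stated, so the comment in the code is an assumption about this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class FakeReduction:
    """Hypothetical stand-in exposing only the ``method`` attribute."""

    method: str


def as_method_mapping(
    reducers: Union[Dict[str, FakeReduction], List[FakeReduction]],
) -> Dict[str, FakeReduction]:
    """Return a dict keyed by method name, whatever container was passed."""
    if isinstance(reducers, dict):
        return dict(reducers)  # shallow copy so the caller's dict is untouched
    # Assumption: in this sketch, a repeated method name overwrites the earlier entry.
    return {reducer.method: reducer for reducer in reducers}


mapping = as_method_mapping([FakeReduction("PCA"), FakeReduction("Isomap")])
print(list(mapping))  # ['PCA', 'Isomap']
```

Under this reading, passing a dict is the way to supply your own keys, for example when comparing two configurations of the same method.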
- reducers
Compared reductions keyed by method name.
- Type:
dict of str to DimReduction
- metric_records_
Cached long-form metric records populated by collect().
- Type:
list of dict
See also
evaluate_embedding – Pure evaluator used upstream by DimReduction.score.
coco_pipe.dim_reduction.core.DimReduction.score – Scores a fitted reduction and populates the records consumed here.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
>>> X = np.random.RandomState(0).randn(30, 4)
>>> reducers = [
...     DimReduction("PCA", n_components=2),
...     DimReduction("Isomap", n_components=2, n_neighbors=5),
... ]
>>> for reducer in reducers:
...     embedding = reducer.fit_transform(X)
...     reducer.score(embedding, X=X, k_values=[5])
>>> selector = MethodSelector(reducers).collect()
>>> frame = selector.to_frame()
>>> not frame.empty
True
- metric_records_ = []
- classmethod from_records(records: List[Dict[str, Any]]) -> MethodSelector
Create a selector directly from long-form metric records.
- classmethod from_frame(frame: pandas.DataFrame) -> MethodSelector
Create a selector directly from a metric-record DataFrame.
- collect() -> MethodSelector
Collect cached metric records from already-scored reducers.
- Returns:
The selector populated with comparison-ready metric records.
- Return type:
MethodSelector
- Raises:
ValueError – If a reducer has not been scored yet.
See also
coco_pipe.dim_reduction.core.DimReduction.score – Populates the metric_records_ consumed by this method.
to_frame – Materialize the collected long-form records as a DataFrame.
Notes
collect() does not fit reducers or recompute evaluation metrics. It only gathers cached metric observations from reducers that were already scored explicitly.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> selector = MethodSelector([reducer]).collect()
>>> len(selector.metric_records_) > 0
True
- to_frame() -> pandas.DataFrame
Return the cached long-form metric table.
- Returns:
Tidy metric table with columns method, metric, value, scope, and scope_value.
- Return type:
pandas.DataFrame
Notes
A DataFrame is built only when this method is called, at the public export boundary. Internally, MethodSelector stores metric records as plain Python dictionaries.
See also
collect – Gather cached metric records from scored reducers.
rank_methods – Rank reducers from the collected metric table.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> frame = MethodSelector([reducer]).collect().to_frame()
>>> set(["method", "metric", "value"]).issubset(frame.columns)
True
- rank_methods(selection_metric: str, *, selection_k: int | None = None, tie_breakers: Sequence[str] | None = None) -> pandas.DataFrame
Rank methods using one primary metric and optional tie-breakers.
- Parameters:
selection_metric (str) – Metric to optimize.
selection_k (int, optional) – Neighborhood size to compare for k-scoped metrics.
tie_breakers (sequence of str, optional) – Additional metrics used in order when primary values tie.
- Returns:
Ranked comparison table. The first row is the best-scoring method under the requested ranking policy.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If the requested metrics are unsupported, unavailable in the cached records, or missing the requested selection_k observations.
Notes
Ranking is based on mean metric values per method. For k-scoped metrics, selection_k restricts comparison to a single neighborhood size when requested.
See also
collect – Gather cached metric observations before ranking.
to_frame – Inspect the underlying long-form metric observations directly.
coco_pipe.dim_reduction.core.DimReduction.score – Produces the metric records that feed into ranking.
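The ranking policy in the Notes (mean value per method, optionally restricted to one neighborhood size) can be sketched in plain Python over long-form records. This is an illustration, not the library's implementation. The function rank_by_mean is hypothetical, the scope label "k" and the sample values are invented for the example, tie-breakers are omitted, and larger values are assumed to be better.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List, Optional, Tuple


def rank_by_mean(
    records: List[Dict[str, Any]],
    selection_metric: str,
    selection_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Hypothetical sketch: rank methods by mean metric value, best first."""
    values = defaultdict(list)
    for rec in records:
        if rec["metric"] != selection_metric:
            continue
        # For k-scoped records, keep only the requested neighborhood size.
        if selection_k is not None and rec.get("scope") == "k":
            if rec.get("scope_value") != selection_k:
                continue
        values[rec["method"]].append(rec["value"])
    if not values:
        raise ValueError(f"no records for metric {selection_metric!r}")
    means = {method: mean(vals) for method, vals in values.items()}
    # Assumption: larger values are better for the chosen metric.
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Invented records using the documented columns; "k" as scope label is assumed.
records = [
    {"method": "PCA", "metric": "trustworthiness", "value": 0.90, "scope": "k", "scope_value": 5},
    {"method": "PCA", "metric": "trustworthiness", "value": 0.70, "scope": "k", "scope_value": 10},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.85, "scope": "k", "scope_value": 5},
    {"method": "Isomap", "metric": "trustworthiness", "value": 0.85, "scope": "k", "scope_value": 10},
]
print(rank_by_mean(records, "trustworthiness"))
print(rank_by_mean(records, "trustworthiness", selection_k=5))
```

In this toy data the winner changes with selection_k: averaged over both neighborhood sizes Isomap ranks first, while at k=5 alone PCA does. That is the practical reason to pin selection_k when the metric was swept over several k_values.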
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import DimReduction
>>> from coco_pipe.dim_reduction.evaluation.core import MethodSelector
>>> X = np.random.RandomState(0).randn(20, 4)
>>> reducers = [DimReduction("PCA", n_components=2)]
>>> reducer = reducers[0]
>>> embedding = reducer.fit_transform(X)
>>> reducer.score(embedding, X=X, k_values=[5])
>>> ranked = MethodSelector(reducers).collect().rank_methods(
...     "trustworthiness",
...     selection_k=5,
... )
>>> ranked.iloc[0]["method"] == reducer.method
True