coco_pipe
Package initializer for the coco_pipe package.
Submodules
Attributes
Classes
- DescriptorConfig: Top-level descriptors configuration object.
- DescriptorPipeline: Run config-driven descriptor extraction on explicit arrays.
- BaseReducer: Abstract base class for all dimensionality reduction implementations.
- DimReduction: Manage one dimensionality reduction workflow.
- IncrementalPCAReducer: Incremental PCA reducer.
- IsomapReducer: Isometric Mapping reducer.
- LLEReducer: Locally Linear Embedding reducer.
- MDSReducer: Multidimensional Scaling reducer.
- PCAReducer: Principal Component Analysis reducer.
- SpectralEmbeddingReducer: Spectral Embedding reducer.
- TSNEReducer: t-SNE reducer.
Functions
- Compute continuity from a co-ranking matrix.
- Run one or more feature interpretation analyses.
- Compute the local continuity meta-criterion (LCMC).
- Compute sampled pairwise distances for a Shepard diagram.
- Compute trustworthiness from a co-ranking matrix.
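Three of the functions above are described in terms of a co-ranking matrix. The following is a minimal NumPy sketch of that idea, not the coco_pipe implementation: the function names `neighbor_ranks`, `coranking`, `trustworthiness`, `continuity`, and `lcmc` are hypothetical, and the formulas are the standard co-ranking definitions, assumed valid for neighborhood sizes `k` below half the number of samples.

```python
import numpy as np

def neighbor_ranks(X):
    """Rank of every point j as seen from every point i (0 = the point itself)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, -1.0)  # force each point to be its own rank-0 neighbor
    order = np.argsort(dist, axis=1, kind="stable")
    return np.argsort(order, axis=1, kind="stable")  # inverse permutation = ranks

def coranking(X_high, X_low):
    """Q[a, b] counts pairs with high-dim rank a + 1 and low-dim rank b + 1."""
    n = X_high.shape[0]
    r_high, r_low = neighbor_ranks(X_high), neighbor_ranks(X_low)
    off_diag = ~np.eye(n, dtype=bool)
    Q = np.zeros((n - 1, n - 1), dtype=int)
    np.add.at(Q, (r_high[off_diag] - 1, r_low[off_diag] - 1), 1)
    return Q

def trustworthiness(Q, k):
    """Penalize points that enter the low-dim k-neighborhood from far away."""
    n = Q.shape[0] + 1
    weights = np.arange(k, n - 1) + 1 - k  # high-dim rank minus k
    penalty = (weights[:, None] * Q[k:, :k]).sum()
    return 1.0 - 2.0 / (n * k * (2 * n - 3 * k - 1)) * penalty

def continuity(Q, k):
    """Penalize high-dim neighbors that leave the low-dim k-neighborhood."""
    n = Q.shape[0] + 1
    weights = np.arange(k, n - 1) + 1 - k  # low-dim rank minus k
    penalty = (weights[None, :] * Q[:k, k:]).sum()
    return 1.0 - 2.0 / (n * k * (2 * n - 3 * k - 1)) * penalty

def lcmc(Q, k):
    """Local continuity meta-criterion: neighborhood overlap above chance."""
    n = Q.shape[0] + 1
    return Q[:k, :k].sum() / (n * k) - k / (n - 1)

rng = np.random.default_rng(0)
X_high = rng.standard_normal((30, 5))
X_low = X_high[:, :2]                  # crude "embedding": keep two columns
Q = coranking(X_high, X_low)
Q_same = coranking(X_high, X_high.copy())  # perfect embedding, diagonal Q
```

A perfect embedding puts all mass on the diagonal of `Q`, so both trustworthiness and continuity evaluate to 1 for `Q_same`, while the truncated embedding scores lower.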
Package Contents
- class coco_pipe.DescriptorConfig(/, **data: Any)
Bases:
_StrictConfigModel
Top-level descriptors configuration object.
- input
Runtime input requirements for explicit array extraction.
- Type:
DescriptorInputConfig
- families
Enabled descriptor families and their typed configs.
- Type:
DescriptorFamiliesConfig
- precision
Output dtype used for the final descriptor matrix.
- Type:
{"float32", "float64"}
- runtime
Runtime execution and error-handling settings.
- Type:
DescriptorRuntimeConfig
Notes
This object is the stable config boundary for coco_pipe.descriptors.core.DescriptorPipeline. Parsing this config validates its local structure; the pipeline then applies the remaining cross-family compatibility checks when it builds the execution plan.
- input: DescriptorInputConfig = None
- families: DescriptorFamiliesConfig = None
- precision: Literal['float32', 'float64'] = 'float32'
- runtime: DescriptorRuntimeConfig = None
- class coco_pipe.DescriptorPipeline(config: coco_pipe.descriptors.configs.DescriptorConfig | collections.abc.Mapping[str, Any])
Run config-driven descriptor extraction on explicit arrays.
- Parameters:
config (DescriptorConfig or Mapping[str, Any]) – Typed descriptors configuration or a mapping accepted by DescriptorConfig.
- config
Parsed descriptors configuration.
- Type:
DescriptorConfig
- extractors
Enabled family extractors in deterministic family order.
- Type:
list of BaseDescriptorExtractor
- signal_extractors
Enabled non-PSD extractors that consume raw signal batches directly.
- Type:
list of BaseDescriptorExtractor
- psd_groups
Planned PSD reuse groups derived once from the enabled extractors.
- Type:
list of _PSDGroup
- family_order
Deterministic family order used when merging batch-local outputs.
- Type:
list of str
Notes
The pipeline is config-bound but runtime-stateless. Construction performs config parsing, corrected-band compatibility checks, and planner setup once. Each call to extract() then validates the explicit runtime inputs, executes the planned families, and returns one flat descriptor matrix plus any collected failures.
- config
- extractors: list[coco_pipe.descriptors.extractors.base.BaseDescriptorExtractor] = []
- signal_extractors
- psd_groups = []
- family_order
- extract(X: numpy.ndarray, ids: collections.abc.Sequence[Any] | numpy.ndarray | None = None, sfreq: float | None = None, channel_names: collections.abc.Sequence[str] | numpy.ndarray | None = None) -> dict[str, Any]
Extract descriptors from explicit NumPy inputs.
- Parameters:
X (np.ndarray) – Signal array with shape (n_obs, n_channels, n_times).
ids (sequence or np.ndarray, optional) – Observation identifiers aligned with X.
sfreq (float, optional) – Sampling frequency in Hertz. Required when enabled families depend on spectral estimates or spectral entropy.
channel_names (sequence of str or np.ndarray, optional) – Channel labels. Required for channel-resolved outputs.
- Returns:
Dictionary with keys X, descriptor_names, and failures.
- Return type:
dict[str, Any]
- Raises:
ValueError – If the explicit input contract is not satisfied.
ImportError – If an optional backend required by the enabled families is missing.
Notes
When runtime.on_error="warn", extraction still completes and stores failures in result["failures"] before emitting one aggregate warning at the pipeline level.
The returned row order always matches the input observation order.
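The input and output contract of extract() can be illustrated with a small stand-in that does not import coco_pipe. The function `toy_extract`, the per-channel variance feature, and the `"var_ch-<name>"` column names are hypothetical; only the input shape `(n_obs, n_channels, n_times)` and the three result keys are taken from the description above.

```python
import numpy as np

def toy_extract(X, channel_names):
    """Hypothetical stand-in that mimics the documented result layout."""
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError("X must have shape (n_obs, n_channels, n_times)")
    if len(channel_names) != X.shape[1]:
        raise ValueError("channel_names must match the channel axis of X")
    # One value per observation and channel gives a flat 2D descriptor matrix.
    features = X.var(axis=-1).astype("float32")
    names = [f"var_ch-{name}" for name in channel_names]
    return {"X": features, "descriptor_names": names, "failures": []}

rng = np.random.default_rng(0)
signals = rng.standard_normal((4, 3, 256))  # 4 observations, 3 channels
result = toy_extract(signals, ["Fz", "Cz", "Pz"])
```

Row `i` of `result["X"]` corresponds to observation `i` of the input, which mirrors the row-order guarantee stated in the notes.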
- pool_channels(result: collections.abc.Mapping[str, Any], channel_groups: collections.abc.Mapping[str, collections.abc.Sequence[str]]) -> dict[str, Any]
Pool sensor-level descriptor columns into grouped channel outputs.
- Parameters:
result (mapping) – Standard descriptor result produced by extract().
channel_groups (mapping of str to sequence of str) – Channel groups used to replace sensor-level descriptor columns with grouped "chgrp-..." outputs.
- Returns:
Descriptor result with grouped channel features and unchanged failures.
- Return type:
dict[str, Any]
- Raises:
ValueError – If the provided result is malformed or if any requested group cannot be formed from the sensor-level descriptor columns.
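As a rough illustration of channel pooling, the sketch below averages sensor-level columns into one column per group. It is not the coco_pipe implementation: `toy_pool`, the mean as pooling statistic, and the exact `"var_ch-<name>"` and `"var_chgrp-<group>"` column names are assumptions made for this example. Only the `"chgrp-..."` naming hint and the ValueError on unformable groups come from the text above.

```python
import numpy as np

def toy_pool(result, channel_groups, prefix="var"):
    """Hypothetical pooling: average the columns of each channel group."""
    names = list(result["descriptor_names"])
    X = np.asarray(result["X"], dtype=float)
    pooled_cols, pooled_names = [], []
    for group, channels in channel_groups.items():
        wanted = [f"{prefix}_ch-{ch}" for ch in channels]
        missing = [name for name in wanted if name not in names]
        if missing:
            raise ValueError(f"cannot form group {group!r}: missing {missing}")
        idx = [names.index(name) for name in wanted]
        pooled_cols.append(X[:, idx].mean(axis=1))
        pooled_names.append(f"{prefix}_chgrp-{group}")
    return {
        "X": np.column_stack(pooled_cols),
        "descriptor_names": pooled_names,
        "failures": list(result["failures"]),  # passed through unchanged
    }

sensor_result = {
    "X": np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
    "descriptor_names": ["var_ch-a", "var_ch-b", "var_ch-c"],
    "failures": [],
}
pooled = toy_pool(sensor_result, {"front": ["a", "b"], "back": ["c"]})
```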
- coco_pipe.METHODS = ('PCA', 'IncrementalPCA', 'DaskPCA', 'DaskTruncatedSVD', 'Isomap', 'LLE', 'MDS',...
- class coco_pipe.BaseReducer(n_components: int = 2, **kwargs)
Bases:
abc.ABC
Abstract base class for all dimensionality reduction implementations.
This class defines the standard interface that all reducers must implement and is safe to subclass for custom reducers. It provides built-in support for model persistence (save/load) using joblib.
For custom reducers operating on nonstandard data layouts, override capabilities so the manager layer can route validation, scoring, plotting, and reporting correctly.
- Parameters:
n_components (int, default=2) – Target dimensionality of the reduced representation.
**kwargs (dict) – Additional keyword arguments stored on params and typically forwarded to the wrapped estimator or backend implementation.
- n_components
Target dimensionality of the reduced representation.
- Type:
int
- params
Additional reducer parameters captured at initialization time.
- Type:
dict
- model
Underlying fitted model object, such as a scikit-learn estimator or a scientific computing backend. This attribute should be populated by fit.
- Type:
Any
Notes
The capabilities property returns a plain dictionary consumed by the manager and evaluation layers. Custom reducers should declare supported diagnostics and scalar metadata explicitly through this mapping. Common keys include:
input_ndim : expected dimensionality of the input container
input_layout : semantic layout name such as “standard”
has_transform : whether transform is supported
has_inverse_transform : whether inverse transforms are available
has_components : whether PCA-like components are exposed
supported_diagnostics : names returned by get_diagnostics
has_native_plot : whether the reducer exposes its own plotting path
is_linear : whether the reducer is linear
is_stochastic : whether repeated runs can vary without a fixed seed
Examples
>>> from sklearn.decomposition import PCA
>>> from coco_pipe.dim_reduction import BaseReducer
>>>
>>> class CustomPCAReducer(BaseReducer):
...     @property
...     def capabilities(self):
...         return self._merge_capabilities(
...             super().capabilities,
...             is_linear=True,
...             has_components=True,
...             supported_diagnostics=("explained_variance_ratio_",),
...         )
...
...     def fit(self, X, y=None):
...         self.model = PCA(n_components=self.n_components, **self.params)
...         self.model.fit(X)
...         return self
...
...     def transform(self, X):
...         return self.model.transform(X)
- n_components = 2
- params
- model = None
- context_: Dict[str, Any]
- property name: str
Return a stable public display name for the reducer.
- _filter_params(fn_or_class: Any, params: dict) -> dict
Filter parameters to match the signature of a function or class.
- Parameters:
fn_or_class (Any) – The function or class to inspect.
params (dict) – The parameters to filter.
- Returns:
filtered_params – Parameters present in the signature. If the target accepts **kwargs or its signature cannot be inspected, the original parameter dictionary is returned unchanged.
- Return type:
dict
Notes
This is a convenience helper for reducer implementations that wrap third-party estimators with partially overlapping constructor signatures.
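The filtering behavior described above can be sketched with the standard library's inspect module. This is a standalone approximation, not the coco_pipe source: `filter_params`, `Strict`, and `Flexible` are hypothetical names used only for illustration.

```python
import inspect

def filter_params(fn_or_class, params):
    """Keep only the keys that appear in the target's signature."""
    try:
        signature = inspect.signature(fn_or_class)
    except (TypeError, ValueError):
        return params  # signature cannot be inspected: return unchanged
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD
        for p in signature.parameters.values()
    )
    if accepts_var_kwargs:
        return params  # target takes **kwargs: nothing to filter
    return {k: v for k, v in params.items() if k in signature.parameters}

class Strict:
    def __init__(self, alpha=1.0, beta=2.0):
        self.alpha, self.beta = alpha, beta

class Flexible:
    def __init__(self, alpha=1.0, **extra):
        self.alpha, self.extra = alpha, extra

params = {"alpha": 0.5, "gamma": 3}
strict_kwargs = filter_params(Strict, params)      # drops "gamma"
flexible_kwargs = filter_params(Flexible, params)  # keeps everything
```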
- _build_estimator(estimator_cls: Any, params: dict | None = None, component_param: str | None = 'n_components', **fixed_kwargs: Any) -> Any
Instantiate an estimator with filtered reducer parameters.
- Parameters:
estimator_cls (Any) – Estimator class to instantiate.
params (dict, optional) – Explicit parameter dictionary to filter instead of self.params.
component_param (str or None, default="n_components") – Name of the constructor argument receiving self.n_components. Set to None to skip injecting the component count.
**fixed_kwargs (dict) – Keyword arguments always forwarded to the estimator constructor.
- Returns:
Instantiated estimator.
- Return type:
Any
Notes
This helper assumes the wrapped backend is constructor-driven and can be configured from keyword arguments.
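One plausible reading of this helper is sketched below as a free function. `build_estimator` and `ToyEstimator` are hypothetical, and the precedence order (filtered params first, then the component count, then the fixed keyword arguments) is an assumption, since the documentation does not state how conflicts are resolved.

```python
import inspect

class ToyEstimator:
    """Hypothetical constructor-driven backend."""
    def __init__(self, n_components=2, whiten=False):
        self.n_components = n_components
        self.whiten = whiten

def build_estimator(estimator_cls, n_components, params,
                    component_param="n_components", **fixed_kwargs):
    # Drop keys the constructor does not accept.
    accepted = inspect.signature(estimator_cls).parameters
    kwargs = {k: v for k, v in params.items() if k in accepted}
    # Inject the component count unless the caller opted out with None.
    if component_param is not None:
        kwargs[component_param] = n_components
    # Fixed keyword arguments are always forwarded (assumed to win conflicts).
    kwargs.update(fixed_kwargs)
    return estimator_cls(**kwargs)

est = build_estimator(ToyEstimator, 3, {"whiten": True, "unknown": 1})
est_default = build_estimator(ToyEstimator, 3, {}, component_param=None)
```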
- _require_fitted(method_name: str = 'transform', model: Any = None) -> Any
Validate that a reducer backend has been fitted before access.
- Parameters:
method_name (str, default="transform") – Operation requiring a fitted model.
model (Any, optional) – Backend model to check. Defaults to self.model.
- Returns:
The validated model instance.
- Return type:
Any
- Raises:
RuntimeError – If no fitted model is available.
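A fitted-state guard of this kind can be as small as the sketch below. It is written as a free function with hypothetical names, and the wording of the error message is an assumption.

```python
def require_fitted(model, method_name="transform"):
    """Return the model if it exists, otherwise raise RuntimeError."""
    if model is None:
        raise RuntimeError(
            f"Reducer must be fitted before calling '{method_name}'."
        )
    return model

fitted_backend = object()  # stands in for a fitted estimator
checked = require_fitted(fitted_backend)

try:
    require_fitted(None, method_name="get_components")
    error_message = ""
except RuntimeError as exc:
    error_message = str(exc)
```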
- _merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]
Return a capability mapping updated with reducer-specific overrides.
- Parameters:
base_caps (dict) – Base capability mapping, typically super().capabilities.
**overrides (dict) – Reducer-specific capability values to apply.
- Returns:
Capability mapping with overrides applied.
- Return type:
dict
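The merge can be pictured as a copy followed by an update, as in this sketch. The function name and the default capability values are hypothetical; whether the real helper copies its input is also an assumption.

```python
def merge_capabilities(base_caps, **overrides):
    """Return a new mapping so the base capabilities stay untouched."""
    merged = dict(base_caps)
    merged.update(overrides)
    return merged

# Hypothetical defaults, using key names listed for BaseReducer.
base = {"input_ndim": 2, "has_transform": True, "is_linear": False}
caps = merge_capabilities(base, is_linear=True, has_components=True)
```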
- abstract fit(X: ArrayLike, y: ArrayLike | None = None) -> BaseReducer
Fit the model to the data.
- Parameters:
X (ArrayLike) – Training data. Most reducers expect (n_samples, n_features), but reducers with custom capabilities["input_layout"] may accept other layouts such as snapshot matrices or grouped trajectory tensors.
y (ArrayLike, optional) – Optional supervision aligned with the sample axis used by the reducer's declared input layout.
- Returns:
self – The fitted reducer instance.
- Return type:
BaseReducer
Notes
Most reducers expect X to have shape (n_samples, n_features). Some reducers operate on alternative layouts and should document those layouts through capabilities.
- abstract transform(X: ArrayLike) -> numpy.ndarray
Apply dimensionality reduction to X.
- Parameters:
X (ArrayLike) – New data to transform. Its layout should match the reducer’s declared capabilities.
- Returns:
X_new – Reduced representation. The exact output shape depends on the reducer, but the last dimension usually matches n_components.
- Return type:
np.ndarray
- Raises:
RuntimeError – Raised by concrete implementations when transform is called before fitting or when the reducer does not support out-of-sample transforms.
- fit_transform(X: ArrayLike, y: ArrayLike | None = None) -> numpy.ndarray
Fit the model to data and return the transformed data.
This method usually calls fit and then transform, but reducers may override it for efficiency if the underlying algorithm supports a native combined path.
- Parameters:
X (ArrayLike) – Training data following the reducer’s declared layout.
y (ArrayLike, optional) – Optional supervision aligned with the reducer’s input layout.
- Returns:
X_new – Reduced representation returned by transform.
- Return type:
np.ndarray
- save(filepath: str | os.PathLike) -> None
Persist the reducer to a file.
- Parameters:
filepath (str or Path) – Path to the output file.
Notes
The default implementation serializes the reducer instance with joblib.dump. Custom reducers should either remain joblib-serializable or override this method and load() with a custom persistence strategy.
- property capabilities: Dict[str, Any]
Return reducer capability flags consumed by the manager layer.
Custom reducers with nonstandard inputs should override at least input_ndim and input_layout. Reducers exposing diagnostics or scalar quality metadata should declare them explicitly through supported_diagnostics and supported_metadata.
- Returns:
Mapping of reducer capability flags.
- Return type:
dict
Notes
The default capabilities describe a typical estimator consuming (samples, features) input and exposing transform.
- _attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]
Extract requested attributes from a target object into a dictionary.
This helper filters missing attributes and swallows common access errors (such as deferred scikit-learn properties) to return only what is currently available on the target.
- Parameters:
obj (Any) – Target object to inspect.
attrs (iterable of str) – Attribute names to attempt to extract.
- Returns:
Mapping of available attribute names to their values.
- Return type:
dict
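The sketch below shows one way to collect only the attributes that are currently available. The names are hypothetical, and the choice to swallow AttributeError and NotImplementedError is an assumption about what "common access errors" means here.

```python
def attribute_dict(obj, attrs):
    """Collect attributes that exist and can be read right now."""
    found = {}
    for name in attrs:
        try:
            found[name] = getattr(obj, name)
        except (AttributeError, NotImplementedError):
            continue  # missing or not yet available: skip silently
    return found

class ToyModel:
    n_iter_ = 12

    @property
    def stress_(self):
        # Mimics a deferred property that is unavailable before fitting.
        raise AttributeError("stress_ is only available after fit")

available = attribute_dict(ToyModel(), ["n_iter_", "stress_", "kl_divergence_"])
```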
- get_diagnostics() -> Dict[str, Any]
Return diagnostic arrays or structured artifacts.
Diagnostics are intended for non-scalar outputs such as explained variance curves, eigenvalues, modes, graphs, or training histories. Only names declared in capabilities["supported_diagnostics"] are queried.
- Returns:
diagnostics – Dictionary of diagnostic attributes declared in capabilities["supported_diagnostics"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
- get_quality_metadata() -> Dict[str, Any]
Return scalar metadata about the reduction process or quality.
Typical examples include iteration counts, optimization stress, final loss values, or backend-specific convergence flags. Only names declared in capabilities["supported_metadata"] are queried.
- Returns:
metadata – Dictionary containing only scalar values corresponding to keys declared in capabilities["supported_metadata"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
- get_components() -> numpy.ndarray
Return reducer-defined component-like outputs.
- Returns:
Reducer-defined component array.
- Return type:
np.ndarray
- Raises:
ValueError – If the reducer does not expose public components.
- classmethod load(filepath: str | os.PathLike) -> BaseReducer
Load a reducer from a file.
- Parameters:
filepath (str or Path) – Path to the file to load.
- Returns:
reducer – The loaded reducer instance.
- Return type:
BaseReducer
Notes
This method assumes the reducer was serialized with save or a compatible joblib.dump call.
- class coco_pipe.DimReduction(method: str | coco_pipe.dim_reduction.config.BaseReducerConfig, n_components: int = 2, params: Dict[str, Any] | None = None, **kwargs)
Manage one dimensionality reduction workflow.
- Parameters:
method (str or BaseReducerConfig) – Canonical public reducer name or a typed configuration object. Method names are exact and must match the registry, for example "PCA", "Isomap", "Pacmap", or "TopologicalAE".
n_components (int, default=2) – Target dimensionality when method is a string.
params (dict, optional) – Additional reducer keyword arguments merged into the constructor arguments when method is a string.
**kwargs (dict) – Runtime reducer keyword overrides. These are merged after params.
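The merge order stated above (params first, then **kwargs on top) can be sketched as plain dictionary updates. `merge_reducer_kwargs` is a hypothetical helper, and storing n_components in the same dictionary is an assumption made to keep the example short.

```python
def merge_reducer_kwargs(n_components=2, params=None, **kwargs):
    """Later sources override earlier ones: params, then **kwargs."""
    merged = {"n_components": n_components}
    merged.update(params or {})
    merged.update(kwargs)  # runtime overrides are applied last
    return merged

resolved = merge_reducer_kwargs(
    n_components=2,
    params={"n_neighbors": 10, "metric": "cosine"},
    n_neighbors=15,
)
```

Here `n_neighbors` resolves to 15 because the keyword override is merged after `params`, while `metric` keeps the value supplied through `params`.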
- method
Canonical reducer name.
- Type:
str
- n_components
Target dimensionality used for the reducer instance.
- Type:
int
- reducer
Instantiated reducer backend.
- Type:
BaseReducer
- metrics_
Cached scalar evaluation summaries from the latest score() call.
- Type:
dict
- quality_metadata_
Cached scalar reducer metadata exposed through the reducer contract.
- Type:
dict
- diagnostics_
Cached non-scalar diagnostic artifacts exposed through the reducer contract or the evaluation layer.
- Type:
dict
- metric_records_
Cached tidy metric observations produced by the evaluator.
- Type:
list of dict
- interpretation_
Cached feature interpretation payloads from the latest interpret() call.
- Type:
dict
- interpretation_records_
Cached tidy feature-interpretation observations.
- Type:
list of dict
See also
coco_pipe.dim_reduction.analysis.interpret_features – Pure interpretation backend used by interpret().
coco_pipe.dim_reduction.evaluation.core.evaluate_embedding – Pure evaluator used by score().
coco_pipe.dim_reduction.evaluation.core.MethodSelector – Post-hoc comparison and ranking over already-scored reducers.
coco_pipe.viz.dim_reduction – Plotting utilities for embeddings, metrics, and diagnostics.
Examples
>>> reducer = DimReduction("UMAP", n_components=2, n_neighbors=15)
>>> embedding = reducer.fit_transform(X)
>>> scores = reducer.score(embedding, X=X)
>>> "trustworthiness" in scores["metrics"]
True
>>> interpretation = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> "correlation" in interpretation["analysis"]
True
- reducer_kwargs
- metrics_: Dict[str, Any]
- quality_metadata_: Dict[str, Any]
- diagnostics_: Dict[str, Any]
- metric_records_: List[Dict[str, Any]] = []
- interpretation_: Dict[str, Any]
- interpretation_records_: List[Dict[str, Any]] = []
- property random_state: int | None
Return the random seed from parameters if any.
- property capabilities: Dict[str, Any]
Return reducer capability metadata through the manager interface.
- _validate_input(X: Any) -> numpy.ndarray
Validate reducer input shape and coerce to a NumPy array.
- Parameters:
X (array-like or MNE object) – Input data accepted by the reducer. Objects exposing get_data() are unwrapped before validation.
- Returns:
X – Validated reducer input.
- Return type:
np.ndarray
- Raises:
ValueError – If the input dimensionality does not match the reducer contract.
- fit(X: Any, y: Any | None = None) -> DimReduction
Fit the reducer on the provided data.
- Parameters:
X (array-like or MNE object) – Input data in the reducer's native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.
- Returns:
self – The fitted reducer.
- Return type:
DimReduction
- transform(X: Any) -> numpy.ndarray
Transform new data with a fitted reducer.
- Parameters:
X (array-like or MNE object) – Input data in the reducer’s native layout.
- Returns:
X_emb – Reduced representation returned by the reducer.
- Return type:
np.ndarray
- fit_transform(X: Any, y: Any | None = None) -> numpy.ndarray
Fit the reducer and return the reduced representation.
- Parameters:
X (array-like or MNE object) – Input data in the reducer’s native layout.
y (array-like, optional) – Optional supervision forwarded to the reducer.
- Returns:
X_emb – Reduced representation returned by the reducer.
- Return type:
np.ndarray
- get_components() -> numpy.ndarray
Return reducer-defined component-like outputs.
- Returns:
components – Component-like array exposed by the reducer.
- Return type:
np.ndarray
- Raises:
ValueError – If the reducer does not expose public components.
- score(X_emb: numpy.ndarray, X: Any = None, n_neighbors: int = 5, metrics: List[str] | None = None, k_values: List[int] | None = None, labels: numpy.ndarray | None = None, groups: numpy.ndarray | None = None, times: numpy.ndarray | None = None, separation_method: str = 'centroid') -> Dict[str, Dict[str, Any]]
Evaluate an explicit embedding against the original data.
- Parameters:
X_emb (array-like) – Embedded data to evaluate.
X (array-like, optional) – Original high-dimensional data in evaluation-ready layout. This is required for standard 2D metrics and optional for native 3D trajectory metrics.
n_neighbors (int, default=5) – K-nearest neighbors size for metric computation.
metrics (list of str, optional) – Metric selectors to compute. None evaluates all metric families available for the embedding shape.
k_values (list of int, optional) – Neighborhood sizes used for multi-scale standard metric evaluation.
labels (np.ndarray, optional) – Optional labels aligned with the embedding. Used for trajectory separation when X_emb is 3D and for explicit supervised 2D metrics when requested.
groups (np.ndarray, optional) – Optional grouping variable aligned with the embedding. Required by grouped supervised evaluation metrics such as separation_logreg_balanced_accuracy.
times (np.ndarray, optional) – Optional trajectory time coordinates aligned with the trajectory length axis.
separation_method (str, default="centroid") – Separation definition passed to trajectory evaluation when labels are available for native 3D trajectory embeddings.
- Returns:
scores – Dictionary with keys "metrics", "metadata", and "diagnostics".
- Return type:
dict
Notes
score() does not infer or cache embeddings. Callers must pass X_emb explicitly. X is only required when the requested evaluation path needs the original high-dimensional samples.
- interpret(X: numpy.ndarray, *, X_emb: numpy.ndarray, analyses: List[str] | None = None, feature_names: List[str] | None = None, n_repeats: int = 5, random_state: int | None = None) -> Dict[str, Any]
Run feature interpretation analyses for an explicit embedding.
- Parameters:
X (np.ndarray) – Original input data.
X_emb (np.ndarray) – Explicit embedding aligned with X.
analyses (list of {"correlation", "perturbation", "gradient"}, optional) – Interpretation analyses to compute. None defaults to ["correlation"].
feature_names (list of str, optional) – Feature names aligned with the columns of X when the requested interpretation returns feature-keyed outputs.
n_repeats (int, default=5) – Number of shuffles per feature for perturbation importance.
random_state (int, optional) – Random seed for perturbation importance.
- Returns:
Dictionary with keys "analysis" and "records".
- Return type:
dict
Notes
interpret() does not fit the reducer or compute embeddings. Callers must pass both X and X_emb explicitly.
See also
coco_pipe.dim_reduction.analysis.interpret_features – Pure interpretation backend used by this manager method.
score – Evaluate structure-preservation metrics for an explicit embedding.
Examples
>>> reducer = DimReduction("PCA", n_components=2)
>>> embedding = reducer.fit_transform(X)
>>> result = reducer.interpret(
...     X,
...     X_emb=embedding,
...     analyses=["correlation"],
...     feature_names=feature_names,
... )
>>> sorted(result)
['analysis', 'records']
- get_diagnostics() -> Dict[str, Any]
Return cached diagnostics merged with reducer diagnostics.
- Returns:
diagnostics – Diagnostic artifacts declared by the reducer contract and the evaluation layer.
- Return type:
dict
- get_quality_metadata() -> Dict[str, Any]
Return cached scalar metadata merged with reducer metadata.
- Returns:
metadata – Scalar metadata declared by the reducer contract and the evaluation layer.
- Return type:
dict
- get_summary() -> Dict[str, Any]
Return a normalized summary payload for report and export paths.
- Returns:
Plain dictionary containing method identity, cached scalar summaries, reducer metadata, diagnostics, tidy metric records, and capability flags, plus cached feature interpretation payloads.
- Return type:
dict
Notes
The summary does not include an embedding payload. Embeddings are handled explicitly outside the manager and must be passed directly to plotting or reporting utilities that need them.
- save(path: str | pathlib.Path)
Save the underlying reducer to disk.
- Parameters:
path (str or Path) – Output path for reducer persistence.
Notes
Only the reducer model is persisted. Cached manager state such as metrics and diagnostics is not included.
- classmethod load(path: str | pathlib.Path, method: str) -> DimReduction
Load a persisted reducer and wrap it in a fresh manager.
- Parameters:
path (str or Path) – Path to a serialized reducer saved with save().
method (str) – Canonical public reducer name used to reconstruct the manager.
- Returns:
Fresh manager wrapping the loaded reducer model.
- Return type:
DimReduction
Notes
This restores the reducer model only. Cached manager state such as scores, diagnostics, and metric records is not persisted.
- class coco_pipe.IncrementalPCAReducer(n_components: int = 2, batch_size: int | None = None, **kwargs)
Bases:
coco_pipe.dim_reduction.reducers.base.BaseReducer
Incremental PCA reducer.
This reducer wraps sklearn.decomposition.IncrementalPCA for batch-wise fitting when the full dataset is too large to process in one pass.
- Parameters:
n_components (int, default=2) – Number of principal components to keep.
batch_size (int, optional) – Number of samples processed per batch.
**kwargs (dict) – Additional keyword arguments forwarded to IncrementalPCA after signature filtering.
- batch_size
Batch size used when fitting the incremental estimator.
- Type:
int or None
- model
Fitted IncrementalPCA estimator after fit or partial_fit.
- Type:
sklearn.decomposition.IncrementalPCA or None
See also
PCAReducer – Standard in-memory linear PCA reducer.
DaskPCAReducer – Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer – Linear factorization alternative for lazy arrays.
IsomapReducer – Nonlinear manifold learner based on geodesic distances.
TSNEReducer – Nonlinear neighborhood-preserving embedding.
UMAPReducer – Nonlinear graph-based embedding balancing local and global structure.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(100, 12)
>>> reducer = IncrementalPCAReducer(n_components=3, batch_size=25)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 3)
>>> stream = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = stream.partial_fit(X[:50])
>>> _ = stream.partial_fit(X[50:])
>>> stream.transform(X).shape
(100, 2)
- property capabilities: dict
Return capability metadata for Incremental PCA.
- Returns:
Capability mapping describing Incremental PCA as a linear component-based reducer.
- Return type:
dict
- batch_size = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IncrementalPCAReducer
Fit Incremental PCA in batch mode.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
IncrementalPCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=10)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- partial_fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IncrementalPCAReducer
Incrementally fit the estimator on a batch of samples.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Batch of training samples.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Reducer instance after updating the incremental estimator.
- Return type:
IncrementalPCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(40, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = reducer.partial_fit(X[:20])
>>> _ = reducer.partial_fit(X[20:])
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data onto the fitted incremental PCA basis.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Projected coordinates in component space.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
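Because this reducer wraps sklearn.decomposition.IncrementalPCA, the difference between batch fitting and streaming fitting can be shown directly on the scikit-learn estimator. The sketch below uses only scikit-learn and NumPy and makes no claim about how the coco_pipe wrapper forwards its arguments; the chunk size of 25 is an arbitrary choice.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 12))

# Batch mode: fit() splits X into chunks of batch_size internally.
batch_model = IncrementalPCA(n_components=3, batch_size=25).fit(X)

# Streaming mode: the caller feeds chunks one at a time.
stream_model = IncrementalPCA(n_components=3)
for start in range(0, X.shape[0], 25):
    stream_model.partial_fit(X[start:start + 25])

batch_emb = batch_model.transform(X[:10])
stream_emb = stream_model.transform(X[:10])
```

Each chunk passed to partial_fit must contain at least n_components samples, which is why the chunk size should not be smaller than the target dimensionality.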
- class coco_pipe.IsomapReducer(n_components: int = 2, **kwargs)
Bases:
coco_pipe.dim_reduction.reducers.base.BaseReducer
Isometric Mapping reducer.
Isomap estimates geodesic distances on a nearest-neighbor graph and then computes a low-dimensional embedding consistent with those distances.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.Isomap after signature filtering. Common options include n_neighbors, metric, p, and eigen_solver.
- model
Fitted Isomap estimator after fit.
- Type:
sklearn.manifold.Isomap or None
See also
LLEReducer – Nonlinear local-neighborhood manifold embedding.
MDSReducer – Distance-preserving manifold embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(100, 10)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:8]).shape
(8, 2)
>>> reducer.n_features_in_
10
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
- property capabilities: dict
Return capability metadata for Isomap.
- Returns:
Capability mapping describing Isomap as a nonlinear reducer with out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IsomapReducer
Fit Isomap on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
IsomapReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=4)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data into the fitted Isomap embedding space.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property reconstruction_error_: float | None
Return the Isomap reconstruction error.
- Returns:
Reconstruction error returned by the fitted estimator.
- Return type:
float
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.LLEReducer(n_components: int = 2, **kwargs)
Bases:
coco_pipe.dim_reduction.reducers.base.BaseReducer
Locally Linear Embedding reducer.
LLE learns a nonlinear embedding by reconstructing each point from its local neighborhood in the input space and preserving those reconstruction weights in the low-dimensional space.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.LocallyLinearEmbedding after signature filtering. Common options include n_neighbors, method, eigen_solver, and random_state.
- model
Fitted LLE estimator after fit.
- Type:
sklearn.manifold.LocallyLinearEmbedding or None
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
MDSReducer – Distance-preserving manifold embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = LLEReducer(n_components=2, n_neighbors=10, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:6]).shape
(6, 2)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
- property capabilities: dict
Return capability metadata for LLE.
- Returns:
Capability mapping describing LLE as a nonlinear reducer with out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) LLEReducer[source]¶
Fit LLE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
LLEReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = LLEReducer(n_components=2, n_neighbors=5, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
>>> reducer = LLEReducer(n_components=2, method="modified", n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) numpy.ndarray[source]¶
Project data into the fitted LLE embedding space.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property reconstruction_error_: float¶
Return the LLE reconstruction error.
- Returns:
Reconstruction error associated with the embedding.
- Return type:
float
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.MDSReducer(n_components: int = 2, **kwargs)[source]¶
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Multidimensional Scaling reducer.
MDS seeks a low-dimensional representation whose pairwise distances best match the pairwise distances in the original space.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.MDS after signature filtering. Common options include metric, n_init, max_iter, dissimilarity, and random_state.
- model¶
Fitted MDS estimator after fit or fit_transform.
- Type:
sklearn.manifold.MDS or None
Notes
transform is not supported because scikit-learn MDS does not provide an out-of-sample projection API.
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
LLEReducer – Nonlinear local-neighborhood embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(60, 8)
>>> reducer = MDSReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(60, 2)
>>> reducer.stress_ >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- property capabilities: dict¶
Return capability metadata for MDS.
- Returns:
Capability mapping describing MDS as a nonlinear reducer without out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) MDSReducer[source]¶
Fit MDS on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
MDSReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(25, 5)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) numpy.ndarray[source]¶
Raise because scikit-learn MDS does not support out-of-sample transform.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because MDS does not support transforming new data.
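Callers that handle several reducers generically may want a fallback for reducers whose transform always raises. The sketch below shows one possible pattern: try transform, and on NotImplementedError re-fit on the training and new rows together. FitOnlyReducer and embed_with_fallback are hypothetical names used only for illustration; they are not part of coco_pipe, and the stand-in embedding is a placeholder rather than MDS.

```python
import numpy as np

class FitOnlyReducer:
    """Hypothetical stand-in for a reducer without out-of-sample support."""

    def fit_transform(self, X):
        X = np.asarray(X, dtype=float)
        return X[:, :2] - X[:, :2].mean(axis=0)  # placeholder embedding

    def transform(self, X):
        raise NotImplementedError("no out-of-sample transform")

def embed_with_fallback(reducer, X_train, X_new):
    """Use transform when available; otherwise re-fit on all rows jointly."""
    try:
        return reducer.transform(X_new)
    except NotImplementedError:
        joint = reducer.fit_transform(np.vstack([X_train, X_new]))
        return joint[len(X_train):]              # rows belonging to X_new

rng = np.random.default_rng(0)
X_train, X_new = rng.random((20, 4)), rng.random((5, 4))
new_coords = embed_with_fallback(FitOnlyReducer(), X_train, X_new)
print(new_coords.shape)  # (5, 2)
```

Re-fitting produces a new embedding for the training rows as well, so coordinates from an earlier fit are not comparable with the returned ones.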
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) numpy.ndarray[source]¶
Fit MDS and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by MDS.
- Return type:
np.ndarray of shape (n_samples, n_components)
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(20, 4)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)
- property stress_: float¶
Return the MDS stress (sum of squared mismatches between embedded and target distances).
- Returns:
Stress value returned by the fitted MDS model.
- Return type:
float
- Raises:
RuntimeError – If the reducer has not been fitted.
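To make the idea of stress concrete, the following NumPy sketch computes a raw stress value as the sum of squared differences between original and embedded pairwise distances, counting each pair once. The function name raw_stress is an assumption, and the value reported by the fitted scikit-learn estimator may differ from this by a constant factor or a normalization, depending on the scikit-learn version and options.

```python
import numpy as np

def pairwise_distances(X):
    """Dense Euclidean distance matrix."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def raw_stress(X, embedding):
    """Hypothetical helper: squared distance mismatch summed over unordered pairs."""
    d_orig = pairwise_distances(np.asarray(X, dtype=float))
    d_emb = pairwise_distances(np.asarray(embedding, dtype=float))
    iu = np.triu_indices(d_orig.shape[0], k=1)   # each unordered pair once
    return float(((d_orig[iu] - d_emb[iu]) ** 2).sum())

rng = np.random.default_rng(0)
X = rng.random((20, 4))
print(raw_stress(X, X))              # 0.0, distances are reproduced exactly
print(raw_stress(X, X[:, :2]) > 0)   # True, dropping coordinates distorts distances
```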
- class coco_pipe.PCAReducer(n_components: int = 2, **kwargs)[source]¶
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Principal Component Analysis reducer.
This reducer wraps sklearn.decomposition.PCA and provides a linear low-dimensional embedding based on singular value decomposition.
- Parameters:
n_components (int, default=2) – Number of principal components to keep.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.decomposition.PCA after signature filtering. Common options include whiten, svd_solver, and random_state.
- model¶
Fitted PCA estimator after fit.
- Type:
sklearn.decomposition.PCA or None
Notes
This is a deterministic linear reducer unless a randomized solver is used.
See also
IncrementalPCAReducer – Linear PCA variant for batch-wise fitting.
DaskPCAReducer – Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer – Linear factorization alternative for lazy arrays.
IsomapReducer – Nonlinear manifold learner based on geodesic distances.
TSNEReducer – Nonlinear neighborhood-preserving embedding.
UMAPReducer – Nonlinear graph-based embedding balancing local and global structure.
PHATEReducer – Nonlinear diffusion-based embedding for smooth trajectories.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(100, 10)
>>> reducer = PCAReducer(n_components=2, random_state=42)
>>> _ = reducer.fit(X)
>>> X_reduced = reducer.transform(X)
>>> X_reduced.shape
(100, 2)
>>> reducer.explained_variance_ratio_.shape
(2,)
>>> reducer.components_.shape
(2, 10)
>>> reducer = PCAReducer(n_components=3, whiten=True)
>>> reducer.fit_transform(X).shape
(100, 3)
- property capabilities: dict¶
Return capability metadata for PCA.
- Returns:
Capability mapping describing PCA as a linear component-based reducer.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) PCAReducer[source]¶
Fit PCA on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
PCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(20, 5)
>>> reducer = PCAReducer(n_components=2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) numpy.ndarray[source]¶
Project data onto the fitted principal component basis.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Projected coordinates in principal component space.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property explained_variance_ratio_: numpy.ndarray¶
Fraction of total variance explained by each selected component.
- Returns:
Explained variance ratio for each retained component.
- Return type:
np.ndarray of shape (n_components,)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property components_: numpy.ndarray¶
Principal axes in feature space.
- Returns:
Principal component loading matrix.
- Return type:
np.ndarray of shape (n_components, n_features)
- Raises:
RuntimeError – If the reducer has not been fitted.
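The quantities exposed by this reducer (projected coordinates, components_, and explained_variance_ratio_) all follow from a singular value decomposition of the centered data. The NumPy sketch below shows that relationship under the usual PCA conventions; pca_svd is a hypothetical helper, and component signs may differ from those of the scikit-learn estimator.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """Hypothetical helper: PCA quantities from an SVD of the centered data."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                          # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                   # principal axes, one per row
    variance = S ** 2 / (X.shape[0] - 1)             # variance along every axis
    ratio = variance[:n_components] / variance.sum() # fraction of total variance
    projected = Xc @ components.T                    # coordinates in component space
    return projected, components, ratio

rng = np.random.default_rng(42)
X = rng.random((100, 10))
projected, components, ratio = pca_svd(X, n_components=2)
print(projected.shape, components.shape, ratio.shape)  # (100, 2) (2, 10) (2,)
```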
- class coco_pipe.SpectralEmbeddingReducer(n_components: int = 2, **kwargs)[source]¶
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Spectral Embedding reducer.
Spectral Embedding computes a nonlinear embedding using eigenvectors of the graph Laplacian built from the data affinity graph.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.SpectralEmbedding after signature filtering. Common options include affinity, gamma, random_state, eigen_solver, and n_neighbors.
- model¶
Fitted spectral embedding estimator after fit or fit_transform.
- Type:
sklearn.manifold.SpectralEmbedding or None
Notes
transform is not supported because scikit-learn SpectralEmbedding does not provide an out-of-sample projection API.
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
LLEReducer – Nonlinear local-neighborhood embedding.
MDSReducer – Distance-preserving manifold embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(80, 10)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(80, 2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
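The graph Laplacian construction behind Spectral Embedding can be sketched in NumPy as follows. The sketch assumes a dense RBF affinity with a fixed gamma and the symmetric normalized Laplacian; spectral_embedding_sketch is a hypothetical helper, and the scikit-learn estimator wrapped by this reducer may use a different affinity, solver, and scaling.

```python
import numpy as np

def spectral_embedding_sketch(X, n_components=2, gamma=1.0):
    """Hypothetical helper: embed with eigenvectors of a normalized graph Laplacian."""
    X = np.asarray(X, dtype=float)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    A = np.exp(-gamma * sq_dists)                 # RBF affinity graph
    np.fill_diagonal(A, 0.0)                      # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)                # eigenvalues ascending
    # The first eigenvector is trivial (eigenvalue ~0), so it is skipped.
    embedding = vecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return embedding, vals

rng = np.random.default_rng(42)
X = rng.random((80, 10))
embedding, eigenvalues = spectral_embedding_sketch(X, n_components=2)
print(embedding.shape)  # (80, 2)
```

Because the eigenvectors are defined only for the nodes of the training graph, there is no direct way to place a new sample, which is consistent with the missing transform noted above.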
- property capabilities: dict¶
Return capability metadata for Spectral Embedding.
- Returns:
Capability mapping describing Spectral Embedding as a nonlinear reducer without out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) SpectralEmbeddingReducer[source]¶
Fit Spectral Embedding on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
SpectralEmbeddingReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(30, 6)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) numpy.ndarray[source]¶
Raise because scikit-learn Spectral Embedding lacks out-of-sample transform.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because Spectral Embedding does not support transforming new data.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) numpy.ndarray[source]¶
Fit Spectral Embedding and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by Spectral Embedding.
- Return type:
np.ndarray of shape (n_samples, n_components)
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(20, 4)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)
- class coco_pipe.TSNEReducer(n_components: int = 2, **kwargs)[source]¶
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
t-SNE reducer.
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a neighborhood- preserving method designed primarily for visualization. It optimizes a low-dimensional embedding by matching pairwise similarities between the original space and the embedding.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.TSNE after signature filtering. Common options include perplexity, learning_rate, max_iter, init, and random_state.
- embedding_¶
Learned training-set embedding after fit or fit_transform.
- Type:
np.ndarray or None
- model¶
Fitted t-SNE estimator after fit or fit_transform.
- Type:
sklearn.manifold.TSNE or None
Notes
transform is not supported because scikit-learn t-SNE does not provide an out-of-sample projection API.
See also
UMAPReducer – Nonlinear graph-based embedding with transform support.
PacmapReducer – Nonlinear embedding balancing local and global structure.
TrimapReducer – Nonlinear triplet-based embedding preserving global layout.
PHATEReducer – Diffusion-based embedding for continuous trajectories.
PCAReducer – Linear baseline for global variance preservation.
IsomapReducer – Nonlinear geodesic-distance manifold embedding.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = TSNEReducer(n_components=2, perplexity=20, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
>>> reducer.get_quality_metadata()["kl_divergence_"] >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.embedding_.shape
(100, 2)
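The kl_divergence_ value queried in the example is the objective t-SNE minimizes: the Kullback-Leibler divergence between pairwise similarities in the original space and in the embedding. The NumPy sketch below evaluates that objective for a given embedding. It is deliberately simplified: it uses one fixed Gaussian bandwidth sigma instead of the per-point perplexity calibration of real t-SNE, and all function names are assumptions.

```python
import numpy as np

def squared_distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return (diff ** 2).sum(axis=-1)

def gaussian_similarities(X, sigma=1.0):
    """High-dimensional similarities (simplified: one global bandwidth)."""
    P = np.exp(-squared_distances(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def student_t_similarities(Y):
    """Low-dimensional similarities with a heavy-tailed Student-t kernel."""
    Q = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_divergence(P, Q):
    """KL(P || Q) over the entries where P is nonzero."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))

rng = np.random.default_rng(0)
X = rng.random((50, 10))
P = gaussian_similarities(X)
kl_truncated = kl_divergence(P, student_t_similarities(X[:, :2]))
kl_random = kl_divergence(P, student_t_similarities(rng.random((50, 2))))
print(kl_truncated >= 0, kl_random >= 0)  # True True
```

An optimizer would move the embedding coordinates to lower this value; the sketch only evaluates it.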
- property capabilities: dict¶
Return capability metadata for t-SNE.
- Returns:
Capability mapping describing t-SNE as a nonlinear stochastic reducer without out-of-sample transform support.
- Return type:
dict
- embedding_ = None¶
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) TSNEReducer[source]¶
Fit t-SNE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
TSNEReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = TSNEReducer(n_components=2, perplexity=5, max_iter=250)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) numpy.ndarray[source]¶
Raise because t-SNE does not support out-of-sample transformation.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because t-SNE does not support transforming new data.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) numpy.ndarray[source]¶
Fit t-SNE and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by t-SNE.
- Return type:
np.ndarray of shape (n_samples, n_components)
- coco_pipe.continuity(Q: numpy.ndarray, k: int) float[source]¶
Compute continuity from a co-ranking matrix.
Continuity penalizes extrusions, i.e. points that are among the k nearest neighbors in the original space but are pushed farther away in the embedding.
- Parameters:
Q (np.ndarray of shape (n_samples - 1, n_samples - 1)) – Co-ranking matrix.
k (int) – Neighborhood size. The normalization used by continuity requires 2 * n_samples - 3 * k - 1 > 0.
- Returns:
Continuity score in [0, 1]. Higher is better.
- Return type:
float
- Raises:
ValueError – If Q is invalid or if k falls outside the valid domain.
See also
trustworthiness – Complementary intrusion-based metric.
compute_coranking_matrix – Construct the required co-ranking matrix.
Examples
>>> import numpy as np
>>> from coco_pipe import continuity
>>> Q = np.diag([1, 1, 1, 1])
>>> continuity(Q, k=1)
1.0
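For orientation, here is a minimal NumPy sketch of how a continuity score can be derived from a co-ranking matrix, using the standard rank-based formula with the normalization n * k * (2 * n - 3 * k - 1). It assumes rows of Q index original-space ranks and columns index embedding ranks; the library may use the opposite orientation. The helpers coranking_sketch and continuity_sketch are hypothetical and are not the coco_pipe implementations.

```python
import numpy as np

def coranking_sketch(X_high, X_low):
    """Hypothetical helper: count (original rank, embedding rank) pairs."""
    def ranks(X):
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        # argsort of argsort turns distances into per-row ranks (self has rank 0)
        return d.argsort(axis=1).argsort(axis=1)
    n = X_high.shape[0]
    r_high, r_low = ranks(X_high), ranks(X_low)
    off_diag = ~np.eye(n, dtype=bool)             # ignore each point's own rank
    Q = np.zeros((n - 1, n - 1), dtype=int)
    np.add.at(Q, (r_high[off_diag] - 1, r_low[off_diag] - 1), 1)
    return Q

def continuity_sketch(Q, k):
    n = Q.shape[0] + 1
    norm = n * k * (2 * n - 3 * k - 1)
    if k < 1 or norm <= 0:
        raise ValueError("k is outside the valid domain")
    penalty = np.arange(k + 1, n) - k             # embedding rank minus k
    extrusions = (Q[:k, k:] * penalty).sum()      # original rank <= k, embedding rank > k
    return float(1.0 - 2.0 * extrusions / norm)

rng = np.random.default_rng(0)
X = rng.random((30, 5))
print(continuity_sketch(coranking_sketch(X, X), k=5))   # 1.0, ranks are unchanged
score = continuity_sketch(coranking_sketch(X, X[:, :2]), k=5)
print(0.0 <= score <= 1.0)                               # True
```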
- coco_pipe.interpret_features(X: numpy.ndarray, *, X_emb: numpy.ndarray | None = None, model: Any | None = None, analyses: Sequence[str] | None = None, feature_names: Sequence[str] | None = None, method_name: str = 'embedding', n_repeats: int = 5, random_state: int | None = None) Dict[str, Any][source]¶
Run one or more feature interpretation analyses.
- Parameters:
X (np.ndarray) – Original input data.
X_emb (np.ndarray, optional) – Explicit embedding used by correlation-based analysis.
model (Any, optional) – Fitted reducer or model used by importance analyses.
analyses (sequence of {"correlation", "perturbation", "gradient"}, optional) – Analyses to compute. None defaults to ("correlation",).
feature_names (sequence of str, optional) – Feature names aligned with X when the requested analysis returns feature-keyed outputs.
method_name (str, default="embedding") – Display name written into the returned analysis records.
n_repeats (int, default=5) – Number of permutations per feature for perturbation importance.
random_state (int, optional) – Random seed for perturbation importance.
- Returns:
Dictionary with keys:
analysis: nested analysis payloads
records: tidy analysis records as list[dict]
- Return type:
dict
- Raises:
ValueError – If a requested analysis is unsupported, missing required inputs, or lacks required feature names.
Notes
This function is a pure interpretation backend for manager, report, or visualization workflows. It does not fit models, compute embeddings, or mutate reducer state.
See also
correlate_features – Feature-to-dimension interpretation from explicit embeddings.
perturbation_importance – Model-agnostic importance based on shuffled features.
gradient_importance – Encoder saliency for supported torch-based reducers.
Examples
>>> import numpy as np
>>> from coco_pipe import interpret_features
>>> class MockReducer:
...     def transform(self, X):
...         return X[:, :2]
>>> X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]])
>>> X_emb = X[:, :2]
>>> result = interpret_features(
...     X,
...     X_emb=X_emb,
...     model=MockReducer(),
...     analyses=["correlation", "perturbation"],
...     feature_names=["f1", "f2"],
...     n_repeats=1,
...     random_state=0,
... )
>>> sorted(result)
['analysis', 'records']
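The perturbation analysis relies on shuffling one feature at a time and measuring how far the embedding moves. The sketch below illustrates that idea with n_repeats permutations per feature and a mean squared displacement score. The scoring rule, the LinearProjector class, and perturbation_importance_sketch are assumptions for illustration; the metric used by the library's perturbation_importance may differ.

```python
import numpy as np

class LinearProjector:
    """Hypothetical fitted model with a transform method; it ignores feature 2."""

    def __init__(self):
        self.W = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])

    def transform(self, X):
        return np.asarray(X, dtype=float) @ self.W

def perturbation_importance_sketch(model, X, n_repeats=5, random_state=None):
    rng = np.random.default_rng(random_state)
    baseline = model.transform(X)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])   # shuffle one feature
            shifted = model.transform(X_perm)
            # mean squared displacement of the embedding caused by the shuffle
            scores[j] += np.mean(np.sum((shifted - baseline) ** 2, axis=1))
    return scores / n_repeats

rng = np.random.default_rng(0)
X = rng.random((50, 3))
scores = perturbation_importance_sketch(LinearProjector(), X, n_repeats=3, random_state=0)
print(np.isclose(scores[2], 0.0))  # True, the model never reads feature 2
```

Like interpret_features, the sketch only calls transform on an already fitted model and leaves it unchanged.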
- coco_pipe.lcmc(Q: numpy.ndarray, k: int) float[source]¶
Compute the local continuity meta-criterion (LCMC).
- Parameters:
Q (np.ndarray of shape (n_samples - 1, n_samples - 1)) – Co-ranking matrix.
k (int) – Neighborhood size.
- Returns:
LCMC score. Higher is better.
- Return type:
float
- Raises:
ValueError – If Q is invalid or if k falls outside the valid domain.
See also
trustworthiness – Neighbor-preservation metric.
continuity – Neighbor-consistency metric.
Examples
>>> import numpy as np
>>> from coco_pipe import lcmc
>>> Q = np.diag([1, 1, 1, 1])
>>> isinstance(lcmc(Q, k=1), float)
True
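A common definition of LCMC subtracts the overlap expected by chance, k / (n_samples - 1), from the fraction of k-nearest neighbors shared by both spaces. The sketch below implements that definition under the assumption that it matches the one used here; lcmc_sketch is a hypothetical helper, and both matrices are synthetic (a perfect embedding and a fully reversed ranking).

```python
import numpy as np

def lcmc_sketch(Q, k):
    """Hypothetical helper: neighborhood overlap minus its chance level."""
    n = Q.shape[0] + 1
    if not 1 <= k <= n - 1:
        raise ValueError("k is outside the valid domain")
    q_nx = Q[:k, :k].sum() / (k * n)      # shared k-nearest-neighbor fraction
    return float(q_nx - k / (n - 1))

n = 6
Q_perfect = n * np.eye(n - 1, dtype=int)  # every rank preserved
Q_reversed = np.fliplr(Q_perfect)         # synthetic case: every rank reversed
print(lcmc_sketch(Q_perfect, k=2))        # 0.6
print(lcmc_sketch(Q_reversed, k=2))       # -0.4
```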
- coco_pipe.shepard_diagram_data(X: numpy.ndarray, X_embedded: numpy.ndarray, sample_size: int = 1000, random_state: int | None = None) Tuple[numpy.ndarray, numpy.ndarray][source]¶
Compute sampled pairwise distances for a Shepard diagram.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Original high-dimensional data.
X_embedded (np.ndarray of shape (n_samples, n_components)) – Low-dimensional embedding of the same samples.
sample_size (int, default=1000) – Number of samples to keep before computing pairwise distances. If sample_size is at least n_samples, all samples are used.
random_state (int, optional) – Random seed used when subsampling.
- Returns:
Pairwise distances in the original and embedded spaces.
- Return type:
tuple[np.ndarray, np.ndarray]
- Raises:
ValueError – If the inputs are invalid or if sample_size <= 1.
See also
compute_coranking_matrix – Rank-based global quality summary.
Examples
>>> import numpy as np
>>> from coco_pipe import shepard_diagram_data
>>> X = np.random.RandomState(0).rand(10, 3)
>>> X_emb = X[:, :2]
>>> d_orig, d_emb = shepard_diagram_data(X, X_emb, sample_size=5, random_state=0)
>>> len(d_orig) == len(d_emb)
True
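The subsample-then-measure procedure described by the parameters can be sketched as follows. The sketch assumes Euclidean distances and returns one value per unordered pair of sampled points; whether shepard_diagram_data uses the same metric and output layout is not stated here, and shepard_sketch is a hypothetical name.

```python
import numpy as np

def shepard_sketch(X, X_embedded, sample_size=1000, random_state=None):
    """Hypothetical helper: matched pairwise distances in both spaces."""
    X = np.asarray(X, dtype=float)
    X_embedded = np.asarray(X_embedded, dtype=float)
    if X.shape[0] != X_embedded.shape[0]:
        raise ValueError("X and X_embedded must describe the same samples")
    if sample_size <= 1:
        raise ValueError("sample_size must be greater than 1")
    n = X.shape[0]
    if sample_size < n:
        rng = np.random.default_rng(random_state)
        idx = rng.choice(n, size=sample_size, replace=False)  # same rows in both spaces
        X, X_embedded = X[idx], X_embedded[idx]
    iu = np.triu_indices(X.shape[0], k=1)                     # each pair once

    def pair_distances(A):
        return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)[iu]

    return pair_distances(X), pair_distances(X_embedded)

rng = np.random.default_rng(0)
X = rng.random((10, 3))
d_orig, d_emb = shepard_sketch(X, X[:, :2], sample_size=5, random_state=0)
print(d_orig.shape, d_emb.shape)  # (10,) (10,)
```

Plotting d_emb against d_orig gives the Shepard diagram; points near the diagonal indicate well-preserved distances.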
- coco_pipe.trustworthiness(Q: numpy.ndarray, k: int) float[source]¶
Compute trustworthiness from a co-ranking matrix.
Trustworthiness penalizes intrusions, i.e. points that appear among the k nearest neighbors in the embedding but were farther away in the original space.
- Parameters:
Q (np.ndarray of shape (n_samples - 1, n_samples - 1)) – Co-ranking matrix.
k (int) – Neighborhood size. The normalization used by trustworthiness requires 2 * n_samples - 3 * k - 1 > 0.
- Returns:
Trustworthiness score in [0, 1]. Higher is better.
- Return type:
float
- Raises:
ValueError – If Q is invalid or if k falls outside the valid domain.
See also
continuity – Complementary extrusion-based metric.
compute_coranking_matrix – Construct the required co-ranking matrix.
Examples
>>> import numpy as np
>>> from coco_pipe import trustworthiness
>>> Q = np.diag([1, 1, 1, 1])
>>> trustworthiness(Q, k=1)
1.0
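A small worked case may help show how a single intrusion lowers the score. The sketch assumes rows of Q index original-space ranks and columns index embedding ranks, and uses the standard normalization n * k * (2 * n - 3 * k - 1); the library's orientation may differ. trustworthiness_sketch is a hypothetical helper, and Q is a hand-built matrix for six samples in which one pair moves from original rank 4 to embedding rank 1.

```python
import numpy as np

def trustworthiness_sketch(Q, k):
    n = Q.shape[0] + 1
    norm = n * k * (2 * n - 3 * k - 1)
    if k < 1 or norm <= 0:
        raise ValueError("k is outside the valid domain")
    penalty = np.arange(k + 1, n) - k                  # original rank minus k
    # intrusions: original rank > k (rows), embedding rank <= k (columns)
    intrusions = (Q[k:, :k] * penalty[:, None]).sum()
    return float(1.0 - 2.0 * intrusions / norm)

n = 6
Q = n * np.eye(n - 1, dtype=int)   # start from a perfect embedding
Q[0, 0] -= 1
Q[3, 3] -= 1
Q[3, 0] += 1                       # original rank 4 -> embedding rank 1 (intrusion)
Q[0, 3] += 1                       # original rank 1 -> embedding rank 4 (extrusion)

# One intrusion with penalty 4 - 2 = 2 and norm 6 * 2 * 5 = 60 gives 1 - 4/60.
print(round(trustworthiness_sketch(Q, k=2), 4))        # 0.9333
```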