coco_pipe.dim_reduction.reducers

Classes

`BaseReducer`	Abstract base class for all dimensionality reduction implementations.
`IncrementalPCAReducer`	Incremental PCA reducer.
`PCAReducer`	Principal Component Analysis reducer.
`IsomapReducer`	Isometric Mapping reducer.
`LLEReducer`	Locally Linear Embedding reducer.
`MDSReducer`	Multidimensional Scaling reducer.
`SpectralEmbeddingReducer`	Spectral Embedding reducer.
`TSNEReducer`	t-SNE reducer.

Package Contents

class coco_pipe.dim_reduction.reducers.BaseReducer(n_components: int = 2, **kwargs)[source]¶

Bases: abc.ABC

Abstract base class for all dimensionality reduction implementations.

This class defines the standard interface that all reducers must implement and is safe to subclass for custom reducers. It provides built-in support for model persistence (save/load) using joblib.

For custom reducers operating on nonstandard data layouts, override capabilities so the manager layer can route validation, scoring, plotting, and reporting correctly.

Parameters:

n_components (int, default=2) – Target dimensionality of the reduced representation.
**kwargs (dict) – Additional keyword arguments stored on params and typically forwarded to the wrapped estimator or backend implementation.

n_components¶

Target dimensionality of the reduced representation.

Type:: int

params¶

Additional reducer parameters captured at initialization time.

Type:: dict

model¶

Underlying fitted model object, such as a scikit-learn estimator or a scientific computing backend. This attribute should be populated by fit.

Type:: Any

Notes

The capabilities property returns a plain dictionary consumed by the manager and evaluation layers. Custom reducers should declare supported diagnostics and scalar metadata explicitly through this mapping. Common keys include:

input_ndim : expected dimensionality of the input container
input_layout : semantic layout name such as “standard”
has_transform : whether transform is supported
has_inverse_transform : whether inverse transforms are available
has_components : whether PCA-like components are exposed
supported_diagnostics : names returned by get_diagnostics
has_native_plot : whether the reducer exposes its own plotting path
is_linear : whether the reducer is linear
is_stochastic : whether repeated runs can vary without a fixed seed

Examples

>>> from sklearn.decomposition import PCA
>>> from coco_pipe.dim_reduction import BaseReducer
>>>
>>> class CustomPCAReducer(BaseReducer):
...     @property
...     def capabilities(self):
...         return self._merge_capabilities(
...             super().capabilities,
...             is_linear=True,
...             has_components=True,
...             supported_diagnostics=("explained_variance_ratio_",),
...         )
...
...     def fit(self, X, y=None):
...         self.model = PCA(n_components=self.n_components, **self.params)
...         self.model.fit(X)
...         return self
...
...     def transform(self, X):
...         return self.model.transform(X)

n_components = 2¶

params¶

model = None¶

context_: Dict[str, Any]¶

property name: str¶

Return a stable public display name for the reducer.

_filter_params(fn_or_class: Any, params: dict) → dict[source]¶

Filter parameters to match the signature of a function or class.

Parameters:

fn_or_class (Any) – The function or class to inspect.
params (dict) – The parameters to filter.

Returns:

filtered_params – Parameters present in the signature. If the target accepts **kwargs or its signature cannot be inspected, the original parameter dictionary is returned unchanged.

Return type:

dict

Notes

This is a convenience helper for reducer implementations that wrap third-party estimators with partially overlapping constructor signatures.
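A minimal sketch of how signature-based filtering of this kind can work, using only the standard library. The names `filter_params` and `Toy` are hypothetical stand-ins for illustration; this is not the library's implementation.

```python
import inspect


def filter_params(fn_or_class, params):
    """Keep only the keys of ``params`` that ``fn_or_class`` accepts."""
    try:
        sig = inspect.signature(fn_or_class)
    except (TypeError, ValueError):
        # The signature cannot be inspected: return params unchanged.
        return params
    parameters = list(sig.parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters):
        # The target accepts **kwargs, so every key is acceptable.
        return params
    accepted = {p.name for p in parameters}
    return {k: v for k, v in params.items() if k in accepted}


class Toy:
    """Hypothetical estimator with a narrow constructor."""

    def __init__(self, n_components=2, whiten=False):
        self.n_components = n_components
        self.whiten = whiten


def accepts_anything(**kwargs):
    """Hypothetical callable that takes arbitrary keyword arguments."""
    return kwargs


print(filter_params(Toy, {"whiten": True, "perplexity": 30}))
# {'whiten': True}
print(filter_params(accepts_anything, {"perplexity": 30}))
# {'perplexity': 30}
```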

_build_estimator(estimator_cls: Any, params: dict | None = None, component_param: str | None = 'n_components', **fixed_kwargs: Any) → Any[source]¶

Instantiate an estimator with filtered reducer parameters.

Parameters:

estimator_cls (Any) – Estimator class to instantiate.
params (dict, optional) – Explicit parameter dictionary to filter instead of self.params.
component_param (str or None, default="n_components") – Name of the constructor argument receiving self.n_components. Set to None to skip injecting the component count.
**fixed_kwargs (dict) – Keyword arguments always forwarded to the estimator constructor.

Returns:

Instantiated estimator.

Return type:

Any

Notes

This helper assumes the wrapped backend is constructor-driven and can be configured from keyword arguments.
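The following sketch illustrates the idea of constructor-driven instantiation under stated assumptions. `build_estimator` and `ToyEmbedding` are hypothetical, the sketch skips the `**kwargs` special case for brevity, and the precedence of fixed keyword arguments over user parameters is an assumption rather than documented behavior.

```python
import inspect


def build_estimator(estimator_cls, n_components, params,
                    component_param="n_components", **fixed_kwargs):
    """Instantiate ``estimator_cls`` from filtered keyword arguments."""
    accepted = set(inspect.signature(estimator_cls).parameters)
    # Drop user parameters the constructor does not know about.
    kwargs = {k: v for k, v in params.items() if k in accepted}
    if component_param is not None:
        # Inject the component count under the backend's own argument name.
        kwargs[component_param] = n_components
    # Assumption: fixed keyword arguments win over user parameters.
    kwargs.update(fixed_kwargs)
    return estimator_cls(**kwargs)


class ToyEmbedding:
    """Hypothetical backend whose component argument is named ``dim``."""

    def __init__(self, dim=2, n_neighbors=5, verbose=False):
        self.dim = dim
        self.n_neighbors = n_neighbors
        self.verbose = verbose


est = build_estimator(
    ToyEmbedding,
    n_components=3,
    params={"n_neighbors": 10, "unknown": 1},
    component_param="dim",
    verbose=True,
)
print(est.dim, est.n_neighbors, est.verbose)  # 3 10 True
```

Passing `component_param=None` in this sketch leaves the backend's own default component count in place.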

_require_fitted(method_name: str = 'transform', model: Any = None) → Any[source]¶

Validate that a reducer backend has been fitted before access.

Parameters:

method_name (str, default="transform") – Operation requiring a fitted model.
model (Any, optional) – Backend model to check. Defaults to self.model.

Returns:

The validated model instance.

Return type:

Any

Raises:

RuntimeError – If no fitted model is available.
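A small sketch of the guard pattern described above. `ToyReducer` is a hypothetical class written for illustration, and the wording of the error message is an assumption.

```python
class ToyReducer:
    """Hypothetical reducer used only to illustrate the fitted-model guard."""

    def __init__(self):
        self.model = None

    def _require_fitted(self, method_name="transform", model=None):
        # Fall back to self.model when no explicit backend is passed.
        model = self.model if model is None else model
        if model is None:
            raise RuntimeError(
                f"{type(self).__name__} must be fitted before calling "
                f"'{method_name}'."
            )
        return model

    def fit(self, X):
        # Stand-in "model": the per-column means of X.
        n = len(X)
        self.model = [sum(col) / n for col in zip(*X)]
        return self

    def transform(self, X):
        means = self._require_fitted("transform")
        return [[v - m for v, m in zip(row, means)] for row in X]


reducer = ToyReducer()
try:
    reducer.transform([[1.0, 2.0]])
except RuntimeError as exc:
    print(exc)

print(reducer.fit([[1.0, 2.0], [3.0, 4.0]]).transform([[1.0, 2.0]]))
# [[-1.0, -1.0]]
```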

_merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) → Dict[str, Any][source]¶

Return a capability mapping updated with reducer-specific overrides.

Parameters:

base_caps (dict) – Base capability mapping, typically super().capabilities.
**overrides (dict) – Reducer-specific capability values to apply.

Returns:

Capability mapping with overrides applied.

Return type:

dict
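As an illustration, a capability merge of this kind can be sketched as a copy-and-update over a plain dictionary. `merge_capabilities` is a hypothetical free function, and returning a new mapping instead of mutating `base_caps` is an assumption about the behavior.

```python
def merge_capabilities(base_caps, **overrides):
    """Return a new mapping with ``overrides`` applied on top of ``base_caps``."""
    merged = dict(base_caps)  # shallow copy keeps the base mapping untouched
    merged.update(overrides)
    return merged


base = {"has_transform": True, "is_linear": False, "supported_diagnostics": ()}
caps = merge_capabilities(
    base,
    is_linear=True,
    supported_diagnostics=("explained_variance_ratio_",),
)
print(caps["is_linear"], base["is_linear"])  # True False
print(caps["supported_diagnostics"])         # ('explained_variance_ratio_',)
```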

abstract fit(X: ArrayLike, y: ArrayLike | None = None) → BaseReducer[source]¶

Fit the model to the data.

Parameters:

X (ArrayLike) – Training data. Most reducers expect (n_samples, n_features), but reducers with custom capabilities["input_layout"] may accept other layouts such as snapshot matrices or grouped trajectory tensors.
y (ArrayLike, optional) – Optional supervision aligned with the sample axis used by the reducer’s declared input layout.

Returns:

self – The fitted reducer instance.

Return type:

BaseReducer

Notes

Most reducers expect X to have shape (n_samples, n_features). Some reducers operate on alternative layouts and should document those layouts through capabilities.

abstract transform(X: ArrayLike) → numpy.ndarray[source]¶

Apply dimensionality reduction to X.

Parameters:: X (ArrayLike) – New data to transform. Its layout should match the reducer’s declared capabilities.
Returns:: X_new – Reduced representation. The exact output shape depends on the reducer, but the last dimension usually matches n_components.
Return type:: np.ndarray
Raises:: RuntimeError – Raised by concrete implementations when transform is called before fitting or when the reducer does not support out-of-sample transforms.

fit_transform(X: ArrayLike, y: ArrayLike | None = None) → numpy.ndarray[source]¶

Fit the model to data and return the transformed data.

This method usually calls fit and then transform, but reducers may override it for efficiency if the underlying algorithm supports a native combined path.

Parameters:

X (ArrayLike) – Training data following the reducer’s declared layout.
y (ArrayLike, optional) – Optional supervision aligned with the reducer’s input layout.

Returns:

X_new – Reduced representation returned by transform.

Return type:

np.ndarray
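The two paths described above, the default fit-then-transform and an overriding native combined path, can be sketched with toy classes. All class names here are hypothetical and the "reduction" simply keeps leading columns.

```python
class ToyBase:
    """Hypothetical base class with the default combined path."""

    def fit(self, X, y=None):
        raise NotImplementedError

    def transform(self, X):
        raise NotImplementedError

    def fit_transform(self, X, y=None):
        # Default path: fit, then transform the same data.
        return self.fit(X, y).transform(X)


class FirstColumns(ToyBase):
    """Keeps the first ``k`` columns and relies on the default path."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [row[: self.k] for row in X]


class CachedEmbedding(FirstColumns):
    """Overrides fit_transform to reuse the embedding computed in fit."""

    def fit(self, X, y=None):
        self.embedding_ = [row[: self.k] for row in X]
        return self

    def fit_transform(self, X, y=None):
        # Native combined path: no second pass over X is needed.
        return self.fit(X, y).embedding_


X = [[1, 2, 3], [4, 5, 6]]
print(FirstColumns(k=2).fit_transform(X))     # [[1, 2], [4, 5]]
print(CachedEmbedding(k=1).fit_transform(X))  # [[1], [4]]
```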

save(filepath: str | os.PathLike) → None[source]¶

Persist the reducer to a file.

Parameters:: filepath (str or Path) – Path to the output file.

Notes

The default implementation serializes the reducer instance with joblib.dump. Custom reducers should either remain joblib-serializable or override this method and load with a custom persistence strategy.
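For reducers that cannot stay joblib-serializable, one possible custom persistence strategy is to override both methods and store plain state. The sketch below uses JSON and a hypothetical `ToyReducer` that does not inherit from BaseReducer, so that it stays self-contained; the stored fields are assumptions chosen for illustration.

```python
import json
import os
import tempfile


class ToyReducer:
    """Hypothetical reducer whose state is plain JSON-friendly data."""

    def __init__(self, n_components=2, **kwargs):
        self.n_components = n_components
        self.params = kwargs
        self.model = None

    def save(self, filepath):
        # Persist only what is needed to rebuild the reducer.
        state = {
            "n_components": self.n_components,
            "params": self.params,
            "model": self.model,
        }
        with open(filepath, "w", encoding="utf-8") as fh:
            json.dump(state, fh)

    @classmethod
    def load(cls, filepath):
        with open(filepath, "r", encoding="utf-8") as fh:
            state = json.load(fh)
        reducer = cls(n_components=state["n_components"], **state["params"])
        reducer.model = state["model"]
        return reducer


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reducer.json")
    original = ToyReducer(n_components=3, whiten=True)
    original.model = {"mean": [0.5, 1.5]}  # stand-in for fitted state
    original.save(path)
    restored = ToyReducer.load(path)

print(restored.n_components, restored.params, restored.model)
# 3 {'whiten': True} {'mean': [0.5, 1.5]}
```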

property capabilities: Dict[str, Any]¶

Return reducer capability flags consumed by the manager layer.

Custom reducers with nonstandard inputs should override at least input_ndim and input_layout. Reducers exposing diagnostics or scalar quality metadata should declare them explicitly through supported_diagnostics and supported_metadata.

Returns:: Mapping of reducer capability flags.
Return type:: dict

Notes

The default capabilities describe a typical estimator consuming (samples, features) input and exposing transform.

_attribute_dict(obj: Any, attrs: Iterable[str]) → Dict[str, Any][source]¶

Extract requested attributes from a target object into a dictionary.

This helper filters missing attributes and swallows common access errors (such as deferred scikit-learn properties) to return only what is currently available on the target.

Parameters:

obj (Any) – Target object to inspect.
attrs (iterable of str) – Attribute names to attempt to extract.

Returns:

Mapping of available attribute names to their values.

Return type:

dict
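A rough sketch of tolerant attribute extraction. `attribute_dict` and `ToyModel` are hypothetical, and the exact set of exceptions that gets swallowed is an assumption.

```python
def attribute_dict(obj, attrs):
    """Collect the attributes in ``attrs`` that are currently readable on ``obj``."""
    found = {}
    for name in attrs:
        try:
            found[name] = getattr(obj, name)
        except (AttributeError, RuntimeError, ValueError, NotImplementedError):
            # Missing or not-yet-available attribute: skip it silently.
            continue
    return found


class ToyModel:
    """Hypothetical backend with one ready and one deferred attribute."""

    n_iter_ = 12

    @property
    def stress_(self):
        # Mimics a deferred property that is unavailable before fitting.
        raise AttributeError("stress_ is not available yet")


print(attribute_dict(ToyModel(), ["n_iter_", "stress_", "missing_"]))
# {'n_iter_': 12}
```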

get_diagnostics() → Dict[str, Any][source]¶

Return diagnostic arrays or structured artifacts.

Diagnostics are intended for non-scalar outputs such as explained variance curves, eigenvalues, modes, graphs, or training histories. Only names declared in capabilities["supported_diagnostics"] are queried.

Returns:: diagnostics – Dictionary of diagnostic attributes declared in capabilities["supported_diagnostics"].
Return type:: dict
Raises:: RuntimeError – If the reducer has not been fitted.

get_quality_metadata() → Dict[str, Any][source]¶

Return scalar metadata about the reduction process or quality.

Typical examples include iteration counts, optimization stress, final loss values, or backend-specific convergence flags. Only names declared in capabilities["supported_metadata"] are queried.

Returns:: metadata – Dictionary containing only scalar values corresponding to keys declared in capabilities["supported_metadata"].
Return type:: dict
Raises:: RuntimeError – If the reducer has not been fitted.
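The split between diagnostics and scalar metadata can be sketched as two lookups driven by the capability mapping. Everything below (`ToyBackend`, `ToyReducer`, the `_collect` helper, and the scalar check) is hypothetical and only approximates the behavior described above.

```python
import numbers


class ToyBackend:
    """Hypothetical fitted backend exposing a few result attributes."""

    eigenvalues_ = [3.0, 1.0]
    n_iter_ = 7
    stress_ = 0.25


class ToyReducer:
    """Hypothetical reducer that declares what it can report."""

    def __init__(self):
        self.model = None

    @property
    def capabilities(self):
        return {
            "supported_diagnostics": ("eigenvalues_",),
            # eigenvalues_ is listed here on purpose to show scalar filtering.
            "supported_metadata": ("n_iter_", "stress_", "eigenvalues_"),
        }

    def fit(self, X=None):
        self.model = ToyBackend()
        return self

    def _collect(self, key):
        if self.model is None:
            raise RuntimeError("Reducer has not been fitted.")
        names = self.capabilities.get(key, ())
        return {n: getattr(self.model, n) for n in names if hasattr(self.model, n)}

    def get_diagnostics(self):
        return self._collect("supported_diagnostics")

    def get_quality_metadata(self):
        values = self._collect("supported_metadata")
        # Keep scalar entries only; arrays belong in get_diagnostics.
        return {k: v for k, v in values.items() if isinstance(v, numbers.Number)}


reducer = ToyReducer().fit()
print(reducer.get_diagnostics())       # {'eigenvalues_': [3.0, 1.0]}
print(reducer.get_quality_metadata())  # {'n_iter_': 7, 'stress_': 0.25}
```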

get_components() → numpy.ndarray[source]¶

Return reducer-defined component-like outputs.

Returns:: Reducer-defined component array.
Return type:: np.ndarray
Raises:: ValueError – If the reducer does not expose public components.

classmethod load(filepath: str | os.PathLike) → BaseReducer[source]¶

Load a reducer from a file.

Parameters:: filepath (str or Path) – Path to the file to load.
Returns:: reducer – The loaded reducer instance.
Return type:: BaseReducer

Notes

This method assumes the reducer was serialized with save or a compatible joblib.dump call.

class coco_pipe.dim_reduction.reducers.IncrementalPCAReducer(n_components: int = 2, batch_size: int | None = None, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Incremental PCA reducer.

This reducer wraps sklearn.decomposition.IncrementalPCA for batch-wise fitting when the full dataset is too large to process in one pass.

Parameters:

n_components (int, default=2) – Number of principal components to keep.
batch_size (int, optional) – Number of samples processed per batch.
**kwargs (dict) – Additional keyword arguments forwarded to IncrementalPCA after signature filtering.

batch_size¶

Batch size used when fitting the incremental estimator.

Type:: int or None

model¶

Fitted IncrementalPCA estimator after fit or partial_fit.

Type:: sklearn.decomposition.IncrementalPCA or None

See also

PCAReducer: Standard in-memory linear PCA reducer.
DaskPCAReducer: Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer: Linear factorization alternative for lazy arrays.
IsomapReducer: Nonlinear manifold learner based on geodesic distances.
TSNEReducer: Nonlinear neighborhood-preserving embedding.
UMAPReducer: Nonlinear graph-based embedding balancing local and global structure.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(100, 12)
>>> reducer = IncrementalPCAReducer(n_components=3, batch_size=25)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 3)
>>> stream = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = stream.partial_fit(X[:50])
>>> _ = stream.partial_fit(X[50:])
>>> stream.transform(X).shape
(100, 2)

property capabilities: dict¶

Return capability metadata for Incremental PCA.

Returns:: Capability mapping describing Incremental PCA as a linear component-based reducer.
Return type:: dict

batch_size = None¶

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → IncrementalPCAReducer[source]¶

Fit Incremental PCA in batch mode.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

IncrementalPCAReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=10)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

partial_fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → IncrementalPCAReducer[source]¶

Incrementally fit the estimator on a batch of samples.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Batch of training samples.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Reducer instance after updating the incremental estimator.

Return type:

IncrementalPCAReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(40, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = reducer.partial_fit(X[:20])
>>> _ = reducer.partial_fit(X[20:])
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Project data onto the fitted incremental PCA basis.

Parameters:: X (ArrayLike of shape (n_samples, n_features)) – Data to project.
Returns:: Projected coordinates in component space.
Return type:: np.ndarray of shape (n_samples, n_components)
Raises:: RuntimeError – If the reducer has not been fitted.

get_components() → numpy.ndarray[source]¶

Return the incremental PCA component loading matrix.

Returns:: Principal component loading matrix.
Return type:: np.ndarray
Raises:: RuntimeError – If the reducer has not been fitted.

class coco_pipe.dim_reduction.reducers.PCAReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Principal Component Analysis reducer.

This reducer wraps sklearn.decomposition.PCA and provides a linear low-dimensional embedding based on singular value decomposition.

Parameters:

n_components (int, default=2) – Number of principal components to keep.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.decomposition.PCA after signature filtering. Common options include whiten, svd_solver, and random_state.

model¶

Fitted PCA estimator after fit.

Type:: sklearn.decomposition.PCA or None

Notes

This is a deterministic linear reducer unless a randomized solver is used.

See also

IncrementalPCAReducer: Linear PCA variant for batch-wise fitting.
DaskPCAReducer: Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer: Linear factorization alternative for lazy arrays.
IsomapReducer: Nonlinear manifold learner based on geodesic distances.
TSNEReducer: Nonlinear neighborhood-preserving embedding.
UMAPReducer: Nonlinear graph-based embedding balancing local and global structure.
PHATEReducer: Nonlinear diffusion-based embedding for smooth trajectories.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(100, 10)
>>> reducer = PCAReducer(n_components=2, random_state=42)
>>> _ = reducer.fit(X)
>>> X_reduced = reducer.transform(X)
>>> X_reduced.shape
(100, 2)
>>> reducer.explained_variance_ratio_.shape
(2,)
>>> reducer.components_.shape
(2, 10)
>>> reducer = PCAReducer(n_components=3, whiten=True)
>>> reducer.fit_transform(X).shape
(100, 3)

property capabilities: dict¶

Return capability metadata for PCA.

Returns:: Capability mapping describing PCA as a linear component-based reducer.
Return type:: dict

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → PCAReducer[source]¶

Fit PCA on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

PCAReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(20, 5)
>>> reducer = PCAReducer(n_components=2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Project data onto the fitted principal component basis.

Parameters:: X (ArrayLike of shape (n_samples, n_features)) – Data to project.
Returns:: Projected coordinates in principal component space.
Return type:: np.ndarray of shape (n_samples, n_components)
Raises:: RuntimeError – If the reducer has not been fitted.

property explained_variance_ratio_: numpy.ndarray¶

Fraction of the total variance explained by each selected component, with values between 0 and 1.

Returns:: Explained variance ratio for each retained component.
Return type:: np.ndarray of shape (n_components,)
Raises:: RuntimeError – If the reducer has not been fitted.
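For reference, and assuming the property simply exposes the attribute of the wrapped scikit-learn PCA estimator, the ratio for component i can be written as follows. Here \lambda_j is the variance along the j-th principal axis of the centered data, m is the total number of principal axes, and k is n_components.

```latex
r_i = \frac{\lambda_i}{\sum_{j=1}^{m} \lambda_j},
\qquad i = 1, \dots, k,
\qquad \sum_{i=1}^{k} r_i \le 1
```

The retained ratios sum to 1 only when every axis carrying nonzero variance is kept.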

property components_: numpy.ndarray¶

Principal axes in feature space.

Returns:: Principal component loading matrix.
Return type:: np.ndarray of shape (n_components, n_features)
Raises:: RuntimeError – If the reducer has not been fitted.

get_components() → numpy.ndarray[source]¶

Return the principal component loading matrix.

Returns:: Principal component loading matrix.
Return type:: np.ndarray
Raises:: RuntimeError – If the reducer has not been fitted.

class coco_pipe.dim_reduction.reducers.IsomapReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Isometric Mapping reducer.

Isomap estimates geodesic distances on a nearest-neighbor graph and then computes a low-dimensional embedding consistent with those distances.

Parameters:

n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.Isomap after signature filtering. Common options include n_neighbors, metric, p, and eigen_solver.

model¶

Fitted Isomap estimator after fit.

Type:: sklearn.manifold.Isomap or None

See also

LLEReducer: Nonlinear local-neighborhood manifold embedding.
MDSReducer: Distance-preserving manifold embedding.
SpectralEmbeddingReducer: Nonlinear graph Laplacian embedding.
PCAReducer: Linear baseline for global variance preservation.
UMAPReducer: Nonlinear graph-based embedding for local and global structure.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(100, 10)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:8]).shape
(8, 2)
>>> reducer.n_features_in_
10
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)

property capabilities: dict¶

Return capability metadata for Isomap.

Returns:: Capability mapping describing Isomap as a nonlinear reducer with out-of-sample transform support.
Return type:: dict

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → IsomapReducer[source]¶

Fit Isomap on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

IsomapReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=4)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Project data into the fitted Isomap embedding space.

Parameters:: X (ArrayLike of shape (n_samples, n_features)) – Data to project.
Returns:: Low-dimensional embedding coordinates.
Return type:: np.ndarray of shape (n_samples, n_components)
Raises:: RuntimeError – If the reducer has not been fitted.

property reconstruction_error_: float | None¶

Return the Isomap reconstruction error.

Returns:: Reconstruction error returned by the fitted estimator.
Return type:: float
Raises:: RuntimeError – If the reducer has not been fitted.

class coco_pipe.dim_reduction.reducers.LLEReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Locally Linear Embedding reducer.

LLE learns a nonlinear embedding by reconstructing each point from its local neighborhood in the input space and preserving those reconstruction weights in the low-dimensional space.

Parameters:

n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.LocallyLinearEmbedding after signature filtering. Common options include n_neighbors, method, eigen_solver, and random_state.

model¶

Fitted LLE estimator after fit.

Type:: sklearn.manifold.LocallyLinearEmbedding or None

See also

IsomapReducer: Nonlinear geodesic-distance embedding.
MDSReducer: Distance-preserving manifold embedding.
SpectralEmbeddingReducer: Nonlinear graph Laplacian embedding.
PCAReducer: Linear baseline for global variance preservation.
UMAPReducer: Nonlinear graph-based embedding for local and global structure.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = LLEReducer(n_components=2, n_neighbors=10, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:6]).shape
(6, 2)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)

property capabilities: dict¶

Return capability metadata for LLE.

Returns:: Capability mapping describing LLE as a nonlinear reducer with out-of-sample transform support.
Return type:: dict

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → LLEReducer[source]¶

Fit LLE on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

LLEReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = LLEReducer(n_components=2, n_neighbors=5, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
>>> reducer = LLEReducer(n_components=2, method="modified", n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Project data into the fitted LLE embedding space.

Parameters:: X (ArrayLike of shape (n_samples, n_features)) – Data to project.
Returns:: Low-dimensional embedding coordinates.
Return type:: np.ndarray of shape (n_samples, n_components)
Raises:: RuntimeError – If the reducer has not been fitted.

property reconstruction_error_: float¶

Return the LLE reconstruction error.

Returns:: Reconstruction error associated with the embedding.
Return type:: float
Raises:: RuntimeError – If the reducer has not been fitted.

class coco_pipe.dim_reduction.reducers.MDSReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Multidimensional Scaling reducer.

MDS seeks a low-dimensional representation whose pairwise distances best match the pairwise distances in the original space.

Parameters:

n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.MDS after signature filtering. Common options include metric, n_init, max_iter, dissimilarity, and random_state.

model¶

Fitted MDS estimator after fit or fit_transform.

Type:: sklearn.manifold.MDS or None

Notes

transform is not supported because scikit-learn MDS does not provide an out-of-sample projection API.

See also

IsomapReducer: Nonlinear geodesic-distance embedding.
LLEReducer: Nonlinear local-neighborhood embedding.
SpectralEmbeddingReducer: Nonlinear graph Laplacian embedding.
PCAReducer: Linear baseline for global variance preservation.
UMAPReducer: Nonlinear graph-based embedding for local and global structure.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(60, 8)
>>> reducer = MDSReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(60, 2)
>>> reducer.stress_ >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

property capabilities: dict¶

Return capability metadata for MDS.

Returns:: Capability mapping describing MDS as a nonlinear reducer without out-of-sample transform support.
Return type:: dict

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → MDSReducer[source]¶

Fit MDS on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

MDSReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(25, 5)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Raise because scikit-learn MDS does not support out-of-sample transform.

Parameters:: X (ArrayLike) – Ignored input included for API compatibility.
Raises:: NotImplementedError – Always raised because MDS does not support transforming new data.
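Callers that handle several reducers generically can consult the has_transform capability before projecting new samples. The sketch below uses hypothetical stand-in classes (`ToyMDS`, `ToyPCA`) and a hypothetical `embed` helper; it does not call the library.

```python
class ToyMDS:
    """Hypothetical reducer without an out-of-sample projection."""

    capabilities = {"has_transform": False}

    def fit_transform(self, X):
        return [row[:2] for row in X]

    def transform(self, X):
        raise NotImplementedError("No out-of-sample transform available.")


class ToyPCA:
    """Hypothetical reducer that can project unseen samples."""

    capabilities = {"has_transform": True}

    def fit_transform(self, X):
        return self.transform(X)

    def transform(self, X):
        return [row[:2] for row in X]


def embed(reducer, X_train, X_new):
    """Embed training data and, when supported, new samples."""
    train = reducer.fit_transform(X_train)
    if reducer.capabilities.get("has_transform", False):
        return train, reducer.transform(X_new)
    # No projection path: the caller has to refit on the combined data.
    return train, None


X_train = [[1, 2, 3], [4, 5, 6]]
X_new = [[7, 8, 9]]
print(embed(ToyPCA(), X_train, X_new))  # ([[1, 2], [4, 5]], [[7, 8]])
print(embed(ToyMDS(), X_train, X_new))  # ([[1, 2], [4, 5]], None)
```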

fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → numpy.ndarray[source]¶

Fit MDS and return the embedding coordinates.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Embedded coordinates produced by MDS.

Return type:

np.ndarray of shape (n_samples, n_components)

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(20, 4)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)

property stress_: float¶

Return the MDS stress (the summed squared mismatch between embedded distances and input dissimilarities).

Returns:: Stress value returned by the fitted MDS model.
Return type:: float
Raises:: RuntimeError – If the reducer has not been fitted.
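For orientation, the textbook raw stress of metric MDS is shown below, where \delta_{ij} is the dissimilarity between samples i and j in the original space and z_i is the embedded coordinate of sample i. Whether the wrapped estimator reports this raw value or a normalized variant depends on the scikit-learn version and its normalized_stress setting, so treat the formula as an assumption about the default case.

```latex
\sigma(Z) = \sum_{i<j} \bigl( d_{ij}(Z) - \delta_{ij} \bigr)^2,
\qquad d_{ij}(Z) = \lVert z_i - z_j \rVert_2
```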

class coco_pipe.dim_reduction.reducers.SpectralEmbeddingReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

Spectral Embedding reducer.

Spectral Embedding computes a nonlinear embedding using eigenvectors of the graph Laplacian built from the data affinity graph.

Parameters:

n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.SpectralEmbedding after signature filtering. Common options include affinity, gamma, random_state, eigen_solver, and n_neighbors.

model¶

Fitted spectral embedding estimator after fit or fit_transform.

Type:: sklearn.manifold.SpectralEmbedding or None

Notes

transform is not supported because scikit-learn SpectralEmbedding does not provide an out-of-sample projection API.

See also

IsomapReducer: Nonlinear geodesic-distance embedding.
LLEReducer: Nonlinear local-neighborhood embedding.
MDSReducer: Distance-preserving manifold embedding.
PCAReducer: Linear baseline for global variance preservation.
UMAPReducer: Nonlinear graph-based embedding for local and global structure.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(80, 10)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(80, 2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

property capabilities: dict¶

Return capability metadata for Spectral Embedding.

Returns:: Capability mapping describing Spectral Embedding as a nonlinear reducer without out-of-sample transform support.
Return type:: dict

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → SpectralEmbeddingReducer[source]¶

Fit Spectral Embedding on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

SpectralEmbeddingReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(30, 6)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Raise because scikit-learn Spectral Embedding lacks out-of-sample transform.

Parameters:: X (ArrayLike) – Ignored input included for API compatibility.
Raises:: NotImplementedError – Always raised because Spectral Embedding does not support transforming new data.

fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → numpy.ndarray[source]¶

Fit Spectral Embedding and return the embedding coordinates.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Embedded coordinates produced by Spectral Embedding.

Return type:

np.ndarray of shape (n_samples, n_components)

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(20, 4)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)

class coco_pipe.dim_reduction.reducers.TSNEReducer(n_components: int = 2, **kwargs)[source]¶

Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer

t-SNE reducer.

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a neighborhood-preserving method designed primarily for visualization. It optimizes a low-dimensional embedding by matching pairwise similarities between the original space and the embedding.
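The similarity matching mentioned above is usually expressed as a Kullback-Leibler divergence in the standard t-SNE formulation. Here p_{ij} are the symmetrized input-space similarities whose bandwidths are set by the perplexity, and z_i are the embedded points; this is the textbook objective, not a statement about backend-specific approximations.

```latex
\mathrm{KL}(P \parallel Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},
\qquad
q_{ij} = \frac{\bigl(1 + \lVert z_i - z_j \rVert^2\bigr)^{-1}}
              {\sum_{k \neq l} \bigl(1 + \lVert z_k - z_l \rVert^2\bigr)^{-1}}
```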

Parameters:

n_components (int, default=2) – Number of embedding dimensions.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.TSNE after signature filtering. Common options include perplexity, learning_rate, max_iter, init, and random_state.

embedding_¶

Learned training-set embedding after fit or fit_transform.

Type:: np.ndarray or None

model¶

Fitted t-SNE estimator after fit or fit_transform.

Type:: sklearn.manifold.TSNE or None

Notes

transform is not supported because scikit-learn t-SNE does not provide an out-of-sample projection API.

See also

UMAPReducer: Nonlinear graph-based embedding with transform support.
PacmapReducer: Nonlinear embedding balancing local and global structure.
TrimapReducer: Nonlinear triplet-based embedding preserving global layout.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
PCAReducer: Linear baseline for global variance preservation.
IsomapReducer: Nonlinear geodesic-distance manifold embedding.

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = TSNEReducer(n_components=2, perplexity=20, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
>>> reducer.get_quality_metadata()["kl_divergence_"] >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.embedding_.shape
(100, 2)

property capabilities: dict¶

Return capability metadata for t-SNE.

Returns:: Capability mapping describing t-SNE as a nonlinear stochastic reducer without out-of-sample transform support.
Return type:: dict

embedding_ = None¶

fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → TSNEReducer[source]¶

Fit t-SNE on the input data.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Fitted reducer instance.

Return type:

TSNEReducer

Examples

>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = TSNEReducer(n_components=2, perplexity=5, max_iter=250)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True

transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) → numpy.ndarray[source]¶

Raise because t-SNE does not support out-of-sample transformation.

Parameters:: X (ArrayLike) – Ignored input included for API compatibility.
Raises:: NotImplementedError – Always raised because t-SNE does not support transforming new data.

fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) → numpy.ndarray[source]¶

Fit t-SNE and return the embedding coordinates.

Parameters:

X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.

Returns:

Embedded coordinates produced by t-SNE.

Return type:

np.ndarray of shape (n_samples, n_components)