coco_pipe.dim_reduction.reducers
Submodules
Classes
BaseReducer | Abstract base class for all dimensionality reduction implementations.
IncrementalPCAReducer | Incremental PCA reducer.
PCAReducer | Principal Component Analysis reducer.
IsomapReducer | Isometric Mapping reducer.
LLEReducer | Locally Linear Embedding reducer.
MDSReducer | Multidimensional Scaling reducer.
SpectralEmbeddingReducer | Spectral Embedding reducer.
TSNEReducer | t-SNE reducer.
Package Contents
- class coco_pipe.dim_reduction.reducers.BaseReducer(n_components: int = 2, **kwargs)
Bases: abc.ABC
Abstract base class for all dimensionality reduction implementations.
This class defines the standard interface that all reducers must implement and is safe to subclass for custom reducers. It provides built-in support for model persistence (save/load) using joblib.
For custom reducers operating on nonstandard data layouts, override capabilities so the manager layer can route validation, scoring, plotting, and reporting correctly.
- Parameters:
n_components (int, default=2) – Target dimensionality of the reduced representation.
**kwargs (dict) – Additional keyword arguments stored on params and typically forwarded to the wrapped estimator or backend implementation.
- n_components
Target dimensionality of the reduced representation.
- Type:
int
- params
Additional reducer parameters captured at initialization time.
- Type:
dict
- model
Underlying fitted model object, such as a scikit-learn estimator or a scientific computing backend. This attribute should be populated by fit.
- Type:
Any
Notes
The capabilities property returns a plain dictionary consumed by the manager and evaluation layers. Custom reducers should declare supported diagnostics and scalar metadata explicitly through this mapping. Common keys include:
input_ndim : expected dimensionality of the input container
input_layout : semantic layout name such as “standard”
has_transform : whether transform is supported
has_inverse_transform : whether inverse transforms are available
has_components : whether PCA-like components are exposed
supported_diagnostics : names returned by get_diagnostics
supported_metadata : names of scalar values returned by get_quality_metadata
has_native_plot : whether the reducer exposes its own plotting path
is_linear : whether the reducer is linear
is_stochastic : whether repeated runs can vary without a fixed seed
Examples
>>> from sklearn.decomposition import PCA
>>> from coco_pipe.dim_reduction import BaseReducer
>>>
>>> class CustomPCAReducer(BaseReducer):
...     @property
...     def capabilities(self):
...         return self._merge_capabilities(
...             super().capabilities,
...             is_linear=True,
...             has_components=True,
...             supported_diagnostics=("explained_variance_ratio_",),
...         )
...
...     def fit(self, X, y=None):
...         self.model = PCA(n_components=self.n_components, **self.params)
...         self.model.fit(X)
...         return self
...
...     def transform(self, X):
...         return self.model.transform(X)
- n_components = 2
- params
- model = None
- context_: Dict[str, Any]
- property name: str
Return a stable public display name for the reducer.
- _filter_params(fn_or_class: Any, params: dict) -> dict
Filter parameters to match the signature of a function or class.
- Parameters:
fn_or_class (Any) – The function or class to inspect.
params (dict) – The parameters to filter.
- Returns:
filtered_params – Parameters present in the signature. If the target accepts **kwargs or its signature cannot be inspected, the original parameter dictionary is returned unchanged.
- Return type:
dict
Notes
This is a convenience helper for reducer implementations that wrap third-party estimators with partially overlapping constructor signatures.
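The filtering rule described above can be sketched with the standard library alone. The function name filter_params and the ToyEstimator class are hypothetical stand-ins, and the real helper may differ in detail; this only illustrates signature-based filtering with the two documented fallbacks.

```python
import inspect
from typing import Any, Callable, Dict


def filter_params(fn_or_class: Callable[..., Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keys of ``params`` that ``fn_or_class`` accepts (hypothetical helper)."""
    try:
        signature = inspect.signature(fn_or_class)
    except (TypeError, ValueError):
        # Signature cannot be inspected: return the parameters unchanged.
        return params

    parameters = list(signature.parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters):
        # The target accepts **kwargs, so every key is acceptable.
        return params

    accepted = {p.name for p in parameters}
    return {key: value for key, value in params.items() if key in accepted}


class ToyEstimator:
    """Stand-in for a third-party estimator with a fixed constructor."""

    def __init__(self, n_components: int = 2, whiten: bool = False):
        self.n_components = n_components
        self.whiten = whiten


def accepts_anything(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a target whose signature includes **kwargs."""
    return kwargs


# "perplexity" is dropped for ToyEstimator but kept for the **kwargs target.
filtered = filter_params(ToyEstimator, {"whiten": True, "perplexity": 30})
passthrough = filter_params(accepts_anything, {"whiten": True, "perplexity": 30})
```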
- _build_estimator(estimator_cls: Any, params: dict | None = None, component_param: str | None = 'n_components', **fixed_kwargs: Any) -> Any
Instantiate an estimator with filtered reducer parameters.
- Parameters:
estimator_cls (Any) – Estimator class to instantiate.
params (dict, optional) – Explicit parameter dictionary to filter instead of self.params.
component_param (str or None, default="n_components") – Name of the constructor argument receiving self.n_components. Set to None to skip injecting the component count.
**fixed_kwargs (dict) – Keyword arguments always forwarded to the estimator constructor.
- Returns:
Instantiated estimator.
- Return type:
Any
Notes
This helper assumes the wrapped backend is constructor-driven and can be configured from keyword arguments.
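As a rough illustration of constructor-driven configuration, the sketch below assembles keyword arguments and injects the component count under a configurable name. build_estimator and ToyEstimator are hypothetical; the order in which options, fixed keyword arguments, and the component count override one another is an assumption, not the documented behaviour.

```python
from typing import Any, Dict, Optional


class ToyEstimator:
    """Stand-in backend configured entirely through its constructor."""

    def __init__(self, n_components: int = 2, tol: float = 1e-3, copy: bool = True):
        self.n_components = n_components
        self.tol = tol
        self.copy = copy


def build_estimator(
    estimator_cls: Any,
    n_components: int,
    params: Dict[str, Any],
    component_param: Optional[str] = "n_components",
    **fixed_kwargs: Any,
) -> Any:
    """Hypothetical free-function analogue of ``_build_estimator``."""
    kwargs = dict(params)        # caller-supplied options (assumed already filtered)
    kwargs.update(fixed_kwargs)  # fixed keyword arguments are always forwarded
    if component_param is not None:
        kwargs[component_param] = n_components  # inject the component count
    return estimator_cls(**kwargs)


estimator = build_estimator(ToyEstimator, 3, {"tol": 0.1}, copy=False)
untouched = build_estimator(ToyEstimator, 5, {}, component_param=None)
```

With component_param=None the estimator keeps its own default, which is useful for backends that name the dimensionality argument differently or do not take one at all.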
- _require_fitted(method_name: str = 'transform', model: Any = None) -> Any
Validate that a reducer backend has been fitted before access.
- Parameters:
method_name (str, default="transform") – Operation requiring a fitted model.
model (Any, optional) – Backend model to check. Defaults to self.model.
- Returns:
The validated model instance.
- Return type:
Any
- Raises:
RuntimeError – If no fitted model is available.
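A fitted-model guard of this kind might look like the following sketch. TinyReducer is hypothetical and the error message wording is an assumption; only the RuntimeError type and the default of checking self.model come from the description above.

```python
from typing import Any, List, Optional


class TinyReducer:
    """Hypothetical reducer used only to illustrate the fitted-model guard."""

    def __init__(self) -> None:
        self.model: Optional[Any] = None

    def _require_fitted(self, method_name: str = "transform", model: Any = None) -> Any:
        # Fall back to self.model when no explicit backend is passed.
        candidate = self.model if model is None else model
        if candidate is None:
            raise RuntimeError(
                f"{type(self).__name__} must be fitted before calling {method_name}()."
            )
        return candidate

    def fit(self, X: List[List[float]]) -> "TinyReducer":
        self.model = {"n_rows": len(X)}  # placeholder for a real backend
        return self

    def transform(self, X: List[List[float]]) -> List[List[float]]:
        self._require_fitted("transform")  # raises before any work is done
        return [row[:1] for row in X]      # toy projection: keep the first column


reducer = TinyReducer()
try:
    reducer.transform([[1.0, 2.0]])
    raised = False
except RuntimeError:
    raised = True
```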
- _merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]
Return a capability mapping updated with reducer-specific overrides.
- Parameters:
base_caps (dict) – Base capability mapping, typically super().capabilities.
**overrides (dict) – Reducer-specific capability values to apply.
- Returns:
Capability mapping with overrides applied.
- Return type:
dict
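The merge can be pictured as a copy-and-update on a plain dictionary. The sketch below uses a hypothetical free function merge_capabilities; whether the real helper copies the base mapping or updates it in place is an assumption here.

```python
from typing import Any, Dict


def merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a copy of ``base_caps`` with ``overrides`` applied (hypothetical helper)."""
    merged = dict(base_caps)  # shallow copy so the base mapping is not mutated
    merged.update(overrides)
    return merged


base = {"has_transform": True, "is_linear": False, "supported_diagnostics": ()}
pca_like = merge_capabilities(base, is_linear=True, has_components=True)
```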
- abstract fit(X: ArrayLike, y: ArrayLike | None = None) -> BaseReducer
Fit the model to the data.
- Parameters:
X (ArrayLike) – Training data. Most reducers expect (n_samples, n_features), but reducers with custom capabilities["input_layout"] may accept other layouts such as snapshot matrices or grouped trajectory tensors.
y (ArrayLike, optional) – Optional supervision aligned with the sample axis used by the reducer’s declared input layout.
- Returns:
self – The fitted reducer instance.
- Return type:
BaseReducer
Notes
Most reducers expect X to have shape (n_samples, n_features). Some reducers operate on alternative layouts and should document those layouts through capabilities.
- abstract transform(X: ArrayLike) -> numpy.ndarray
Apply dimensionality reduction to X.
- Parameters:
X (ArrayLike) – New data to transform. Its layout should match the reducer’s declared capabilities.
- Returns:
X_new – Reduced representation. The exact output shape depends on the reducer, but the last dimension usually matches n_components.
- Return type:
np.ndarray
- Raises:
RuntimeError – Raised by concrete implementations when transform is called before fitting or when the reducer does not support out-of-sample transforms.
- fit_transform(X: ArrayLike, y: ArrayLike | None = None) -> numpy.ndarray
Fit the model to data and return the transformed data.
This method usually calls fit and then transform, but reducers may override it for efficiency if the underlying algorithm supports a native combined path.
- Parameters:
X (ArrayLike) – Training data following the reducer’s declared layout.
y (ArrayLike, optional) – Optional supervision aligned with the reducer’s input layout.
- Returns:
X_new – Reduced representation returned by transform.
- Return type:
np.ndarray
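The default fit-then-transform path can be illustrated with a toy abstract base class. SketchReducer and FirstColumnsReducer are hypothetical and use plain lists instead of NumPy arrays to stay dependency-free; they are not the package's classes.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence

Row = Sequence[float]


class SketchReducer(ABC):
    """Hypothetical base class showing the default fit-then-transform path."""

    @abstractmethod
    def fit(self, X: Sequence[Row], y=None) -> "SketchReducer": ...

    @abstractmethod
    def transform(self, X: Sequence[Row]) -> List[List[float]]: ...

    def fit_transform(self, X: Sequence[Row], y=None) -> List[List[float]]:
        # Default path: fit on X, then project the same X.
        return self.fit(X, y).transform(X)


class FirstColumnsReducer(SketchReducer):
    """Toy reducer that centers the data and keeps the first n_components columns."""

    def __init__(self, n_components: int = 2):
        self.n_components = n_components
        self.means_: Optional[List[float]] = None

    def fit(self, X, y=None):
        n_rows = len(X)
        self.means_ = [sum(col) / n_rows for col in zip(*X)]
        return self

    def transform(self, X):
        if self.means_ is None:
            raise RuntimeError("fit must be called before transform")
        k = self.n_components
        return [[v - m for v, m in zip(row[:k], self.means_[:k])] for row in X]


X = [[1.0, 2.0, 3.0], [3.0, 6.0, 9.0]]
embedding = FirstColumnsReducer(n_components=2).fit_transform(X)
```

A reducer whose backend offers a cheaper combined path would override fit_transform instead of relying on this default.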
- save(filepath: str | os.PathLike) -> None
Persist the reducer to a file.
The default implementation serializes the reducer instance with joblib.dump. Custom reducers should either remain joblib-serializable or override this method and load with a custom persistence strategy.
- Parameters:
filepath (str or Path) – Path to the output file.
- property capabilities: Dict[str, Any]
Return reducer capability flags consumed by the manager layer.
Custom reducers with nonstandard inputs should override at least input_ndim and input_layout. Reducers exposing diagnostics or scalar quality metadata should declare them explicitly through supported_diagnostics and supported_metadata.
- Returns:
Mapping of reducer capability flags.
- Return type:
dict
Notes
The default capabilities describe a typical estimator consuming (samples, features) input and exposing transform.
- _attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]
Extract requested attributes from a target object into a dictionary.
This helper filters missing attributes and swallows common access errors (such as deferred scikit-learn properties) to return only what is currently available on the target.
- Parameters:
obj (Any) – Target object to inspect.
attrs (iterable of str) – Attribute names to attempt to extract.
- Returns:
Mapping of available attribute names to their values.
- Return type:
dict
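A tolerant attribute lookup of this kind could be written as below. attribute_dict and ToyModel are hypothetical, and the exact set of exceptions the real helper swallows is an assumption.

```python
from typing import Any, Dict, Iterable


def attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]:
    """Collect the attributes in ``attrs`` that can be read from ``obj`` right now."""
    found: Dict[str, Any] = {}
    for name in attrs:
        try:
            found[name] = getattr(obj, name)
        except (AttributeError, NotImplementedError):
            # Missing or not-yet-available attribute: skip it silently.
            continue
    return found


class ToyModel:
    """Stand-in backend with one ready attribute and one deferred property."""

    n_iter_ = 12

    @property
    def stress_(self):
        # Mimics a property that is only available after fitting.
        raise AttributeError("stress_ is not available before fit")


available = attribute_dict(ToyModel(), ["n_iter_", "stress_", "missing_"])
```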
- get_diagnostics() -> Dict[str, Any]
Return diagnostic arrays or structured artifacts.
Diagnostics are intended for non-scalar outputs such as explained variance curves, eigenvalues, modes, graphs, or training histories. Only names declared in capabilities["supported_diagnostics"] are queried.
- Returns:
diagnostics – Dictionary of diagnostic attributes declared in capabilities["supported_diagnostics"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
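The capability-driven lookup can be sketched as follows. DiagnosticsSketch is hypothetical and its backend is a SimpleNamespace placeholder; the point is only that undeclared attributes are never returned, even when the backend has them.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


class DiagnosticsSketch:
    """Hypothetical reducer exposing only the diagnostics it declares."""

    def __init__(self) -> None:
        self.model = None

    @property
    def capabilities(self) -> Dict[str, Any]:
        return {"supported_diagnostics": ("explained_variance_ratio_",)}

    def fit(self, X: List[List[float]]) -> "DiagnosticsSketch":
        # Placeholder backend; a real reducer would fit an estimator here.
        self.model = SimpleNamespace(
            explained_variance_ratio_=[0.7, 0.2],
            n_samples_seen_=len(X),
        )
        return self

    def get_diagnostics(self) -> Dict[str, Any]:
        if self.model is None:
            raise RuntimeError("Reducer must be fitted before requesting diagnostics.")
        names = self.capabilities["supported_diagnostics"]
        # Only declared names are queried; n_samples_seen_ is ignored.
        return {n: getattr(self.model, n) for n in names if hasattr(self.model, n)}


diagnostics = DiagnosticsSketch().fit([[0.0, 1.0], [1.0, 0.0]]).get_diagnostics()
```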
- get_quality_metadata() -> Dict[str, Any]
Return scalar metadata about the reduction process or quality.
Typical examples include iteration counts, optimization stress, final loss values, or backend-specific convergence flags. Only names declared in capabilities["supported_metadata"] are queried.
- Returns:
metadata – Dictionary containing only scalar values corresponding to keys declared in capabilities["supported_metadata"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
- get_components() -> numpy.ndarray
Return reducer-defined component-like outputs.
- Returns:
Reducer-defined component array.
- Return type:
np.ndarray
- Raises:
ValueError – If the reducer does not expose public components.
- classmethod load(filepath: str | os.PathLike) -> BaseReducer
Load a reducer from a file.
- Parameters:
filepath (str or Path) – Path to the file to load.
- Returns:
reducer – The loaded reducer instance.
- Return type:
BaseReducer
Notes
This method assumes the reducer was serialized with save or a compatible joblib.dump call.
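For a reducer that cannot stay joblib-serializable, the save and load pair can be overridden together. The sketch below is one possible custom strategy, not the package's default: it swaps joblib for plain JSON state, and JsonPersistedReducer with its offsets_ attribute is hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


class JsonPersistedReducer:
    """Hypothetical reducer whose save/load override stores plain JSON state."""

    def __init__(self, n_components: int = 2, **kwargs):
        self.n_components = n_components
        self.params = kwargs
        self.offsets_: Optional[List[float]] = None  # toy "fitted" state

    def fit(self, X: List[List[float]]) -> "JsonPersistedReducer":
        n_rows = len(X)
        self.offsets_ = [sum(col) / n_rows for col in zip(*X)]
        return self

    def save(self, filepath) -> None:
        state = {
            "n_components": self.n_components,
            "params": self.params,
            "offsets_": self.offsets_,
        }
        Path(filepath).write_text(json.dumps(state), encoding="utf-8")

    @classmethod
    def load(cls, filepath) -> "JsonPersistedReducer":
        state = json.loads(Path(filepath).read_text(encoding="utf-8"))
        reducer = cls(n_components=state["n_components"], **state["params"])
        reducer.offsets_ = state["offsets_"]
        return reducer


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reducer.json"
    original = JsonPersistedReducer(n_components=3, whiten=True)
    original.fit([[1.0, 2.0], [3.0, 4.0]]).save(path)
    restored = JsonPersistedReducer.load(path)
```

This only works when the fitted state is JSON-compatible; array-valued state would need an explicit conversion step.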
- class coco_pipe.dim_reduction.reducers.IncrementalPCAReducer(n_components: int = 2, batch_size: int | None = None, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Incremental PCA reducer.
This reducer wraps sklearn.decomposition.IncrementalPCA for batch-wise fitting when the full dataset is too large to process in one pass.
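Batch-wise fitting amounts to feeding the estimator consecutive slices of the data. The sketch below shows one way a caller might slice a dataset before streaming it through partial_fit. iter_batches is a hypothetical helper (scikit-learn ships a comparable sklearn.utils.gen_batches), and CountingReducer is a stand-in that only counts samples rather than fitting anything.

```python
from typing import Iterator, List


def iter_batches(n_samples: int, batch_size: int) -> Iterator[slice]:
    """Yield consecutive slices covering ``range(n_samples)`` (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, n_samples, batch_size):
        yield slice(start, min(start + batch_size, n_samples))


class CountingReducer:
    """Stand-in exposing partial_fit; it only counts the samples it has seen."""

    def __init__(self) -> None:
        self.n_samples_seen_ = 0

    def partial_fit(self, X: List[List[float]], y=None) -> "CountingReducer":
        self.n_samples_seen_ += len(X)
        return self


data = [[float(i), float(i) ** 2] for i in range(45)]
reducer = CountingReducer()
batches = list(iter_batches(len(data), batch_size=20))
for batch in batches:
    reducer.partial_fit(data[batch])  # the last batch holds the 5 leftover rows
```

With the real estimator, a short trailing batch can be a problem: scikit-learn's IncrementalPCA expects each batch to contain at least n_components samples, so leftover rows may need to be merged into the previous batch.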
- Parameters:
n_components (int, default=2) – Number of principal components to keep.
batch_size (int, optional) – Number of samples processed per batch.
**kwargs (dict) – Additional keyword arguments forwarded to IncrementalPCA after signature filtering.
- batch_size
Batch size used when fitting the incremental estimator.
- Type:
int or None
- model
Fitted IncrementalPCA estimator after fit or partial_fit.
- Type:
sklearn.decomposition.IncrementalPCA or None
See also
PCAReducer – Standard in-memory linear PCA reducer.
DaskPCAReducer – Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer – Linear factorization alternative for lazy arrays.
IsomapReducer – Nonlinear manifold learner based on geodesic distances.
TSNEReducer – Nonlinear neighborhood-preserving embedding.
UMAPReducer – Nonlinear graph-based embedding balancing local and global structure.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(100, 12)
>>> reducer = IncrementalPCAReducer(n_components=3, batch_size=25)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 3)
>>> stream = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = stream.partial_fit(X[:50])
>>> _ = stream.partial_fit(X[50:])
>>> stream.transform(X).shape
(100, 2)
- property capabilities: dict
Return capability metadata for Incremental PCA.
- Returns:
Capability mapping describing Incremental PCA as a linear component-based reducer.
- Return type:
dict
- batch_size = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IncrementalPCAReducer
Fit Incremental PCA in batch mode.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
IncrementalPCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=10)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- partial_fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IncrementalPCAReducer
Incrementally fit the estimator on a batch of samples.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Batch of training samples.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Reducer instance after updating the incremental estimator.
- Return type:
IncrementalPCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IncrementalPCAReducer
>>> X = np.random.rand(40, 6)
>>> reducer = IncrementalPCAReducer(n_components=2, batch_size=20)
>>> _ = reducer.partial_fit(X[:20])
>>> _ = reducer.partial_fit(X[20:])
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data onto the fitted incremental PCA basis.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Projected coordinates in component space.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.dim_reduction.reducers.PCAReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Principal Component Analysis reducer.
This reducer wraps sklearn.decomposition.PCA and provides a linear low-dimensional embedding based on singular value decomposition.
- Parameters:
n_components (int, default=2) – Number of principal components to keep.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.decomposition.PCA after signature filtering. Common options include whiten, svd_solver, and random_state.
- model
Fitted PCA estimator after fit.
- Type:
sklearn.decomposition.PCA or None
Notes
This is a deterministic linear reducer unless a randomized solver is used.
See also
IncrementalPCAReducer – Linear PCA variant for batch-wise fitting.
DaskPCAReducer – Linear PCA variant for lazy or distributed arrays.
DaskTruncatedSVDReducer – Linear factorization alternative for lazy arrays.
IsomapReducer – Nonlinear manifold learner based on geodesic distances.
TSNEReducer – Nonlinear neighborhood-preserving embedding.
UMAPReducer – Nonlinear graph-based embedding balancing local and global structure.
PHATEReducer – Nonlinear diffusion-based embedding for smooth trajectories.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(100, 10)
>>> reducer = PCAReducer(n_components=2, random_state=42)
>>> _ = reducer.fit(X)
>>> X_reduced = reducer.transform(X)
>>> X_reduced.shape
(100, 2)
>>> reducer.explained_variance_ratio_.shape
(2,)
>>> reducer.components_.shape
(2, 10)
>>> reducer = PCAReducer(n_components=3, whiten=True)
>>> reducer.fit_transform(X).shape
(100, 3)
- property capabilities: dict
Return capability metadata for PCA.
- Returns:
Capability mapping describing PCA as a linear component-based reducer.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> PCAReducer
Fit PCA on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
PCAReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PCAReducer
>>> X = np.random.rand(20, 5)
>>> reducer = PCAReducer(n_components=2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data onto the fitted principal component basis.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Projected coordinates in principal component space.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property explained_variance_ratio_: numpy.ndarray
Fraction of the total variance explained by each selected component, with values between 0 and 1.
- Returns:
Explained variance ratio for each retained component.
- Return type:
np.ndarray of shape (n_components,)
- Raises:
RuntimeError – If the reducer has not been fitted.
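For reference, the explained variance ratio in standard PCA can be written as below, where the symbols are notation introduced here: \lambda_j are the eigenvalues of the sample covariance matrix in decreasing order and p is the number of features. Because only the leading components are retained, the ratios sum to at most 1.

```latex
r_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j},
\qquad k = 1, \dots, n_{\text{components}},
\qquad 0 \le \sum_{k=1}^{n_{\text{components}}} r_k \le 1
```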
- property components_: numpy.ndarray
Principal axes in feature space.
- Returns:
Principal component loading matrix.
- Return type:
np.ndarray of shape (n_components, n_features)
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.dim_reduction.reducers.IsomapReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Isometric Mapping reducer.
Isomap estimates geodesic distances on a nearest-neighbor graph and then computes a low-dimensional embedding consistent with those distances.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.Isomap after signature filtering. Common options include n_neighbors, metric, p, and eigen_solver.
- model
Fitted Isomap estimator after fit.
- Type:
sklearn.manifold.Isomap or None
See also
LLEReducer – Nonlinear local-neighborhood manifold embedding.
MDSReducer – Distance-preserving manifold embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(100, 10)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:8]).shape
(8, 2)
>>> reducer.n_features_in_
10
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
- property capabilities: dict
Return capability metadata for Isomap.
- Returns:
Capability mapping describing Isomap as a nonlinear reducer with out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> IsomapReducer
Fit Isomap on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
IsomapReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import IsomapReducer
>>> X = np.random.rand(30, 6)
>>> reducer = IsomapReducer(n_components=2, n_neighbors=4)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data into the fitted Isomap embedding space.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property reconstruction_error_: float | None
Return the Isomap reconstruction error.
- Returns:
Reconstruction error returned by the fitted estimator.
- Return type:
float or None
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.dim_reduction.reducers.LLEReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Locally Linear Embedding reducer.
LLE learns a nonlinear embedding by reconstructing each point from its local neighborhood in the input space and preserving those reconstruction weights in the low-dimensional space.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.LocallyLinearEmbedding after signature filtering. Common options include n_neighbors, method, eigen_solver, and random_state.
- model
Fitted LLE estimator after fit.
- Type:
sklearn.manifold.LocallyLinearEmbedding or None
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
MDSReducer – Distance-preserving manifold embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = LLEReducer(n_components=2, n_neighbors=10, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:6]).shape
(6, 2)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
- property capabilities: dict
Return capability metadata for LLE.
- Returns:
Capability mapping describing LLE as a nonlinear reducer with out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> LLEReducer
Fit LLE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
LLEReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import LLEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = LLEReducer(n_components=2, n_neighbors=5, eigen_solver="dense")
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
>>> reducer = LLEReducer(n_components=2, method="modified", n_neighbors=5)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data into the fitted LLE embedding space.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property reconstruction_error_: float
Return the LLE reconstruction error.
- Returns:
Reconstruction error associated with the embedding.
- Return type:
float
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.dim_reduction.reducers.MDSReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Multidimensional Scaling reducer.
MDS seeks a low-dimensional representation whose pairwise distances best match the pairwise distances in the original space.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.MDS after signature filtering. Common options include metric, n_init, max_iter, dissimilarity, and random_state.
- model
Fitted MDS estimator after fit or fit_transform.
- Type:
sklearn.manifold.MDS or None
Notes
transform is not supported because scikit-learn MDS does not provide an out-of-sample projection API.
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
LLEReducer – Nonlinear local-neighborhood embedding.
SpectralEmbeddingReducer – Nonlinear graph Laplacian embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(60, 8)
>>> reducer = MDSReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(60, 2)
>>> reducer.stress_ >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- property capabilities: dict
Return capability metadata for MDS.
- Returns:
Capability mapping describing MDS as a nonlinear reducer without out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> MDSReducer
Fit MDS on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
MDSReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(25, 5)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because scikit-learn MDS does not support out-of-sample transform.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because MDS does not support transforming new data.
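Calling code can avoid the exception by consulting the capability flags first. The sketch below assumes the reducer reports has_transform as False in its capabilities; NoTransformSketch and embed are hypothetical names, and the placeholder embedding simply keeps the first two columns.

```python
from typing import Any, Dict, List, Optional, Tuple

Matrix = List[List[float]]


class NoTransformSketch:
    """Hypothetical reducer mirroring a fit_transform-only method such as MDS."""

    @property
    def capabilities(self) -> Dict[str, Any]:
        return {"has_transform": False}

    def fit_transform(self, X: Matrix, y=None) -> Matrix:
        return [row[:2] for row in X]  # placeholder embedding

    def transform(self, X: Matrix) -> Matrix:
        raise NotImplementedError("This reducer cannot embed unseen samples.")


def embed(reducer: Any, X_train: Matrix, X_new: Optional[Matrix] = None) -> Tuple[Matrix, Optional[Matrix]]:
    """Embed X_new when supported; otherwise return the training embedding only."""
    train_embedding = reducer.fit_transform(X_train)
    if X_new is None or not reducer.capabilities.get("has_transform", False):
        return train_embedding, None
    return train_embedding, reducer.transform(X_new)


train, new = embed(NoTransformSketch(), [[1.0, 2.0, 3.0]], X_new=[[4.0, 5.0, 6.0]])
```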
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit MDS and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by MDS.
- Return type:
np.ndarray of shape (n_samples, n_components)
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import MDSReducer
>>> X = np.random.rand(20, 4)
>>> reducer = MDSReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)
- property stress_: float
Return the MDS stress: the sum of squared differences between the embedded distances and the target dissimilarities.
- Returns:
Stress value returned by the fitted MDS model.
- Return type:
float
- Raises:
RuntimeError – If the reducer has not been fitted.
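In its usual raw form, the stress of an embedding Y with points y_i is written as below, where \delta_{ij} denotes the target dissimilarity between samples i and j (notation introduced here). This is the textbook definition; depending on the scikit-learn version and its normalized_stress option, the reported value may instead be a normalized variant.

```latex
\sigma(Y) = \sum_{i < j} \bigl( d_{ij}(Y) - \delta_{ij} \bigr)^2,
\qquad d_{ij}(Y) = \lVert y_i - y_j \rVert_2
```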
- class coco_pipe.dim_reduction.reducers.SpectralEmbeddingReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Spectral Embedding reducer.
Spectral Embedding computes a nonlinear embedding using eigenvectors of the graph Laplacian built from the data affinity graph.
- Parameters:
n_components (int, default=2) – Number of coordinates for the manifold.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.SpectralEmbedding after signature filtering. Common options include affinity, gamma, random_state, eigen_solver, and n_neighbors.
- model
Fitted spectral embedding estimator after fit or fit_transform.
- Type:
sklearn.manifold.SpectralEmbedding or None
Notes
transform is not supported because scikit-learn SpectralEmbedding does not provide an out-of-sample projection API.
See also
IsomapReducer – Nonlinear geodesic-distance embedding.
LLEReducer – Nonlinear local-neighborhood embedding.
MDSReducer – Distance-preserving manifold embedding.
PCAReducer – Linear baseline for global variance preservation.
UMAPReducer – Nonlinear graph-based embedding for local and global structure.
TSNEReducer – Nonlinear neighborhood-preserving visualization method.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(80, 10)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(80, 2)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- property capabilities: dict
Return capability metadata for Spectral Embedding.
- Returns:
Capability mapping describing Spectral Embedding as a nonlinear reducer without out-of-sample transform support.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> SpectralEmbeddingReducer
Fit Spectral Embedding on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
SpectralEmbeddingReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(30, 6)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because scikit-learn Spectral Embedding lacks out-of-sample transform.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because Spectral Embedding does not support transforming new data.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit Spectral Embedding and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by Spectral Embedding.
- Return type:
np.ndarray of shape (n_samples, n_components)
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import SpectralEmbeddingReducer
>>> X = np.random.rand(20, 4)
>>> reducer = SpectralEmbeddingReducer(n_components=2, random_state=0)
>>> reducer.fit_transform(X).shape
(20, 2)
- class coco_pipe.dim_reduction.reducers.TSNEReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
t-SNE reducer.
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a neighborhood-preserving method designed primarily for visualization. It optimizes a low-dimensional embedding by matching pairwise similarities between the original space and the embedding.
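In the standard formulation, matching pairwise similarities means minimizing the Kullback-Leibler divergence between the input-space similarities p_{ij} and the embedding similarities q_{ij}, which use a Student t kernel over the embedded points y_i. The notation is introduced here for illustration.

```latex
\mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},
\qquad
q_{ij} = \frac{\bigl(1 + \lVert y_i - y_j \rVert^2\bigr)^{-1}}
              {\sum_{k \neq l} \bigl(1 + \lVert y_k - y_l \rVert^2\bigr)^{-1}}
```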
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.TSNE after signature filtering. Common options include perplexity, learning_rate, max_iter, init, and random_state.
- embedding_
Learned training-set embedding after fit or fit_transform.
- Type:
np.ndarray or None
- model
Fitted t-SNE estimator after fit or fit_transform.
- Type:
sklearn.manifold.TSNE or None
Notes
transform is not supported because scikit-learn t-SNE does not provide an out-of-sample projection API.
See also
UMAPReducer – Nonlinear graph-based embedding with transform support.
PacmapReducer – Nonlinear embedding balancing local and global structure.
TrimapReducer – Nonlinear triplet-based embedding preserving global layout.
PHATEReducer – Diffusion-based embedding for continuous trajectories.
PCAReducer – Linear baseline for global variance preservation.
IsomapReducer – Nonlinear geodesic-distance manifold embedding.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = TSNEReducer(n_components=2, perplexity=20, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
>>> reducer.get_quality_metadata()["kl_divergence_"] >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.embedding_.shape
(100, 2)
- property capabilities: dict
Return capability metadata for t-SNE.
- Returns:
Capability mapping describing t-SNE as a nonlinear stochastic reducer without out-of-sample transform support.
- Return type:
dict
- embedding_ = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> TSNEReducer
Fit t-SNE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
TSNEReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = TSNEReducer(n_components=2, perplexity=5, max_iter=250)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because t-SNE does not support out-of-sample transformation.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because t-SNE does not support transforming new data.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit t-SNE and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by t-SNE.
- Return type:
np.ndarray of shape (n_samples, n_components)