coco_pipe.dim_reduction.reducers.neighbor
Neighbor-embedding and graph-based reducers.
This module provides wrappers for neighborhood-preserving and graph-based nonlinear dimensionality reduction methods, including t-SNE, UMAP, PaCMAP, TriMap, PHATE, and Parametric UMAP.
Classes
- TSNEReducer
t-Distributed Stochastic Neighbor Embedding wrapper.
- UMAPReducer
Uniform Manifold Approximation and Projection wrapper.
- PacmapReducer
Pairwise Controlled Manifold Approximation wrapper.
- TrimapReducer
Triplet-based manifold embedding wrapper.
- PHATEReducer
Diffusion-based PHATE embedding wrapper.
- ParametricUMAPReducer
Neural-network-backed Parametric UMAP wrapper.
References
- Authors: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca), Sina Esmaeili (sina.esmaeili@umontreal.ca)
Module Contents
- class coco_pipe.dim_reduction.reducers.neighbor.TSNEReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
t-SNE reducer.
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a neighborhood-preserving method designed primarily for visualization. It optimizes a low-dimensional embedding by matching pairwise similarities between the original space and the embedding.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
**kwargs (dict) – Additional keyword arguments forwarded to sklearn.manifold.TSNE after signature filtering. Common options include perplexity, learning_rate, max_iter, init, and random_state.
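The "signature filtering" mentioned above can be pictured as dropping any keyword the backend constructor does not accept. The following is only a sketch of that idea using the standard library; `filter_kwargs` and `FakeEstimator` are hypothetical names, and the actual coco_pipe implementation may differ.

```python
import inspect


def filter_kwargs(target, kwargs):
    """Keep only the keyword arguments that `target` accepts by name."""
    params = inspect.signature(target).parameters
    # If the target itself takes **kwargs, nothing needs to be dropped.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


class FakeEstimator:
    """Hypothetical stand-in for a backend such as sklearn.manifold.TSNE."""

    def __init__(self, n_components=2, perplexity=30.0, random_state=None):
        self.n_components = n_components
        self.perplexity = perplexity
        self.random_state = random_state


requested = {"perplexity": 20, "random_state": 42, "not_a_real_option": True}
accepted = filter_kwargs(FakeEstimator, requested)
model = FakeEstimator(n_components=2, **accepted)
print(sorted(accepted))  # ['perplexity', 'random_state']
```

Under this reading, a misspelled option would be dropped silently rather than raising, so it is worth double-checking keyword names against the backend's documentation.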
- embedding_
Learned training-set embedding after fit or fit_transform.
- Type:
np.ndarray or None
- model
Fitted t-SNE estimator after fit or fit_transform.
- Type:
sklearn.manifold.TSNE or None
Notes
transform is not supported because scikit-learn t-SNE does not provide an out-of-sample projection API.
See also
UMAPReducer: Nonlinear graph-based embedding with transform support.
PacmapReducer: Nonlinear embedding balancing local and global structure.
TrimapReducer: Nonlinear triplet-based embedding preserving global layout.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
PCAReducer: Linear baseline for global variance preservation.
IsomapReducer: Nonlinear geodesic-distance manifold embedding.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = TSNEReducer(n_components=2, perplexity=20, random_state=42)
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
>>> reducer.get_quality_metadata()["kl_divergence_"] >= 0
True
>>> _ = reducer.fit(X)
>>> reducer.embedding_.shape
(100, 2)
- property capabilities: dict
Return capability metadata for t-SNE.
- Returns:
Capability mapping describing t-SNE as a nonlinear stochastic reducer without out-of-sample transform support.
- Return type:
dict
- embedding_ = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> TSNEReducer
Fit t-SNE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
TSNEReducer
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TSNEReducer
>>> X = np.random.rand(30, 6)
>>> reducer = TSNEReducer(n_components=2, perplexity=5, max_iter=250)
>>> _ = reducer.fit(X)
>>> reducer.model is not None
True
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because t-SNE does not support out-of-sample transformation.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because t-SNE does not support transforming new data.
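Because transform always raises here, calling code that must place new rows in the embedding has to refit. One possible pattern is sketched below with a hypothetical stand-in class (`NoTransformReducer`) and a hypothetical helper (`embed_with_fallback`); neither is part of coco_pipe, and the placeholder "embedding" simply keeps the first two features.

```python
class NoTransformReducer:
    """Hypothetical stand-in mimicking a reducer without out-of-sample support."""

    def fit_transform(self, X):
        # Placeholder embedding: keep the first two features of each row.
        self.embedding_ = [row[:2] for row in X]
        return self.embedding_

    def transform(self, X):
        raise NotImplementedError("out-of-sample transform is not supported")


def embed_with_fallback(reducer, X_train, X_new):
    """Try transform; if unsupported, refit on train + new rows and slice."""
    reducer.fit_transform(X_train)
    try:
        return reducer.transform(X_new)
    except NotImplementedError:
        combined = reducer.fit_transform(list(X_train) + list(X_new))
        return combined[len(X_train):]


X_train = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
X_new = [[6.0, 7.0, 8.0]]
result = embed_with_fallback(NoTransformReducer(), X_train, X_new)
print(result)  # [[6.0, 7.0]]
```

With a real stochastic method such as t-SNE, refitting on the combined data also moves the training points, so the fallback is not a drop-in replacement for a true out-of-sample projection.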
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit t-SNE and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by t-SNE.
- Return type:
np.ndarray of shape (n_samples, n_components)
- class coco_pipe.dim_reduction.reducers.neighbor.UMAPReducer(n_components: int = 2, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
UMAP reducer.
Uniform Manifold Approximation and Projection (UMAP) constructs a graph in the high-dimensional space and optimizes a low-dimensional representation of that graph. Unlike t-SNE, UMAP supports out-of-sample transformation.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
**kwargs (dict) – Additional keyword arguments forwarded to umap.UMAP after signature filtering. Common options include n_neighbors, min_dist, metric, and random_state.
- model
Fitted UMAP estimator after fit.
- Type:
umap.UMAP or None
See also
TSNEReducer: Nonlinear neighborhood-preserving visualization method.
PacmapReducer: Nonlinear embedding balancing local and global structure.
TrimapReducer: Nonlinear triplet-based embedding preserving global layout.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
IsomapReducer: Nonlinear geodesic-distance manifold embedding.
PCAReducer: Linear baseline for global variance preservation.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import UMAPReducer
>>> X = np.random.rand(100, 10)
>>> reducer = UMAPReducer(n_components=2, n_neighbors=10, random_state=42)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 2)
>>> reducer.get_diagnostics()["graph_"] is not None
True
>>> reducer.fit_transform(X).shape
(100, 2)
- property capabilities: dict
Return capability metadata for UMAP.
- Returns:
Capability mapping describing UMAP as a nonlinear stochastic reducer with transform support and a native plotting path.
- Return type:
dict
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> UMAPReducer
Fit UMAP on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Optional supervision supported by UMAP.
- Returns:
Fitted reducer instance.
- Return type:
UMAPReducer
- Raises:
ImportError – If umap-learn is not installed.
RuntimeError – If umap-learn is installed but fails during initialization.
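The two failure modes listed above (backend missing versus backend present but failing to initialize) can be separated with a lazy import followed by a guarded constructor call. This is a sketch under that assumption; `load_backend` and `build_model` are hypothetical helpers, and the stdlib `json` module stands in for an optional backend.

```python
import importlib


def load_backend(module_name):
    """Import an optional backend, raising ImportError with a clear message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this reducer but is not installed."
        ) from exc


def build_model(factory, **kwargs):
    """Instantiate a backend model, wrapping unexpected failures."""
    try:
        return factory(**kwargs)
    except Exception as exc:
        raise RuntimeError(f"backend failed during initialization: {exc}") from exc


json_module = load_backend("json")  # stdlib module used as a stand-in backend

try:
    load_backend("surely_missing_backend_xyz")
except ImportError as exc:
    import_error = str(exc)


def broken_factory(**kwargs):
    # Simulates a backend that is installed but rejects its configuration.
    raise ValueError("bad configuration")


try:
    build_model(broken_factory, n_components=2)
except RuntimeError as exc:
    runtime_error = str(exc)
```

Callers can then catch ImportError to suggest installing the package and RuntimeError to report a configuration or environment problem.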
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data using the fitted UMAP model.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
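The RuntimeError above corresponds to a fit-before-transform guard. A minimal sketch of such a guard follows; `GuardedReducer` is a hypothetical stand-in whose "model" is just a list of feature means, and it assumes the unfitted state is represented by `model` being None, as the attribute description suggests.

```python
class GuardedReducer:
    """Hypothetical stand-in showing a fit-before-transform guard."""

    def __init__(self):
        self.model = None  # set by fit, mirroring the `model` attribute

    def fit(self, X):
        # Placeholder "model": remember the per-feature means of the training rows.
        n = len(X)
        self.model = [sum(col) / n for col in zip(*X)]
        return self

    def transform(self, X):
        if self.model is None:
            raise RuntimeError("Reducer must be fitted before calling transform.")
        # Placeholder projection: center each row with the stored means.
        return [[v - m for v, m in zip(row, self.model)] for row in X]


reducer = GuardedReducer()
try:
    reducer.transform([[1.0, 2.0]])
except RuntimeError as exc:
    message = str(exc)

centered = reducer.fit([[1.0, 2.0], [3.0, 4.0]]).transform([[1.0, 2.0]])
print(centered)  # [[-1.0, -1.0]]
```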
- class coco_pipe.dim_reduction.reducers.neighbor.PacmapReducer(n_components: int = 2, n_neighbors: int = 10, MN_ratio: float = 0.5, FP_ratio: float = 2.0, nn_backend: str = 'faiss', init: str = 'pca', **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
PaCMAP reducer.
Pairwise Controlled Manifold Approximation (PaCMAP) preserves local and global structure by balancing near, mid-near, and far pairs during the optimization.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
n_neighbors (int, default=10) – Number of neighbors used to form local pairs.
MN_ratio (float, default=0.5) – Ratio of mid-near pairs.
FP_ratio (float, default=2.0) – Ratio of far pairs.
nn_backend ({"faiss", "annoy", "voyager"}, default="faiss") – Nearest-neighbor backend used by recent PaCMAP versions. Older PaCMAP releases that do not expose this argument will ignore it through signature filtering.
init (str, default="pca") – Initialization strategy passed to fit_transform.
**kwargs (dict) – Additional keyword arguments forwarded to pacmap.PaCMAP after signature filtering.
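To make the two ratios concrete: in the upstream pacmap package, as far as I am aware, MN_ratio and FP_ratio are multiplied by n_neighbors and rounded to give the number of mid-near and far pairs sampled per point. The sketch below only reproduces that arithmetic; `pacmap_pair_counts` is a hypothetical helper and the rounding rule is an assumption about the backend.

```python
def pacmap_pair_counts(n_neighbors=10, MN_ratio=0.5, FP_ratio=2.0):
    """Per-point pair counts implied by the ratios (assumed rounding rule)."""
    n_mid_near = int(round(n_neighbors * MN_ratio))
    n_far = int(round(n_neighbors * FP_ratio))
    return {"near": n_neighbors, "mid_near": n_mid_near, "far": n_far}


counts = pacmap_pair_counts()
print(counts)  # {'near': 10, 'mid_near': 5, 'far': 20}
```

With the defaults documented above, each point therefore contributes roughly 10 near pairs, 5 mid-near pairs, and 20 far pairs to the optimization.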
- embedding_
Learned training-set embedding after fit or fit_transform.
- Type:
np.ndarray or None
- model
Fitted PaCMAP estimator after fit or fit_transform.
- Type:
pacmap.PaCMAP or None
Notes
transform is not supported because PaCMAP does not provide an efficient out-of-sample projection API.
See also
UMAPReducer: Nonlinear graph-based embedding with transform support.
TrimapReducer: Nonlinear triplet-based embedding preserving global layout.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
PCAReducer: Linear baseline for global variance preservation.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PacmapReducer
>>> X = np.random.rand(100, 10)
>>> reducer = PacmapReducer(
...     n_components=2,
...     n_neighbors=10,
...     nn_backend="faiss",
...     init="random",
... )
>>> embedding = reducer.fit_transform(X)
>>> embedding.shape
(100, 2)
>>> reducer.embedding_.shape
(100, 2)
- property capabilities: dict
Return capability metadata for PaCMAP.
- Returns:
Capability mapping describing PaCMAP as a nonlinear stochastic reducer without out-of-sample transform support.
- Return type:
dict
- n_neighbors = 10
- MN_ratio = 0.5
- FP_ratio = 2.0
- nn_backend = 'faiss'
- init = 'pca'
- embedding_ = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> PacmapReducer
Fit PaCMAP on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
PacmapReducer
- Raises:
ImportError – If pacmap is not installed.
RuntimeError – If pacmap is installed but fails during initialization.
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because PaCMAP does not support out-of-sample transformation.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because PaCMAP does not support transforming new data without refitting.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit PaCMAP and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by PaCMAP.
- Return type:
np.ndarray of shape (n_samples, n_components)
- class coco_pipe.dim_reduction.reducers.neighbor.TrimapReducer(n_components: int = 2, n_inliers: int = 10, n_outliers: int = 5, n_random: int = 5, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
TriMap reducer.
TriMap uses triplet constraints to preserve relative similarities while emphasizing global layout preservation.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
n_inliers (int, default=10) – Number of nearest-neighbor inlier triplets.
n_outliers (int, default=5) – Number of outlier triplets.
n_random (int, default=5) – Number of random triplets per sample.
**kwargs (dict) – Additional keyword arguments forwarded to trimap.TRIMAP after signature filtering.
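The three counts above determine how many triplets TriMap optimizes. Assuming the upstream sampling scheme, in which every inlier of a point is combined with n_outliers outliers and n_random further triplets are drawn per point, the triplet budget can be estimated as follows. `trimap_triplet_budget` is a hypothetical helper, and the formula is my reading of the backend rather than something stated in this module.

```python
def trimap_triplet_budget(n_samples, n_inliers=10, n_outliers=5, n_random=5):
    """Estimate triplet counts under the assumed TriMap sampling scheme."""
    # Nearest-neighbor triplets: each inlier is paired with n_outliers outliers.
    per_point = n_inliers * n_outliers + n_random
    return {"per_point": per_point, "total": n_samples * per_point}


budget = trimap_triplet_budget(100)
print(budget)  # {'per_point': 55, 'total': 5500}
```

Under this estimate the budget grows linearly with the number of samples but multiplicatively in n_inliers and n_outliers, which is worth keeping in mind when raising either value.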
- embedding_
Learned training-set embedding after fit or fit_transform.
- Type:
np.ndarray or None
- model
Fitted TriMap estimator after fit or fit_transform.
- Type:
trimap.TRIMAP or None
Notes
transform is not supported because TriMap does not provide an out-of-sample projection API.
See also
UMAPReducer: Nonlinear graph-based embedding with transform support.
PacmapReducer: Nonlinear embedding balancing local and global structure.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
IsomapReducer: Nonlinear geodesic-distance manifold embedding.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import TrimapReducer
>>> X = np.random.rand(100, 10)
>>> reducer = TrimapReducer(n_components=2)
>>> reducer.fit_transform(X).shape
(100, 2)
- property capabilities: dict
Return capability metadata for TriMap.
- Returns:
Capability mapping describing TriMap as a nonlinear stochastic reducer without out-of-sample transform support.
- Return type:
dict
- n_inliers = 10
- n_outliers = 5
- n_random = 5
- embedding_ = None
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> TrimapReducer
Fit TriMap on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
TrimapReducer
- Raises:
ImportError – If trimap is not installed.
RuntimeError – If trimap is installed but fails during initialization.
- abstract transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Raise because TriMap does not support out-of-sample transformation.
- Parameters:
X (ArrayLike) – Ignored input included for API compatibility.
- Raises:
NotImplementedError – Always raised because TriMap does not support transforming new data without refitting.
- fit_transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> numpy.ndarray
Fit TriMap and return the embedding coordinates.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Embedded coordinates produced by TriMap.
- Return type:
np.ndarray of shape (n_samples, n_components)
- class coco_pipe.dim_reduction.reducers.neighbor.PHATEReducer(n_components: int = 2, knn: int = 5, decay: int = 40, t: Any = 'auto', **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
PHATE reducer.
Potential of Heat-diffusion for Affinity-based Transition Embedding (PHATE) is designed for data with continuous progression structure and uses diffusion-based distances to construct the embedding.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
knn (int, default=5) – Number of nearest neighbors used in the kernel graph.
decay (int, default=40) – Decay rate for the kernel.
t (int or str, default="auto") – Diffusion time.
**kwargs (dict) – Additional keyword arguments forwarded to phate.PHATE after signature filtering.
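To give some intuition for how knn and decay interact: the PHATE paper describes an alpha-decay kernel in which the bandwidth at each point is its distance to the knn-th neighbor and decay is the exponent alpha. The sketch below evaluates that published formula for a single pair; `alpha_decay_affinity` is a hypothetical helper, and the backend's actual graph construction (thresholding, normalization) may differ.

```python
import math


def alpha_decay_affinity(dist, eps_x, eps_y, decay=40):
    """Symmetrized alpha-decay affinity for one pair of points.

    dist  : distance between the two points
    eps_x : distance from x to its knn-th nearest neighbor (bandwidth at x)
    eps_y : distance from y to its knn-th nearest neighbor (bandwidth at y)
    decay : exponent alpha; larger values give a sharper cutoff
    """
    term_x = math.exp(-((dist / eps_x) ** decay))
    term_y = math.exp(-((dist / eps_y) ** decay))
    return 0.5 * term_x + 0.5 * term_y


inside = alpha_decay_affinity(0.5, 1.0, 1.0)    # well within the bandwidth
boundary = alpha_decay_affinity(1.0, 1.0, 1.0)  # exactly at the bandwidth
outside = alpha_decay_affinity(1.5, 1.0, 1.0)   # beyond the bandwidth
```

With decay=40 the affinity is close to 1 inside the bandwidth, equals exp(-1) at the bandwidth, and is effectively 0 beyond it, so the kernel behaves almost like a hard knn cutoff; smaller decay values give heavier tails.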
- model
Fitted PHATE estimator after fit.
- Type:
phate.PHATE or None
See also
UMAPReducer: Nonlinear graph-based embedding with transform support.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.
PacmapReducer: Nonlinear embedding balancing local and global structure.
TrimapReducer: Nonlinear triplet-based embedding preserving global layout.
ParametricUMAPReducer: Neural-network-backed UMAP approximation.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import PHATEReducer
>>> X = np.random.rand(100, 10)
>>> reducer = PHATEReducer(n_components=2, knn=5)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 2)
>>> reducer.get_diagnostics()["diff_potential"] is not None
True
- property capabilities: dict
Return capability metadata for PHATE.
- Returns:
Capability mapping describing PHATE as a nonlinear reducer with transform support and a native plotting path.
- Return type:
dict
- knn = 5
- decay = 40
- t = 'auto'
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> PHATEReducer
Fit PHATE on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Ignored. Present for API compatibility.
- Returns:
Fitted reducer instance.
- Return type:
PHATEReducer
- Raises:
ImportError – If phate is not installed.
RuntimeError – If phate is installed but fails during initialization.
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data using the fitted PHATE model.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- class coco_pipe.dim_reduction.reducers.neighbor.ParametricUMAPReducer(n_components: int = 2, n_neighbors: int = 15, min_dist: float = 0.1, metric: str = 'euclidean', n_epochs: int | None = None, batch_size: int = 1000, verbose: bool = False, **kwargs)
Bases: coco_pipe.dim_reduction.reducers.base.BaseReducer
Parametric UMAP reducer.
Parametric UMAP learns a neural network that approximates the UMAP embedding, enabling reusable out-of-sample projection through the trained network.
- Parameters:
n_components (int, default=2) – Number of embedding dimensions.
n_neighbors (int, default=15) – Size of the local neighborhood.
min_dist (float, default=0.1) – Effective minimum distance between embedded points.
metric (str, default="euclidean") – Metric used for distance computation.
n_epochs (int, optional) – Number of training epochs.
batch_size (int, default=1000) – Batch size used during training.
verbose (bool, default=False) – Whether to print backend training progress.
**kwargs (dict) – Additional keyword arguments forwarded to umap.parametric_umap.ParametricUMAP after signature filtering.
- model
Fitted Parametric UMAP estimator after fit.
- Type:
umap.parametric_umap.ParametricUMAP or None
See also
UMAPReducer: Non-parametric UMAP with graph-based transform support.
TSNEReducer: Nonlinear neighborhood-preserving visualization method.
PHATEReducer: Diffusion-based embedding for continuous trajectories.
IVISReducer: Neural metric-learning-based embedding.
Examples
>>> import numpy as np
>>> from coco_pipe.dim_reduction import ParametricUMAPReducer
>>> X = np.random.rand(50, 10).astype(np.float32)
>>> reducer = ParametricUMAPReducer(n_components=2, n_epochs=5, verbose=False)
>>> _ = reducer.fit(X)
>>> reducer.transform(X[:10]).shape
(10, 2)
- property capabilities: dict
Return capability metadata for Parametric UMAP.
- Returns:
Capability mapping describing Parametric UMAP as a nonlinear stochastic reducer with transform support.
- Return type:
dict
- n_neighbors = 15
- min_dist = 0.1
- metric = 'euclidean'
- n_epochs = None
- batch_size = 1000
- verbose = False
- fit(X: coco_pipe.dim_reduction.reducers.base.ArrayLike, y: coco_pipe.dim_reduction.reducers.base.ArrayLike | None = None) -> ParametricUMAPReducer
Fit Parametric UMAP on the input data.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Training data.
y (ArrayLike, optional) – Optional supervision supported by Parametric UMAP.
- Returns:
Fitted reducer instance.
- Return type:
ParametricUMAPReducer
- Raises:
ImportError – If umap-learn is not installed.
RuntimeError – If umap-learn is installed but fails during initialization.
- transform(X: coco_pipe.dim_reduction.reducers.base.ArrayLike) -> numpy.ndarray
Project data using the fitted Parametric UMAP model.
- Parameters:
X (ArrayLike of shape (n_samples, n_features)) – Data to project.
- Returns:
Low-dimensional embedding coordinates.
- Return type:
np.ndarray of shape (n_samples, n_components)
- Raises:
RuntimeError – If the reducer has not been fitted.
- property loss_history_: list
Training loss history for the parametric model.
- Returns:
Recorded loss values across training epochs.
- Return type:
list
- Raises:
RuntimeError – If the reducer has not been fitted.
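One use of the recorded loss values is a quick check of whether training was still improving when it stopped. The sketch below assumes the list holds one float per epoch, which this page does not state explicitly; `relative_improvement` is a hypothetical helper and the loss values are made up for illustration.

```python
def relative_improvement(losses, window=3):
    """Relative drop in mean loss between the first and last `window` epochs."""
    if len(losses) < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    head = sum(losses[:window]) / window
    tail = sum(losses[-window:]) / window
    return (head - tail) / head


# Made-up values standing in for reducer.loss_history_ after fitting.
example_losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.43]
improvement = relative_improvement(example_losses)
print(f"{improvement:.3f}")  # 0.425
```

A value near zero suggests the loss has plateaued, while a large value on a short run may indicate that more epochs (n_epochs) would still help.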