coco_pipe.dim_reduction.reducers.base¶
Base interfaces for dimensionality reduction backends.
This module defines the reducer contract shared by built-in reducers and user-defined reducers. A reducer is any object derived from BaseReducer implementing fit and transform, optionally exposing diagnostics and scalar quality metadata through helper methods.
The surrounding dim-reduction stack uses these interfaces to provide:
- input validation through the reducer capabilities mapping
- standardized persistence with save and load
- reducer-aware reporting and visualization hooks
- optional dependency loading through coco_pipe.utils.import_optional_dependency
Notes
BaseReducer is the intended extension point for custom reducers. Third-party reducers can participate in DimReduction workflows without extra wrappers as long as they respect the method contract documented here.
Attributes¶
Classes¶
BaseReducer | Abstract base class for all dimensionality reduction implementations.
Module Contents¶
- coco_pipe.dim_reduction.reducers.base.ArrayLike¶
- class coco_pipe.dim_reduction.reducers.base.BaseReducer(n_components: int = 2, **kwargs)[source]¶
Bases: abc.ABC
Abstract base class for all dimensionality reduction implementations.
This class defines the standard interface that all reducers must implement and is safe to subclass for custom reducers. It provides built-in support for model persistence (save/load) using joblib.
For custom reducers operating on nonstandard data layouts, override capabilities so the manager layer can route validation, scoring, plotting, and reporting correctly.
- Parameters:
n_components (int, default=2) – Target dimensionality of the reduced representation.
**kwargs (dict) – Additional keyword arguments stored on params and typically forwarded to the wrapped estimator or backend implementation.
- n_components¶
Target dimensionality of the reduced representation.
- Type:
int
- params¶
Additional reducer parameters captured at initialization time.
- Type:
dict
- model¶
Underlying fitted model object, such as a scikit-learn estimator or a scientific computing backend. This attribute should be populated by fit.
- Type:
Any
Notes
The capabilities property returns a plain dictionary consumed by the manager and evaluation layers. Custom reducers should declare supported diagnostics and scalar metadata explicitly through this mapping. Common keys include:
- input_ndim : expected dimensionality of the input container
- input_layout : semantic layout name such as "standard"
- has_transform : whether transform is supported
- has_inverse_transform : whether inverse transforms are available
- has_components : whether PCA-like components are exposed
- supported_diagnostics : names returned by get_diagnostics
- has_native_plot : whether the reducer exposes its own plotting path
- is_linear : whether the reducer is linear
- is_stochastic : whether repeated runs can vary without a fixed seed
Examples
>>> from sklearn.decomposition import PCA
>>> from coco_pipe.dim_reduction import BaseReducer
>>>
>>> class CustomPCAReducer(BaseReducer):
...     @property
...     def capabilities(self):
...         return self._merge_capabilities(
...             super().capabilities,
...             is_linear=True,
...             has_components=True,
...             supported_diagnostics=("explained_variance_ratio_",),
...         )
...
...     def fit(self, X, y=None):
...         self.model = PCA(n_components=self.n_components, **self.params)
...         self.model.fit(X)
...         return self
...
...     def transform(self, X):
...         return self.model.transform(X)
- n_components = 2¶
- params¶
- model = None¶
- context_: Dict[str, Any]¶
- property name: str¶
Return a stable public display name for the reducer.
- _filter_params(fn_or_class: Any, params: dict) dict[source]¶
Filter parameters to match the signature of a function or class.
- Parameters:
fn_or_class (Any) – The function or class to inspect.
params (dict) – The parameters to filter.
- Returns:
filtered_params – Parameters present in the signature. If the target accepts **kwargs or its signature cannot be inspected, the original parameter dictionary is returned unchanged.
- Return type:
dict
Notes
This is a convenience helper for reducer implementations that wrap third-party estimators with partially overlapping constructor signatures.
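The filtering rule described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: filter_params and ToyEstimator are hypothetical stand-ins, and the exact exceptions treated as "cannot be inspected" are an assumption.

```python
import inspect
from typing import Any


def filter_params(fn_or_class: Any, params: dict) -> dict:
    """Hypothetical stand-in for BaseReducer._filter_params."""
    try:
        signature = inspect.signature(fn_or_class)
    except (TypeError, ValueError):
        # Assumption: an uninspectable signature means "pass everything".
        return params
    parameters = signature.parameters.values()
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters):
        # The target accepts **kwargs, so no key needs to be dropped.
        return params
    return {k: v for k, v in params.items() if k in signature.parameters}


class ToyEstimator:
    """Hypothetical backend with a narrow constructor."""

    def __init__(self, n_components=2, whiten=False):
        self.n_components = n_components
        self.whiten = whiten


def accepts_anything(**kwargs):
    return kwargs


params = {"whiten": True, "perplexity": 30}
filtered = filter_params(ToyEstimator, params)        # drops "perplexity"
unfiltered = filter_params(accepts_anything, params)  # returned unchanged
```

Dropping unknown keys lets a single params dictionary be shared across backends whose constructors only partially overlap.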
- _build_estimator(estimator_cls: Any, params: dict | None = None, component_param: str | None = 'n_components', **fixed_kwargs: Any) Any[source]¶
Instantiate an estimator with filtered reducer parameters.
- Parameters:
estimator_cls (Any) – Estimator class to instantiate.
params (dict, optional) – Explicit parameter dictionary to filter instead of self.params.
component_param (str or None, default="n_components") – Name of the constructor argument receiving self.n_components. Set to None to skip injecting the component count.
**fixed_kwargs (dict) – Keyword arguments always forwarded to the estimator constructor.
- Returns:
Instantiated estimator.
- Return type:
Any
Notes
This helper assumes the wrapped backend is constructor-driven and can be configured from keyword arguments.
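As a rough sketch of the constructor-driven pattern described above, the following standalone function filters parameters, injects the component count, and applies fixed keyword arguments. build_estimator and ToyEmbedding are hypothetical names, and the precedence order (fixed keyword arguments applied last) is an assumption rather than documented behavior.

```python
import inspect
from typing import Any, Optional


def build_estimator(
    estimator_cls: Any,
    n_components: int,
    params: dict,
    component_param: Optional[str] = "n_components",
    **fixed_kwargs: Any,
) -> Any:
    """Hypothetical stand-in for BaseReducer._build_estimator."""
    accepted = inspect.signature(estimator_cls).parameters
    # Keep only the parameters the constructor actually declares.
    kwargs = {k: v for k, v in params.items() if k in accepted}
    if component_param is not None:
        # Route the reducer's component count to the backend's own name.
        kwargs[component_param] = n_components
    # Assumption: fixed keyword arguments take precedence over params.
    kwargs.update(fixed_kwargs)
    return estimator_cls(**kwargs)


class ToyEmbedding:
    """Hypothetical backend that calls its component count 'dim'."""

    def __init__(self, dim=2, n_neighbors=15, random_state=None):
        self.dim = dim
        self.n_neighbors = n_neighbors
        self.random_state = random_state


est = build_estimator(
    ToyEmbedding,
    3,
    {"n_neighbors": 5, "unknown_option": 1},
    component_param="dim",
    random_state=0,
)
```

Here component_param="dim" shows why the argument exists: backends do not always name their target dimensionality n_components.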
- _require_fitted(method_name: str = 'transform', model: Any = None) Any[source]¶
Validate that a reducer backend has been fitted before access.
- Parameters:
method_name (str, default="transform") – Operation requiring a fitted model.
model (Any, optional) – Backend model to check. Defaults to self.model.
- Returns:
The validated model instance.
- Return type:
Any
- Raises:
RuntimeError – If no fitted model is available.
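A minimal sketch of such a guard is shown below, assuming that "fitted" simply means the model attribute is no longer None. ToyReducer and the error message wording are hypothetical.

```python
from typing import Any


class ToyReducer:
    """Hypothetical reducer showing a fitted-state guard."""

    def __init__(self):
        self.model = None

    def _require_fitted(self, method_name: str = "transform", model: Any = None) -> Any:
        # Fall back to self.model when no explicit backend is passed.
        model = self.model if model is None else model
        if model is None:
            raise RuntimeError(
                f"{type(self).__name__} must be fitted before calling {method_name}()."
            )
        return model


reducer = ToyReducer()
try:
    reducer._require_fitted("transform")
    raised = False
except RuntimeError:
    raised = True

reducer.model = {"fitted": True}  # pretend fit() populated the backend
validated = reducer._require_fitted("transform")
```

Returning the validated model lets a caller write `model = self._require_fitted("transform")` and use the result directly.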
- _merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) Dict[str, Any][source]¶
Return a capability mapping updated with reducer-specific overrides.
- Parameters:
base_caps (dict) – Base capability mapping, typically super().capabilities.
**overrides (dict) – Reducer-specific capability values to apply.
- Returns:
Capability mapping with overrides applied.
- Return type:
dict
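The merge can be pictured as a copy-then-update on a plain dictionary. In this sketch, merge_capabilities is a hypothetical stand-in, the base values are illustrative only, and returning a copy rather than mutating the base mapping is an assumption.

```python
from typing import Any, Dict


def merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for BaseReducer._merge_capabilities."""
    merged = dict(base_caps)  # assumption: the base mapping is left untouched
    merged.update(overrides)
    return merged


# Illustrative base values, not the library's actual defaults.
base = {
    "input_ndim": 2,
    "input_layout": "standard",
    "has_transform": True,
    "is_linear": False,
}
caps = merge_capabilities(base, is_linear=True, has_components=True)
```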
- abstract fit(X: ArrayLike, y: ArrayLike | None = None) BaseReducer[source]¶
Fit the model to the data.
- Parameters:
X (ArrayLike) – Training data. Most reducers expect (n_samples, n_features), but reducers with custom capabilities["input_layout"] may accept other layouts such as snapshot matrices or grouped trajectory tensors.
y (ArrayLike, optional) – Optional supervision aligned with the sample axis used by the reducer’s declared input layout.
- Returns:
self – The fitted reducer instance.
- Return type:
BaseReducer
Notes
Most reducers expect X to have shape (n_samples, n_features). Some reducers operate on alternative layouts and should document those layouts through capabilities.
- abstract transform(X: ArrayLike) numpy.ndarray[source]¶
Apply dimensionality reduction to X.
- Parameters:
X (ArrayLike) – New data to transform. Its layout should match the reducer’s declared capabilities.
- Returns:
X_new – Reduced representation. The exact output shape depends on the reducer, but the last dimension usually matches n_components.
- Return type:
np.ndarray
- Raises:
RuntimeError – Raised by concrete implementations when transform is called before fitting or when the reducer does not support out-of-sample transforms.
- fit_transform(X: ArrayLike, y: ArrayLike | None = None) numpy.ndarray[source]¶
Fit the model to data and return the transformed data.
This method usually calls fit and then transform, but reducers may override it for efficiency if the underlying algorithm supports a native combined path.
- Parameters:
X (ArrayLike) – Training data following the reducer’s declared layout.
y (ArrayLike, optional) – Optional supervision aligned with the reducer’s input layout.
- Returns:
X_new – Reduced representation returned by transform.
- Return type:
np.ndarray
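The default fit-then-transform path can be sketched with a toy abstract base class. Everything here is hypothetical: ToyBaseReducer only mirrors the documented method contract, FirstColumnsReducer is a deliberately trivial reducer, and plain nested lists stand in for arrays so the example needs only the standard library.

```python
from abc import ABC, abstractmethod


class ToyBaseReducer(ABC):
    """Hypothetical mirror of the documented reducer contract."""

    def __init__(self, n_components=2, **kwargs):
        self.n_components = n_components
        self.params = kwargs
        self.model = None

    @abstractmethod
    def fit(self, X, y=None):
        ...

    @abstractmethod
    def transform(self, X):
        ...

    def fit_transform(self, X, y=None):
        # Default path: fit returns self, so the calls can be chained.
        return self.fit(X, y=y).transform(X)


class FirstColumnsReducer(ToyBaseReducer):
    """Keeps the first n_components features of each sample."""

    def fit(self, X, y=None):
        self.model = {"n_features": len(X[0])}
        return self

    def transform(self, X):
        if self.model is None:
            raise RuntimeError("fit must be called before transform")
        return [row[: self.n_components] for row in X]


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
X_new = FirstColumnsReducer(n_components=2).fit_transform(X)
```

A reducer whose backend offers a native combined path would override fit_transform instead of relying on this chain.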
- save(filepath: str | os.PathLike) None[source]¶
Persist the reducer to a file.
- Parameters:
filepath (str or os.PathLike) – Path to the output file.
Notes
The default implementation serializes the reducer instance with joblib.dump. Custom reducers should either remain joblib-serializable or override this method and load with a custom persistence strategy.
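For a reducer that cannot stay joblib-serializable, one possible custom persistence strategy is to override both methods together. The sketch below uses JSON purely as an illustration; JsonPersistedReducer and its file layout are hypothetical and do not subclass the real BaseReducer.

```python
import json
import os
import tempfile


class JsonPersistedReducer:
    """Hypothetical reducer whose fitted state is JSON-friendly."""

    def __init__(self, n_components=2):
        self.n_components = n_components
        self.model = None

    def fit(self, X, y=None):
        n_features = len(X[0])
        # Fitted state: per-feature means stored as plain floats.
        means = [sum(row[j] for row in X) / len(X) for j in range(n_features)]
        self.model = {"mean": means}
        return self

    def save(self, filepath):
        payload = {"n_components": self.n_components, "model": self.model}
        with open(filepath, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)

    @classmethod
    def load(cls, filepath):
        with open(filepath, "r", encoding="utf-8") as fh:
            payload = json.load(fh)
        reducer = cls(n_components=payload["n_components"])
        reducer.model = payload["model"]
        return reducer


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "reducer.json")
    JsonPersistedReducer(n_components=1).fit([[1.0, 3.0], [3.0, 5.0]]).save(path)
    restored = JsonPersistedReducer.load(path)
```

Overriding save and load as a pair keeps the round trip symmetric: whatever one writes, the other must be able to read.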
- property capabilities: Dict[str, Any]¶
Return reducer capability flags consumed by the manager layer.
Custom reducers with nonstandard inputs should override at least input_ndim and input_layout. Reducers exposing diagnostics or scalar quality metadata should declare them explicitly through supported_diagnostics and supported_metadata.
- Returns:
Mapping of reducer capability flags.
- Return type:
dict
Notes
The default capabilities describe a typical estimator consuming (samples, features) input and exposing transform.
- _attribute_dict(obj: Any, attrs: Iterable[str]) Dict[str, Any][source]¶
Extract requested attributes from a target object into a dictionary.
This helper filters missing attributes and swallows common access errors (such as deferred scikit-learn properties) to return only what is currently available on the target.
- Parameters:
obj (Any) – Target object to inspect.
attrs (iterable of str) – Attribute names to attempt to extract.
- Returns:
Mapping of available attribute names to their values.
- Return type:
dict
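A hedged sketch of this kind of tolerant attribute extraction follows. attribute_dict and ToyModel are hypothetical, and the set of swallowed exceptions (AttributeError and NotImplementedError) is an assumption about what "common access errors" covers.

```python
from typing import Any, Dict, Iterable


def attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]:
    """Hypothetical stand-in for BaseReducer._attribute_dict."""
    found = {}
    for name in attrs:
        try:
            found[name] = getattr(obj, name)
        except (AttributeError, NotImplementedError):
            # Missing or not-yet-available attributes are skipped silently.
            continue
    return found


class ToyModel:
    """Hypothetical backend with one deferred property."""

    n_iter_ = 7

    @property
    def components_(self):
        # Mimics a property that is unavailable until some later step.
        raise AttributeError("components_ is not available yet")


available = attribute_dict(ToyModel(), ["n_iter_", "components_", "missing_"])
```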
- get_diagnostics() Dict[str, Any][source]¶
Return diagnostic arrays or structured artifacts.
Diagnostics are intended for non-scalar outputs such as explained variance curves, eigenvalues, modes, graphs, or training histories. Only names declared in capabilities["supported_diagnostics"] are queried.
- Returns:
diagnostics – Dictionary of diagnostic attributes declared in capabilities["supported_diagnostics"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
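The "only declared names are queried" rule can be illustrated as follows. ToyPCAModel and ToyDiagnosticReducer are hypothetical, the attribute values are made up for the example, and reading diagnostics directly from the fitted model is an assumption about where they live.

```python
class ToyPCAModel:
    """Hypothetical fitted backend exposing two array-like attributes."""

    explained_variance_ratio_ = [0.7, 0.2]
    singular_values_ = [3.0, 1.5]


class ToyDiagnosticReducer:
    """Hypothetical reducer declaring a single supported diagnostic."""

    def __init__(self):
        self.model = None

    @property
    def capabilities(self):
        return {"supported_diagnostics": ("explained_variance_ratio_",)}

    def fit(self, X, y=None):
        self.model = ToyPCAModel()
        return self

    def get_diagnostics(self):
        if self.model is None:
            raise RuntimeError("Reducer must be fitted before get_diagnostics().")
        names = self.capabilities.get("supported_diagnostics", ())
        # Undeclared attributes such as singular_values_ are never touched.
        return {n: getattr(self.model, n) for n in names if hasattr(self.model, n)}


diagnostics = ToyDiagnosticReducer().fit([[0.0, 1.0]]).get_diagnostics()
```

Declaring names explicitly keeps the manager layer from probing backend attributes it does not know how to report.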
- get_quality_metadata() Dict[str, Any][source]¶
Return scalar metadata about the reduction process or quality.
Typical examples include iteration counts, optimization stress, final loss values, or backend-specific convergence flags. Only names declared in capabilities["supported_metadata"] are queried.
- Returns:
metadata – Dictionary containing only scalar values corresponding to keys declared in capabilities["supported_metadata"].
- Return type:
dict
- Raises:
RuntimeError – If the reducer has not been fitted.
- get_components() numpy.ndarray[source]¶
Return reducer-defined component-like outputs.
- Returns:
Reducer-defined component array.
- Return type:
np.ndarray
- Raises:
ValueError – If the reducer does not expose public components.
- classmethod load(filepath: str | os.PathLike) BaseReducer[source]¶
Load a reducer from a file.
- Parameters:
filepath (str or os.PathLike) – Path to the file to load.
- Returns:
reducer – The loaded reducer instance.
- Return type:
BaseReducer
Notes
This method assumes the reducer was serialized with save or a compatible joblib.dump call.