coco_pipe.dim_reduction.reducers.base

Base interfaces for dimensionality reduction backends.

This module defines the reducer contract shared by built-in reducers and user-defined reducers. A reducer is any object derived from BaseReducer implementing fit and transform, optionally exposing diagnostics and scalar quality metadata through helper methods.

The surrounding dim-reduction stack uses these interfaces to provide:

  • input validation through the reducer capabilities mapping

  • standardized persistence with save and load

  • reducer-aware reporting and visualization hooks

  • optional dependency loading through coco_pipe.utils.import_optional_dependency

Notes

BaseReducer is the intended extension point for custom reducers. Third-party reducers can participate in DimReduction workflows without extra wrappers as long as they respect the method contract documented here.

Attributes

Classes

BaseReducer

Abstract base class for all dimensionality reduction implementations.

Module Contents

coco_pipe.dim_reduction.reducers.base.ArrayLike

class coco_pipe.dim_reduction.reducers.base.BaseReducer(n_components: int = 2, **kwargs)

Bases: abc.ABC

Abstract base class for all dimensionality reduction implementations.

This class defines the standard interface that all reducers must implement and is safe to subclass for custom reducers. It provides built-in support for model persistence (save/load) using joblib.

For custom reducers operating on nonstandard data layouts, override capabilities so the manager layer can route validation, scoring, plotting, and reporting correctly.

Parameters:
  • n_components (int, default=2) – Target dimensionality of the reduced representation.

  • **kwargs (dict) – Additional keyword arguments stored on params and typically forwarded to the wrapped estimator or backend implementation.

n_components

Target dimensionality of the reduced representation.

Type:

int

params

Additional reducer parameters captured at initialization time.

Type:

dict

model

Underlying fitted model object, such as a scikit-learn estimator or a scientific computing backend. This attribute should be populated by fit.

Type:

Any

Notes

The capabilities property returns a plain dictionary consumed by the manager and evaluation layers. Custom reducers should declare supported diagnostics and scalar metadata explicitly through this mapping. Common keys include:

  • input_ndim : expected dimensionality of the input container

  • input_layout : semantic layout name such as "standard"

  • has_transform : whether transform is supported

  • has_inverse_transform : whether inverse transforms are available

  • has_components : whether PCA-like components are exposed

  • supported_diagnostics : names returned by get_diagnostics

  • supported_metadata : names returned by get_quality_metadata

  • has_native_plot : whether the reducer exposes its own plotting path

  • is_linear : whether the reducer is linear

  • is_stochastic : whether repeated runs can vary without a fixed seed

Examples

>>> from sklearn.decomposition import PCA
>>> from coco_pipe.dim_reduction import BaseReducer
>>>
>>> class CustomPCAReducer(BaseReducer):
...     @property
...     def capabilities(self):
...         return self._merge_capabilities(
...             super().capabilities,
...             is_linear=True,
...             has_components=True,
...             supported_diagnostics=("explained_variance_ratio_",),
...         )
...
...     def fit(self, X, y=None):
...         self.model = PCA(n_components=self.n_components, **self.params)
...         self.model.fit(X)
...         return self
...
...     def transform(self, X):
...         return self.model.transform(X)

n_components = 2
params
model = None
context_: Dict[str, Any]
property name: str

Return a stable public display name for the reducer.

_filter_params(fn_or_class: Any, params: dict) -> dict

Filter parameters to match the signature of a function or class.

Parameters:
  • fn_or_class (Any) – The function or class to inspect.

  • params (dict) – The parameters to filter.

Returns:

filtered_params – Parameters present in the signature. If the target accepts **kwargs or its signature cannot be inspected, the original parameter dictionary is returned unchanged.

Return type:

dict

Notes

This is a convenience helper for reducer implementations that wrap third-party estimators with partially overlapping constructor signatures.
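
The filtering behavior described above can be sketched with the standard library's inspect.signature. This is a minimal illustration and not the library's implementation: filter_params, ToyEstimator, and accepts_anything are hypothetical names introduced here.

```python
import inspect
from typing import Any, Callable, Dict


def filter_params(fn_or_class: Callable[..., Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keys of ``params`` that ``fn_or_class`` accepts by name."""
    try:
        signature = inspect.signature(fn_or_class)
    except (TypeError, ValueError):
        # The signature cannot be inspected: return the parameters unchanged.
        return params

    parameters = list(signature.parameters.values())
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters):
        # The target accepts **kwargs, so nothing needs to be dropped.
        return params

    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {p.name for p in parameters if p.kind in keyword_kinds}
    return {key: value for key, value in params.items() if key in accepted}


class ToyEstimator:
    """Hypothetical backend with a narrow constructor signature."""

    def __init__(self, n_components: int = 2, whiten: bool = False):
        self.n_components = n_components
        self.whiten = whiten


def accepts_anything(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical target that takes arbitrary keyword arguments."""
    return kwargs


user_params = {"whiten": True, "perplexity": 30}
filtered = filter_params(ToyEstimator, user_params)         # drops "perplexity"
passthrough = filter_params(accepts_anything, user_params)  # returned unchanged
```

Dropping unknown keys silently keeps one parameter dictionary usable across backends, at the cost of hiding misspelled parameter names.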

_build_estimator(estimator_cls: Any, params: dict | None = None, component_param: str | None = 'n_components', **fixed_kwargs: Any) -> Any

Instantiate an estimator with filtered reducer parameters.

Parameters:
  • estimator_cls (Any) – Estimator class to instantiate.

  • params (dict, optional) – Explicit parameter dictionary to filter instead of self.params.

  • component_param (str or None, default="n_components") – Name of the constructor argument receiving self.n_components. Set to None to skip injecting the component count.

  • **fixed_kwargs (dict) – Keyword arguments always forwarded to the estimator constructor.

Returns:

Instantiated estimator.

Return type:

Any

Notes

This helper assumes the wrapped backend is constructor-driven and can be configured from keyword arguments.
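
A constructor-driven build step along these lines might look like the sketch below. SketchReducer and ToyEstimator are hypothetical stand-ins, and the precedence shown (filtered parameters first, then the component count, then fixed keyword arguments) is an assumption rather than documented behavior.

```python
import inspect
from typing import Any, Dict, Optional


class ToyEstimator:
    """Hypothetical backend configured entirely through its constructor."""

    def __init__(self, n_components: int = 2, whiten: bool = False, random_state: Any = None):
        self.n_components = n_components
        self.whiten = whiten
        self.random_state = random_state


class SketchReducer:
    """Stand-in for a reducer that stores n_components and extra params."""

    def __init__(self, n_components: int = 2, **kwargs: Any):
        self.n_components = n_components
        self.params = kwargs

    def build_estimator(
        self,
        estimator_cls: Any,
        params: Optional[Dict[str, Any]] = None,
        component_param: Optional[str] = "n_components",
        **fixed_kwargs: Any,
    ) -> Any:
        source = self.params if params is None else params
        accepted = inspect.signature(estimator_cls).parameters
        # Keep only the parameters the constructor knows about.
        kwargs = {key: value for key, value in source.items() if key in accepted}
        if component_param is not None:
            # Inject the target dimensionality under the backend's own name.
            kwargs[component_param] = self.n_components
        # Assumed precedence: fixed keyword arguments are applied last.
        kwargs.update(fixed_kwargs)
        return estimator_cls(**kwargs)


reducer = SketchReducer(n_components=3, whiten=True, perplexity=30)
estimator = reducer.build_estimator(ToyEstimator, random_state=0)
no_injection = reducer.build_estimator(ToyEstimator, component_param=None)
```

With component_param=None the backend keeps its own default, which suits estimators that name or infer the output dimensionality differently.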

_require_fitted(method_name: str = 'transform', model: Any = None) -> Any

Validate that a reducer backend has been fitted before access.

Parameters:
  • method_name (str, default="transform") – Operation requiring a fitted model.

  • model (Any, optional) – Backend model to check. Defaults to self.model.

Returns:

The validated model instance.

Return type:

Any

Raises:

RuntimeError – If no fitted model is available.
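
As an illustration of this guard, the following sketch assumes that an unfitted reducer is one whose model is still None. The class name and the error message are hypothetical.

```python
from typing import Any


class SketchReducer:
    """Stand-in reducer; ``model`` stays None until fit populates it."""

    def __init__(self) -> None:
        self.model: Any = None

    def require_fitted(self, method_name: str = "transform", model: Any = None) -> Any:
        # Fall back to the reducer's own backend when no model is passed in.
        candidate = self.model if model is None else model
        if candidate is None:
            raise RuntimeError(
                f"{type(self).__name__} must be fitted before calling '{method_name}'."
            )
        return candidate


reducer = SketchReducer()
try:
    reducer.require_fitted("transform")
    raised = False
except RuntimeError:
    raised = True

reducer.model = {"fitted": True}  # pretend fit has run
validated = reducer.require_fitted("transform")
```

Returning the validated model lets callers write the check and the access as one expression.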

_merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]

Return a capability mapping updated with reducer-specific overrides.

Parameters:
  • base_caps (dict) – Base capability mapping, typically super().capabilities.

  • **overrides (dict) – Reducer-specific capability values to apply.

Returns:

Capability mapping with overrides applied.

Return type:

dict
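
A merge of this kind can be sketched as a copy followed by an update. Whether the real helper copies the base mapping is an assumption, and the base values below are illustrative only.

```python
from typing import Any, Dict


def merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Return a new mapping with reducer-specific overrides applied."""
    merged = dict(base_caps)  # shallow copy so the base mapping is left untouched
    merged.update(overrides)
    return merged


# Illustrative base values, not the library's documented defaults.
base = {
    "input_ndim": 2,
    "input_layout": "standard",
    "has_transform": True,
    "is_linear": False,
}
caps = merge_capabilities(base, is_linear=True, has_components=True)
```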

abstract fit(X: ArrayLike, y: ArrayLike | None = None) -> BaseReducer

Fit the model to the data.

Parameters:
  • X (ArrayLike) – Training data. Most reducers expect (n_samples, n_features), but reducers with custom capabilities["input_layout"] may accept other layouts such as snapshot matrices or grouped trajectory tensors.

  • y (ArrayLike, optional) – Optional supervision aligned with the sample axis used by the reducer’s declared input layout.

Returns:

self – The fitted reducer instance.

Return type:

BaseReducer

Notes

Most reducers expect X to have shape (n_samples, n_features). Some reducers operate on alternative layouts and should document those layouts through capabilities.

abstract transform(X: ArrayLike) -> numpy.ndarray

Apply dimensionality reduction to X.

Parameters:

X (ArrayLike) – New data to transform. Its layout should match the reducer’s declared capabilities.

Returns:

X_new – Reduced representation. The exact output shape depends on the reducer, but the last dimension usually matches n_components.

Return type:

np.ndarray

Raises:

RuntimeError – Raised by concrete implementations when transform is called before fitting or when the reducer does not support out-of-sample transforms.

fit_transform(X: ArrayLike, y: ArrayLike | None = None) -> numpy.ndarray

Fit the model to data and return the transformed data.

This method usually calls fit and then transform, but reducers may override it for efficiency if the underlying algorithm supports a native combined path.

Parameters:
  • X (ArrayLike) – Training data following the reducer’s declared layout.

  • y (ArrayLike, optional) – Optional supervision aligned with the reducer’s input layout.

Returns:

X_new – Reduced representation returned by transform.

Return type:

np.ndarray
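
The fit-then-transform default can be sketched with a small abstract base class. SketchBase and FirstColumns are hypothetical, and plain Python lists stand in for NumPy arrays to keep the example dependency-free.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class SketchBase(ABC):
    """Stand-in for the reducer contract: fit, transform, fit_transform."""

    @abstractmethod
    def fit(self, X: Sequence[Sequence[float]], y: Optional[Sequence] = None) -> "SketchBase":
        ...

    @abstractmethod
    def transform(self, X: Sequence[Sequence[float]]) -> List[List[float]]:
        ...

    def fit_transform(self, X: Sequence[Sequence[float]], y: Optional[Sequence] = None) -> List[List[float]]:
        # Default path: fit returns self, so the two calls can be chained.
        return self.fit(X, y).transform(X)


class FirstColumns(SketchBase):
    """Toy reducer that keeps the first n_components columns of each row."""

    def __init__(self, n_components: int = 2) -> None:
        self.n_components = n_components
        self.fitted_ = False

    def fit(self, X, y=None):
        self.fitted_ = True
        return self

    def transform(self, X):
        if not self.fitted_:
            raise RuntimeError("FirstColumns must be fitted before transform.")
        return [list(row[: self.n_components]) for row in X]


X_new = FirstColumns(n_components=2).fit_transform([[1, 2, 3], [4, 5, 6]])
```

A subclass wrapping a backend with a native combined path would override fit_transform instead of relying on this chain.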

save(filepath: str | os.PathLike) -> None

Persist the reducer to a file.

The default implementation serializes the reducer instance with joblib.dump. Custom reducers should either remain joblib-serializable or override both this method and load with a custom persistence strategy.

Parameters:

filepath (str or os.PathLike) – Path to the output file.

property capabilities: Dict[str, Any]

Return reducer capability flags consumed by the manager layer.

Custom reducers with nonstandard inputs should override at least input_ndim and input_layout. Reducers exposing diagnostics or scalar quality metadata should declare them explicitly through supported_diagnostics and supported_metadata.

Returns:

Mapping of reducer capability flags.

Return type:

dict

Notes

The default capabilities describe a typical estimator consuming (samples, features) input and exposing transform.
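
To illustrate how a caller might consume this mapping for input validation, here is a hedged sketch. check_input_shape is a hypothetical helper, not part of the manager layer's documented API, and the default values shown are assumptions inferred from the description above.

```python
from typing import Any, Dict, Tuple

# Assumed defaults for a (samples, features) estimator that supports transform.
DEFAULT_CAPABILITIES: Dict[str, Any] = {
    "input_ndim": 2,
    "input_layout": "standard",
    "has_transform": True,
}


def check_input_shape(capabilities: Dict[str, Any], shape: Tuple[int, ...]) -> None:
    """Raise ValueError when the input rank differs from the declared input_ndim."""
    expected_ndim = capabilities.get("input_ndim")
    if expected_ndim is not None and len(shape) != expected_ndim:
        raise ValueError(
            f"Expected {expected_ndim}-dimensional input for layout "
            f"{capabilities.get('input_layout')!r}, got shape {shape}."
        )


check_input_shape(DEFAULT_CAPABILITIES, (100, 20))  # accepted silently

try:
    check_input_shape(DEFAULT_CAPABILITIES, (10, 100, 20))
    rejected = False
except ValueError:
    rejected = True
```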

_attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]

Extract requested attributes from a target object into a dictionary.

This helper filters missing attributes and swallows common access errors (such as deferred scikit-learn properties) to return only what is currently available on the target.

Parameters:
  • obj (Any) – Target object to inspect.

  • attrs (iterable of str) – Attribute names to attempt to extract.

Returns:

Mapping of available attribute names to their values.

Return type:

dict
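
The tolerant extraction described here can be sketched as follows. The set of swallowed exceptions is an assumption, and ToyModel is a hypothetical object whose components_ property fails on access.

```python
from typing import Any, Dict, Iterable


def attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]:
    """Collect the attributes that can currently be read from ``obj``."""
    found: Dict[str, Any] = {}
    for name in attrs:
        try:
            found[name] = getattr(obj, name)
        except (AttributeError, NotImplementedError, ValueError):
            # Missing or not-yet-available attribute: skip it silently.
            continue
    return found


class ToyModel:
    """Hypothetical fitted backend with one readable and one failing attribute."""

    explained_variance_ratio_ = [0.7, 0.2]

    @property
    def components_(self):
        raise AttributeError("components_ is not available for this configuration")


available = attribute_dict(
    ToyModel(), ["explained_variance_ratio_", "components_", "missing_"]
)
```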

get_diagnostics() -> Dict[str, Any]

Return diagnostic arrays or structured artifacts.

Diagnostics are intended for non-scalar outputs such as explained variance curves, eigenvalues, modes, graphs, or training histories. Only names declared in capabilities["supported_diagnostics"] are queried.

Returns:

diagnostics – Dictionary of diagnostic attributes declared in capabilities["supported_diagnostics"].

Return type:

dict

Raises:

RuntimeError – If the reducer has not been fitted.

get_quality_metadata() -> Dict[str, Any]

Return scalar metadata about the reduction process or quality.

Typical examples include iteration counts, optimization stress, final loss values, or backend-specific convergence flags. Only names declared in capabilities["supported_metadata"] are queried.

Returns:

metadata – Dictionary containing only scalar values corresponding to keys declared in capabilities["supported_metadata"].

Return type:

dict

Raises:

RuntimeError – If the reducer has not been fitted.
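
One way to read this contract is sketched below: only declared names are queried, and only scalar values are kept. The quality_metadata function, the ToyModel attributes, and the definition of "scalar" used here (numbers, booleans, and strings) are assumptions made for illustration.

```python
import numbers
from typing import Any, Dict


def quality_metadata(model: Any, capabilities: Dict[str, Any]) -> Dict[str, Any]:
    """Return declared metadata attributes whose values are scalars."""
    if model is None:
        raise RuntimeError("The reducer must be fitted before querying metadata.")
    metadata: Dict[str, Any] = {}
    for name in capabilities.get("supported_metadata", ()):
        value = getattr(model, name, None)
        # Assumed notion of scalar: numbers (including bool) and strings.
        if isinstance(value, (numbers.Number, str)):
            metadata[name] = value
    return metadata


class ToyModel:
    """Hypothetical fitted backend exposing scalar and non-scalar attributes."""

    n_iter_ = 12
    stress_ = 0.03
    embedding_ = [[0.1, 0.2], [0.3, 0.4]]


caps = {"supported_metadata": ("n_iter_", "stress_", "embedding_", "missing_")}
metadata = quality_metadata(ToyModel(), caps)
```

Non-scalar outputs such as embedding_ would belong in get_diagnostics instead.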

get_components() -> numpy.ndarray

Return reducer-defined component-like outputs.

Returns:

Reducer-defined component array.

Return type:

np.ndarray

Raises:

ValueError – If the reducer does not expose public components.

classmethod load(filepath: str | os.PathLike) -> BaseReducer

Load a reducer from a file.

Parameters:

filepath (str or os.PathLike) – Path to the file to load.

Returns:

reducer – The loaded reducer instance.

Return type:

BaseReducer

Notes

This method assumes the reducer was serialized with save or a compatible joblib.dump call.
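
For reducers that cannot be serialized with joblib, save and load can be overridden as a pair. The sketch below uses JSON as one possible custom persistence strategy. JsonPersistedReducer is a hypothetical standalone class (a real reducer would subclass BaseReducer), and its fitted state is assumed to be plain JSON-compatible data.

```python
import json
import os
import tempfile
from typing import Any, Union


class JsonPersistedReducer:
    """Hypothetical reducer whose state round-trips through a JSON file."""

    def __init__(self, n_components: int = 2, **kwargs: Any) -> None:
        self.n_components = n_components
        self.params = kwargs
        self.model: Any = None  # here: a dict of fitted values

    def save(self, filepath: Union[str, os.PathLike]) -> None:
        state = {
            "n_components": self.n_components,
            "params": self.params,
            "model": self.model,
        }
        with open(filepath, "w", encoding="utf-8") as handle:
            json.dump(state, handle)

    @classmethod
    def load(cls, filepath: Union[str, os.PathLike]) -> "JsonPersistedReducer":
        with open(filepath, "r", encoding="utf-8") as handle:
            state = json.load(handle)
        # Rebuild the instance from its constructor arguments, then restore state.
        reducer = cls(n_components=state["n_components"], **state["params"])
        reducer.model = state["model"]
        return reducer


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "reducer.json")
    original = JsonPersistedReducer(n_components=3, scale=True)
    original.model = {"means": [0.5, 1.5, 2.5]}  # pretend fit has run
    original.save(path)
    restored = JsonPersistedReducer.load(path)
```

Overriding both methods together keeps the on-disk format and the loader in agreement.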