coco_pipe.dim_reduction.reducers.base
=====================================

.. py:module:: coco_pipe.dim_reduction.reducers.base

.. autoapi-nested-parse::

   Base interfaces for dimensionality reduction backends.

   This module defines the reducer contract shared by built-in reducers and
   user-defined reducers. A reducer is any object derived from `BaseReducer`
   implementing `fit` and `transform`, optionally exposing diagnostics and
   scalar quality metadata through helper methods.

   The surrounding dim-reduction stack uses these interfaces to provide:

   - input validation through the reducer `capabilities` mapping
   - standardized persistence with `save` and `load`
   - reducer-aware reporting and visualization hooks
   - optional dependency loading through
     `coco_pipe.utils.import_optional_dependency`

   .. rubric:: Notes

   `BaseReducer` is the intended extension point for custom reducers.
   Third-party reducers can participate in `DimReduction` workflows without
   extra wrappers as long as they respect the method contract documented here.


Attributes
----------

.. autoapisummary::

   coco_pipe.dim_reduction.reducers.base.ArrayLike


Classes
-------

.. autoapisummary::

   coco_pipe.dim_reduction.reducers.base.BaseReducer


Module Contents
---------------

.. py:data:: ArrayLike

.. py:class:: BaseReducer(n_components: int = 2, **kwargs)

   Bases: :py:obj:`abc.ABC`

   Abstract base class for all dimensionality reduction implementations.

   This class defines the standard interface that all reducers must implement
   and is safe to subclass for custom reducers. It provides built-in support
   for model persistence (save/load) using joblib.

   For custom reducers operating on nonstandard data layouts, override
   `capabilities` so the manager layer can route validation, scoring,
   plotting, and reporting correctly.

   :param n_components: Target dimensionality of the reduced representation.
   :type n_components: int, default=2
   :param \*\*kwargs: Additional keyword arguments stored on `params` and
                      typically forwarded to the wrapped estimator or backend
                      implementation.
   :type \*\*kwargs: dict

   .. attribute:: n_components

      Target dimensionality of the reduced representation.

      :type: int

   .. attribute:: params

      Additional reducer parameters captured at initialization time.

      :type: dict

   .. attribute:: model

      Underlying fitted model object, such as a scikit-learn estimator or a
      scientific computing backend. This attribute should be populated by
      `fit`.

      :type: Any

   .. rubric:: Notes

   The `capabilities` property returns a plain dictionary consumed by the
   manager and evaluation layers. Custom reducers should declare supported
   diagnostics and scalar metadata explicitly through this mapping.

   Common keys include:

   - `input_ndim` : expected dimensionality of the input container
   - `input_layout` : semantic layout name such as `"standard"`
   - `has_transform` : whether `transform` is supported
   - `has_inverse_transform` : whether inverse transforms are available
   - `has_components` : whether PCA-like components are exposed
   - `supported_diagnostics` : names returned by `get_diagnostics`
   - `has_native_plot` : whether the reducer exposes its own plotting path
   - `is_linear` : whether the reducer is linear
   - `is_stochastic` : whether repeated runs can vary without a fixed seed

   .. rubric:: Examples

   >>> from sklearn.decomposition import PCA
   >>> from coco_pipe.dim_reduction import BaseReducer
   >>>
   >>> class CustomPCAReducer(BaseReducer):
   ...     @property
   ...     def capabilities(self):
   ...         return self._merge_capabilities(
   ...             super().capabilities,
   ...             is_linear=True,
   ...             has_components=True,
   ...             supported_diagnostics=("explained_variance_ratio_",),
   ...         )
   ...
   ...     def fit(self, X, y=None):
   ...         self.model = PCA(n_components=self.n_components, **self.params)
   ...         self.model.fit(X)
   ...         return self
   ...
   ...     def transform(self, X):
   ...         return self.model.transform(X)


   .. py:attribute:: n_components
      :value: 2


   .. py:attribute:: params


   .. py:attribute:: model
      :value: None


   .. py:attribute:: context_
      :type: Dict[str, Any]


   .. py:property:: name
      :type: str

      Return a stable public display name for the reducer.


   .. py:method:: _filter_params(fn_or_class: Any, params: dict) -> dict

      Filter parameters to match the signature of a function or class.

      :param fn_or_class: The function or class to inspect.
      :type fn_or_class: Any
      :param params: The parameters to filter.
      :type params: dict

      :returns: **filtered_params** -- Parameters present in the signature.
                If the target accepts ``**kwargs`` or its signature cannot be
                inspected, the original parameter dictionary is returned
                unchanged.
      :rtype: dict

      .. rubric:: Notes

      This is a convenience helper for reducer implementations that wrap
      third-party estimators with partially overlapping constructor
      signatures.


   .. py:method:: _build_estimator(estimator_cls: Any, params: Optional[dict] = None, component_param: Optional[str] = 'n_components', **fixed_kwargs: Any) -> Any

      Instantiate an estimator with filtered reducer parameters.

      :param estimator_cls: Estimator class to instantiate.
      :type estimator_cls: Any
      :param params: Explicit parameter dictionary to filter instead of
                     `self.params`.
      :type params: dict, optional
      :param component_param: Name of the constructor argument receiving
                              `self.n_components`. Set to ``None`` to skip
                              injecting the component count.
      :type component_param: str or None, default="n_components"
      :param \*\*fixed_kwargs: Keyword arguments always forwarded to the
                               estimator constructor.
      :type \*\*fixed_kwargs: dict

      :returns: Instantiated estimator.
      :rtype: Any

      .. rubric:: Notes

      This helper assumes the wrapped backend is constructor-driven and can be
      configured from keyword arguments.


   .. py:method:: _require_fitted(method_name: str = 'transform', model: Any = None) -> Any

      Validate that a reducer backend has been fitted before access.

      :param method_name: Operation requiring a fitted model.
      :type method_name: str, default="transform"
      :param model: Backend model to check. Defaults to `self.model`.
      :type model: Any, optional

      :returns: The validated model instance.
      :rtype: Any

      :raises RuntimeError: If no fitted model is available.


   .. py:method:: _merge_capabilities(base_caps: Dict[str, Any], **overrides: Any) -> Dict[str, Any]

      Return a capability mapping updated with reducer-specific overrides.

      :param base_caps: Base capability mapping, typically
                        `super().capabilities`.
      :type base_caps: dict
      :param \*\*overrides: Reducer-specific capability values to apply.
      :type \*\*overrides: dict

      :returns: Capability mapping with overrides applied.
      :rtype: dict


   .. py:method:: fit(X: ArrayLike, y: Optional[ArrayLike] = None) -> BaseReducer
      :abstractmethod:

      Fit the model to the data.

      :param X: Training data. Most reducers expect `(n_samples, n_features)`,
                but reducers with custom `capabilities["input_layout"]` may
                accept other layouts such as snapshot matrices or grouped
                trajectory tensors.
      :type X: ArrayLike
      :param y: Optional supervision aligned with the sample axis used by the
                reducer's declared input layout.
      :type y: ArrayLike, optional

      :returns: **self** -- The fitted reducer instance.
      :rtype: BaseReducer

      .. rubric:: Notes

      Most reducers expect `X` to have shape `(n_samples, n_features)`. Some
      reducers operate on alternative layouts and should document those
      layouts through `capabilities`.


   .. py:method:: transform(X: ArrayLike) -> numpy.ndarray
      :abstractmethod:

      Apply dimensionality reduction to X.

      :param X: New data to transform. Its layout should match the reducer's
                declared `capabilities`.
      :type X: ArrayLike

      :returns: **X_new** -- Reduced representation. The exact output shape
                depends on the reducer, but the last dimension usually matches
                `n_components`.
      :rtype: np.ndarray

      :raises RuntimeError: Raised by concrete implementations when
                            `transform` is called before fitting or when the
                            reducer does not support out-of-sample transforms.


   .. py:method:: fit_transform(X: ArrayLike, y: Optional[ArrayLike] = None) -> numpy.ndarray

      Fit the model to data and return the transformed data.

      This method usually calls `fit` and then `transform`, but reducers may
      override it for efficiency if the underlying algorithm supports a native
      combined path.

      :param X: Training data following the reducer's declared layout.
      :type X: ArrayLike
      :param y: Optional supervision aligned with the reducer's input layout.
      :type y: ArrayLike, optional

      :returns: **X_new** -- Reduced representation returned by `transform`.
      :rtype: np.ndarray


   .. py:method:: save(filepath: Union[str, os.PathLike]) -> None

      Persist the reducer to a file.

      The default implementation serializes the reducer instance with joblib.
      Custom reducers should either remain joblib-serializable or override
      this method and `load()` with a custom persistence strategy.

      :param filepath: Path to the output file.
      :type filepath: str or Path

      .. rubric:: Notes

      The default implementation serializes the reducer instance with
      `joblib.dump`. Custom reducers should either remain joblib-serializable
      or override this method and `load` with a custom persistence strategy.


   .. py:property:: capabilities
      :type: Dict[str, Any]

      Return reducer capability flags consumed by the manager layer.

      Custom reducers with nonstandard inputs should override at least
      `input_ndim` and `input_layout`. Reducers exposing diagnostics or scalar
      quality metadata should declare them explicitly through
      `supported_diagnostics` and `supported_metadata`.

      :returns: Mapping of reducer capability flags.
      :rtype: dict

      .. rubric:: Notes

      The default capabilities describe a typical estimator consuming
      `(samples, features)` input and exposing `transform`.


   .. py:method:: _attribute_dict(obj: Any, attrs: Iterable[str]) -> Dict[str, Any]

      Extract requested attributes from a target object into a dictionary.
      This helper filters missing attributes and swallows common access errors
      (such as deferred scikit-learn properties) to return only what is
      currently available on the target.

      :param obj: Target object to inspect.
      :type obj: Any
      :param attrs: Attribute names to attempt to extract.
      :type attrs: iterable of str

      :returns: Mapping of available attribute names to their values.
      :rtype: dict


   .. py:method:: get_diagnostics() -> Dict[str, Any]

      Return diagnostic arrays or structured artifacts.

      Diagnostics are intended for non-scalar outputs such as explained
      variance curves, eigenvalues, modes, graphs, or training histories. Only
      names declared in `capabilities["supported_diagnostics"]` are queried.

      :returns: **diagnostics** -- Dictionary of diagnostic attributes
                declared in `capabilities["supported_diagnostics"]`.
      :rtype: dict

      :raises RuntimeError: If the reducer has not been fitted.


   .. py:method:: get_quality_metadata() -> Dict[str, Any]

      Return scalar metadata about the reduction process or quality.

      Typical examples include iteration counts, optimization stress, final
      loss values, or backend-specific convergence flags. Only names declared
      in `capabilities["supported_metadata"]` are queried.

      :returns: **metadata** -- Dictionary containing only scalar values
                corresponding to keys declared in
                `capabilities["supported_metadata"]`.
      :rtype: dict

      :raises RuntimeError: If the reducer has not been fitted.


   .. py:method:: get_components() -> numpy.ndarray

      Return reducer-defined component-like outputs.

      :returns: Reducer-defined component array.
      :rtype: np.ndarray

      :raises ValueError: If the reducer does not expose public components.


   .. py:method:: load(filepath: Union[str, os.PathLike]) -> BaseReducer
      :classmethod:

      Load a reducer from a file.

      :param filepath: Path to the file to load.
      :type filepath: str or Path

      :returns: **reducer** -- The loaded reducer instance.
      :rtype: BaseReducer

      .. rubric:: Notes

      This method assumes the reducer was serialized with `save` or a
      compatible `joblib.dump` call.
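
A reducer that cannot remain joblib-serializable is expected to override both `save` and `load` with its own persistence strategy. The sketch below illustrates one possible shape of such an override and is not the library's implementation. `JsonReducer`, its JSON state layout, and the toy mean-centering "model" are all hypothetical. A real reducer would subclass `BaseReducer`, which is omitted here so the snippet runs on the standard library alone.

```python
import json
import os
import tempfile


class JsonReducer:
    """Hypothetical stand-in for a BaseReducer subclass (not part of coco_pipe)."""

    def __init__(self, n_components=2, **kwargs):
        self.n_components = n_components
        self.params = kwargs
        self.model = None  # populated by fit

    def fit(self, X, y=None):
        # Assumption for illustration: the "model" is the per-column mean of X.
        self.model = [sum(col) / len(X) for col in zip(*X)]
        return self

    def transform(self, X):
        if self.model is None:
            raise RuntimeError("transform requires a fitted model")
        # Center each row, then keep the first n_components columns.
        return [
            [v - m for v, m in zip(row, self.model)][: self.n_components]
            for row in X
        ]

    def save(self, filepath):
        # Custom persistence: write plain JSON state instead of a joblib dump.
        state = {
            "n_components": self.n_components,
            "params": self.params,
            "model": self.model,
        }
        with open(filepath, "w", encoding="utf-8") as handle:
            json.dump(state, handle)

    @classmethod
    def load(cls, filepath):
        # Rebuild the reducer from the JSON state written by save.
        with open(filepath, "r", encoding="utf-8") as handle:
            state = json.load(handle)
        reducer = cls(n_components=state["n_components"], **state["params"])
        reducer.model = state["model"]
        return reducer


X = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
reducer = JsonReducer(n_components=2).fit(X)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "reducer.json")
    reducer.save(path)
    restored = JsonReducer.load(path)

# The restored reducer should reproduce the original projection.
round_trip_ok = restored.transform(X) == reducer.transform(X)
```

Overriding the two methods as a pair keeps the on-disk format private to the reducer: callers only ever see `save(filepath)` and `load(filepath)`, which matches the signatures documented above.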