coco_pipe.decoding
==================

.. py:module:: coco_pipe.decoding


Submodules
----------

.. toctree::
   :maxdepth: 1

   /autoapi/coco_pipe/decoding/configs/index
   /autoapi/coco_pipe/decoding/core/index
   /autoapi/coco_pipe/decoding/registry/index
   /autoapi/coco_pipe/decoding/utils/index


Classes
-------

.. autoapisummary::

   coco_pipe.decoding.ExperimentConfig
   coco_pipe.decoding.Experiment


Functions
---------

.. autoapisummary::

   coco_pipe.decoding.get_estimator_cls
   coco_pipe.decoding.register_estimator
   coco_pipe.decoding.cross_validate_score


Package Contents
----------------

.. py:class:: ExperimentConfig(/, **data: Any)

   Bases: :py:obj:`pydantic.BaseModel`


   Master configuration for a Decoding Experiment.


   .. py:attribute:: task
      :type:  Literal['classification', 'regression']
      :value: 'classification'


   .. py:attribute:: output_dir
      :type:  Optional[pathlib.Path]
      :value: None


   .. py:attribute:: tag
      :type:  str
      :value: 'experiment'


   .. py:attribute:: models
      :type:  Dict[str, EstimatorConfigType]


   .. py:attribute:: grids
      :type:  Optional[Dict[str, Dict[str, List[Any]]]]
      :value: None


   .. py:attribute:: cv
      :type:  CVConfig
      :value: None


   .. py:attribute:: tuning
      :type:  TuningConfig
      :value: None


   .. py:attribute:: feature_selection
      :type:  FeatureSelectionConfig
      :value: None


   .. py:attribute:: metrics
      :type:  List[str]
      :value: None


   .. py:attribute:: temporal
      :type:  TemporalConfig
      :value: None


   .. py:attribute:: use_scaler
      :type:  bool
      :value: None


   .. py:attribute:: n_jobs
      :type:  int
      :value: -1


   .. py:attribute:: verbose
      :type:  bool
      :value: True


.. py:class:: Experiment(config: coco_pipe.decoding.configs.ExperimentConfig)

   Main executor for decoding experiments.

   :param config: The complete configuration for the experiment.
   :type config: ExperimentConfig


   .. py:attribute:: config


   .. py:attribute:: results
      :type:  Dict[str, Any]


   .. py:method:: _validate_config()

      Perform comprehensive runtime validation of the configuration.

      .. rubric:: Logic

      1. **Tuning Consistency**: Warns if ``tuning.enabled`` is set but no ``grids``
         are provided.
      2. **Task vs Metrics**: Checks that the metrics match the task (e.g. no
         'accuracy' for regression). Raises ValueError if incompatible.
      3. **Task vs CV**: Checks that the CV strategy matches the task (e.g. no
         'stratified' for regression). Raises ValueError if incompatible.
      4. **Task vs Model**: Heuristic check of the model type (e.g. no Regressor for
         Classification). Raises ValueError if incompatible.

      :raises ValueError: If configuration contains incompatible settings.


   .. py:method:: _prepare_estimator(model_name: str, model_config: Any) -> sklearn.base.BaseEstimator

      Orchestrate the creation of the full Estimator Pipeline.

      .. rubric:: Steps

      1. **Instantiation**: Calls ``_instantiate_model`` to get the base estimator
         (handling recursion).
      2. **Scaling**: If ``use_scaler=True``, prepends a StandardScaler.
      3. **Feature Selection**: If enabled, prepends the FS step (Filter or Wrapper).
      4. **Pipeline Construction**: Wraps the steps in ``sklearn.pipeline.Pipeline``.
         Enables caching if FS and Tuning are both active.
      5. **Tuning Wrapper**: If tuning is enabled for this model, wraps the Pipeline
         in GridSearchCV/RandomizedSearchCV via ``_wrap_with_tuning``.

      :param model_name: Friendly name from config (used for grid lookup).
      :type model_name: str
      :param model_config: Pydantic configuration object for the model.
      :type model_config: Any

      :returns: Final ready-to-run estimator (Pipeline or SearchCV).
      :rtype: BaseEstimator


   .. py:method:: _instantiate_model(name: str, config: Any) -> sklearn.base.BaseEstimator

      Instantiate a raw estimator from its configuration object.

      .. rubric:: Logic

      1. **Registry Lookup**: Resolves the class from ``config.method``.
      2. **Recursion**: If the config implies a meta-estimator (has ``base_estimator``),
         recursively calls ``_prepare_estimator`` for the child.
      3. **Parameter Injection**: Passes the config fields as kwargs to ``__init__``.
         If a ``TypeError`` occurs, invalid parameters are filtered out automatically
         (robustness against mismatched config/class versions).

      :returns: The instantiated model (e.g., LogisticRegression instance) without pipeline
                wrappers.
      :rtype: BaseEstimator


   .. py:method:: _create_fs_step(estimator: sklearn.base.BaseEstimator) -> Optional[tuple]

      Create a Feature Selection step for the pipeline.

      .. rubric:: Logic

      - **Filter (k_best)**: Fast. Features are selected before the classifier is
        trained, based on a statistical test. No inner CV loop is required.
      - **Wrapper (sfs)**: Slower, but evaluates feature subsets with the estimator
        itself. Wraps the estimator in a SequentialFeatureSelector, which runs an
        **Inner CV Loop** (size = ``config.feature_selection.cv``) to validate
        feature subsets.

      If used inside Hyperparameter Tuning, this step is part of the Pipeline,
      so features are re-selected for every fold and every parameter
      combination (Nested Simplification).

      :returns: ("fs", Transformer) step for sklearn Pipeline.
      :rtype: tuple or None


   .. py:method:: _wrap_with_tuning(estimator: sklearn.base.BaseEstimator, name: str) -> sklearn.base.BaseEstimator

      Wrap the estimator (or pipeline) in a Hyperparameter Search object.

      This implements **Nested Cross-Validation** (Middle Loop):

      1. **Input**: A Pipeline (Scaler + FS + Classifier).
      2. **Search**: Creates a GridSearchCV / RandomizedSearchCV.
      3. **Process**:

         - For each fold of the *tuning* CV (defined by ``config.cv``):

           - Train the Pipeline (including FS!) on the tuning train set.
           - Evaluate on the tuning validation set.

         - Find the best (Hyperparameters + Features) combination.
         - Refit on the entire training set provided by the Outer Loop.

      This ensures simultaneous optimization of Preprocessing (FS) and Modeling
      parameters.


   .. py:method:: run(X: Union[pandas.DataFrame, numpy.ndarray], y: Union[pandas.Series, numpy.ndarray], groups: Optional[Union[pandas.Series, numpy.ndarray]] = None) -> ExperimentResult

      Execute the full experiment pipeline.

      This is the main entry point. It orchestrates:

      1. **Data Validation**: Checks input shapes and types.
      2. **Model Loop**: Iterates through all configured models.
      3. **Preparation**: Instantiates models -> Builds Pipelines (Scaler/FS) ->
         Wraps in Tuning.
      4. **Validation**: Runs the Outer Cross-Validation loop (optionally
         parallelized).
      5. **Aggregation**: Collects scores, predictions, and importances.

      :param X: Training data (2D) or Time-Series data (3D).
      :type X: array-like of shape (n_samples, n_features) or (n_samples, n_features, n_times)
      :param y: Target labels or values.
      :type y: array-like of shape (n_samples,) or (n_samples, n_targets)
      :param groups: Group labels for splitting (e.g., subject-specific splits).
      :type groups: array-like of shape (n_samples,), optional

      :returns: Object containing results with methods to export to Tidy DataFrames.
      :rtype: ExperimentResult


   .. py:method:: save_results(path: Optional[Union[str, pathlib.Path]] = None)

      Serialize results, configuration, and metadata to disk.

      :param path: Path to save the results. If None, uses config.output_dir.
                   If both are None, raises ValueError.
      :type path: str or Path, optional


   .. py:method:: load_results(path: Union[str, pathlib.Path]) -> ExperimentResult
      :staticmethod:


      Load a saved experiment payload and wrap it in ExperimentResult.

      :param path: Path to a previously saved experiment payload.
      :type path: str or Path

      :returns: The loaded results wrapper.
      :rtype: ExperimentResult


   .. py:method:: _cross_validate(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, groups: Optional[numpy.ndarray]) -> Dict[str, Any]

      Execute the Outer Cross-Validation Loop (Evaluation).

      This is the **Level 1 (top-level)** split:

      - Splits the entire dataset into K folds (defined by ``config.cv``).
      - For each fold:

        1. **Training Data**: 80% (if 5-fold). Passed to ``estimator.fit()``.
           If ``estimator`` is a GridSearch (Tuning Enabled), it internally splits
           this 80% again for validation (Level 2 Split).
        2. **Test Data**: 20%. Used strictly for the final ``estimator.predict()``
           evaluation.

      .. rubric:: Parallelization

      If ``config.n_jobs > 1``, these folds run in parallel processes to speed up
      execution.


   .. py:method:: _fit_and_score_fold(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, train_idx: numpy.ndarray, test_idx: numpy.ndarray) -> Dict[str, Any]

      Execute a single Cross-Validation fold: Fit, Predict, and Score.

      Optimized for:

      - **Standard Estimators**: (N, F) input -> (N,) output.
      - **Sliding Estimators**: (N, F, T) input -> (N, T) output (Diagonal Decoding).

      :returns: Contains 'test_idx', 'preds' (y_pred, y_true, y_proba),
                'scores' (dict of metric values), and 'importance'.
      :rtype: dict


   .. py:method:: _extract_metadata(estimator: sklearn.base.BaseEstimator) -> Dict[str, Any]
      :staticmethod:


      Extract training metadata like best Hyperparameters and Selected Features.


   .. py:method:: _compute_metric_safe(scorer, y_true, y_est, is_multiclass, is_proba=False)
      :staticmethod:


      Compute a metric, handling standard and temporal (diagonal) shapes.

      .. rubric:: Shapes Handled

      - **Standard**: y_est is (N,) or (N, C)
      - **Generalizing (Matrix)**:

        - y_pred: (N, T_train, T_test) -> Score each (T_train, T_test) pair.
        - y_proba: (N, C, T_train, T_test) -> Score each (T_train, T_test) pair.


   .. py:method:: _force_serial_execution(estimator: sklearn.base.BaseEstimator) -> sklearn.base.BaseEstimator

      Recursively set n_jobs=1 for the estimator and its sub-components.
      Used when the outer loop is already parallelized to avoid oversubscription.


   .. py:method:: _extract_feature_importances(estimator: sklearn.base.BaseEstimator) -> Optional[numpy.ndarray]
      :staticmethod:


      Extract feature importances or coefficients from a fitted estimator.
      Handles Pipelines and Feature Selection.

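The two split levels described for ``Experiment._cross_validate`` and
``Experiment._wrap_with_tuning`` can be illustrated with plain index arithmetic.
The following is a minimal sketch, not the package's implementation: the helpers
``k_fold_indices`` and ``nested_splits`` are hypothetical, the folds are
contiguous and unshuffled, and the inner fold count of 4 is an arbitrary choice
for the example.

```python
from typing import Dict, Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold_indices(indices: List[int], n_splits: int) -> Iterator[Split]:
    """Yield (train, test) index lists for contiguous, unshuffled K folds."""
    n = len(indices)
    # The first n % n_splits folds get one extra sample, so fold sizes
    # differ by at most one.
    sizes = [n // n_splits + (1 if i < n % n_splits else 0)
             for i in range(n_splits)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def nested_splits(n_samples: int, outer_k: int, inner_k: int) -> List[Dict]:
    """Return, per outer fold, its train/test indices and the inner splits."""
    plan = []
    for outer_train, outer_test in k_fold_indices(list(range(n_samples)), outer_k):
        # Level 2: only the outer training indices are split again, so the
        # outer test indices never take part in tuning.
        inner = list(k_fold_indices(outer_train, inner_k))
        plan.append({"train": outer_train, "test": outer_test, "inner": inner})
    return plan


plan = nested_splits(n_samples=100, outer_k=5, inner_k=4)
first = plan[0]
print(len(first["train"]), len(first["test"]))  # 80 20
print(len(first["inner"][0][0]), len(first["inner"][0][1]))  # 60 20
```

With 5 outer folds, each outer training set holds 80% of the samples, and every
inner split is drawn from that 80% only, which is what keeps the outer test fold
untouched during tuning.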

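The parameter filtering described for ``Experiment._instantiate_model`` (retry
without the rejected keyword arguments after a ``TypeError``) might look roughly
like the sketch below. This is an assumption about the general technique, not the
package's code: ``instantiate_with_fallback`` and ``ToyClassifier`` are
hypothetical names, and the filter relies on ``inspect.signature``.

```python
import inspect
from typing import Any, Dict, Type


def instantiate_with_fallback(cls: Type, params: Dict[str, Any]) -> Any:
    """Build ``cls(**params)``, dropping unknown keyword names on TypeError."""
    try:
        return cls(**params)
    except TypeError:
        accepted = inspect.signature(cls).parameters
        # If the constructor takes **kwargs, filtering cannot help; re-raise.
        if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in accepted.values()):
            raise
        filtered = {k: v for k, v in params.items() if k in accepted}
        return cls(**filtered)


class ToyClassifier:
    """Hypothetical estimator used only for this illustration."""

    def __init__(self, C: float = 1.0, max_iter: int = 100):
        self.C = C
        self.max_iter = max_iter


# "method" stands in for a config-only field that the class does not accept.
model = instantiate_with_fallback(ToyClassifier, {"method": "toy", "C": 0.5})
print(model.C, model.max_iter)  # 0.5 100
```

Filtering only after a failure keeps the common path cheap and leaves
constructors that validate their own arguments free to raise as usual.
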
.. py:function:: get_estimator_cls(name: str) -> Type

   Retrieve an estimator class by name.

   :param name: Name of the estimator.
   :type name: str

   :returns: The class object.
   :rtype: Type

   :raises ValueError: If name is not found.


.. py:function:: register_estimator(name: str) -> Callable[[Type], Type]

   Decorator to register an estimator class under a specific name.

   :param name: The unique alias for the estimator (e.g., "RandomForestClassifier").
   :type name: str

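A decorator-based registry matching the two signatures above can be sketched as
follows. This is a toy stand-in under stated assumptions, not the code in
``coco_pipe.decoding.registry``: the ``_REGISTRY`` dictionary, the overwrite
behaviour on duplicate names, and ``MajorityClassifier`` are all hypothetical.

```python
from typing import Callable, Dict, Type

# Hypothetical module-level store mapping alias -> class.
_REGISTRY: Dict[str, Type] = {}


def register_estimator(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that stores the class under ``name``."""
    def decorator(cls: Type) -> Type:
        # Assumption: a later registration silently replaces an earlier one.
        _REGISTRY[name] = cls
        return cls
    return decorator


def get_estimator_cls(name: str) -> Type:
    """Look up a registered class, raising ValueError for unknown names."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = sorted(_REGISTRY)
        raise ValueError(f"Unknown estimator {name!r}; known: {known}") from None


@register_estimator("MajorityClassifier")
class MajorityClassifier:
    """Hypothetical estimator used only for this illustration."""


print(get_estimator_cls("MajorityClassifier").__name__)  # MajorityClassifier
```

Because the decorator returns the class it receives, the decorated class can
still be imported and used directly, independent of the registry lookup.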

.. py:function:: cross_validate_score(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: Sequence, *, groups: Optional[Sequence] = None, cv_config: Optional[coco_pipe.decoding.configs.CVConfig] = None, metric: str = 'balanced_accuracy', use_scaler: bool = False) -> float

   Compute one mean cross-validated score for an estimator.

   :param estimator: Estimator to fit inside each fold.
   :type estimator: BaseEstimator
   :param X: Input features with shape ``(n_samples, n_features)``.
   :type X: np.ndarray
   :param y: Target labels aligned with ``X``.
   :type y: sequence
   :param groups: Group labels aligned with ``X``.
   :type groups: sequence, optional
   :param cv_config: Cross-validation configuration. Defaults to a 5-fold stratified
                     strategy, or 5-fold stratified-group strategy when groups are
                     provided.
   :type cv_config: CVConfig, optional
   :param metric: Metric name resolved through :func:`get_scorer`.
   :type metric: str, default="balanced_accuracy"
   :param use_scaler: When ``True``, wraps the estimator in a ``StandardScaler`` pipeline.
   :type use_scaler: bool, default=False

   :returns: Mean cross-validated score.
   :rtype: float
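
The returned value is a mean over folds of the chosen metric. As a worked
illustration, the sketch below assumes that ``balanced_accuracy`` carries its
usual scikit-learn meaning (the unweighted mean of per-class recall) and uses two
hypothetical folds with made-up labels; ``balanced_accuracy`` here is a local
helper, not a function from this package.

```python
from collections import defaultdict
from statistics import mean
from typing import Sequence


def balanced_accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Mean of per-class recall, so every class weighs the same."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return mean(hits[c] / totals[c] for c in totals)


# Two hypothetical folds: (true labels, predicted labels).
folds = [
    ([0, 0, 0, 1], [0, 0, 0, 0]),  # recalls: class 0 = 1.0, class 1 = 0.0
    ([0, 0, 1, 1], [0, 1, 1, 1]),  # recalls: class 0 = 0.5, class 1 = 1.0
]
fold_scores = [balanced_accuracy(t, p) for t, p in folds]
print(fold_scores)        # [0.5, 0.75]
print(mean(fold_scores))  # 0.625
```

In the first fold plain accuracy would be 0.75, while balanced accuracy is 0.5,
which shows why this metric is a reasonable default for imbalanced labels.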