coco_pipe.decoding

Submodules

Classes

ExperimentConfig

Master configuration for a Decoding Experiment.

Experiment

Main executor for decoding experiments.

Functions

get_estimator_cls(→ Type)

Retrieve an estimator class by name.

register_estimator(→ Callable[[Type], Type])

Decorator to register an estimator class under a specific name.

cross_validate_score(→ float)

Compute one mean cross-validated score for an estimator.

Package Contents

class coco_pipe.decoding.ExperimentConfig(/, **data: Any)[source]

Bases: pydantic.BaseModel

Master configuration for a Decoding Experiment.

task: Literal['classification', 'regression'] = 'classification'
output_dir: pathlib.Path | None = None
tag: str = 'experiment'
models: Dict[str, EstimatorConfigType]
grids: Dict[str, Dict[str, List[Any]]] | None = None
cv: CVConfig = None
tuning: TuningConfig = None
feature_selection: FeatureSelectionConfig = None
metrics: List[str] = None
temporal: TemporalConfig = None
use_scaler: bool = None
n_jobs: int = -1
verbose: bool = True
class coco_pipe.decoding.Experiment(config: coco_pipe.decoding.configs.ExperimentConfig)[source]

Main executor for decoding experiments.

Parameters:

config (ExperimentConfig) – The complete configuration for the experiment.

config
results: Dict[str, Any]
_validate_config()[source]

Perform comprehensive runtime validation of the configuration.

Logic

  1. Tuning Consistency: Warns if tuning.enabled but no grids are provided.

  2. Task vs Metrics: Checks if metrics match the task (e.g. no ‘accuracy’ for regression). Raises ValueError if incompatible.

  3. Task vs CV: Checks if CV strategy matches task (e.g. no ‘stratified’ for regression). Raises ValueError if incompatible.

  4. Task vs Model: Heuristic check for model type (e.g. no Regressor for Classification). Raises ValueError if incompatible.

Raises:

ValueError – If the configuration contains incompatible settings.
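The first three checks above can be pictured with a small standalone sketch. It is not the library's implementation: the metric groups, the `validate` and `is_valid` helpers, and the "kfold" strategy name are hypothetical, and the heuristic model check (step 4) is left out.

```python
import warnings
from typing import Dict, List, Optional

# Illustrative metric groups only; the lists used by coco_pipe may differ.
CLASSIFICATION_METRICS = {"accuracy", "balanced_accuracy", "f1", "roc_auc"}
REGRESSION_METRICS = {"r2", "neg_mean_squared_error"}


def validate(task: str, metrics: List[str], cv_strategy: str,
             tuning_enabled: bool = False,
             grids: Optional[Dict[str, dict]] = None) -> None:
    """Warn or raise on the inconsistencies described in the list above."""
    # 1. Tuning consistency: a warning, not an error.
    if tuning_enabled and not grids:
        warnings.warn("tuning is enabled but no grids were provided")
    # 2. Task vs metrics: reject metrics that belong to the other task.
    wrong_task = REGRESSION_METRICS if task == "classification" else CLASSIFICATION_METRICS
    bad = sorted(set(metrics) & wrong_task)
    if bad:
        raise ValueError(f"metrics {bad} are incompatible with task {task!r}")
    # 3. Task vs CV: stratification needs class labels.
    if task == "regression" and "stratified" in cv_strategy:
        raise ValueError(f"CV strategy {cv_strategy!r} is incompatible with regression")


def is_valid(*args, **kwargs) -> bool:
    """Convenience wrapper that turns the ValueError into a boolean."""
    try:
        validate(*args, **kwargs)
    except ValueError:
        return False
    return True


ok = is_valid("classification", ["accuracy"], "stratified")
```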

_prepare_estimator(model_name: str, model_config: Any) → sklearn.base.BaseEstimator[source]

Orchestrate the creation of the full Estimator Pipeline.

Steps

  1. Instantiation: Calls _instantiate_model to get the base estimator (handling recursion).

  2. Scaling: If use_scaler=True, prepends a StandardScaler.

  3. Feature Selection: If enabled, prepends the FS step (Filter or Wrapper).

  4. Pipeline Construction: Wraps the steps in sklearn.pipeline.Pipeline. Caching is enabled if FS and Tuning are both active.

  5. Tuning Wrapper: If tuning is enabled for this model, wraps the Pipeline in GridSearchCV/RandomizedSearchCV via _wrap_with_tuning.

Parameters:
  • model_name (str) – Friendly name from config (used for grid lookup).

  • model_config (Any) – Pydantic configuration object for the model.

Returns:

Final ready-to-run estimator (Pipeline or SearchCV).

Return type:

BaseEstimator
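A minimal sketch of steps 1 to 4, written against plain scikit-learn. The `build_pipeline` helper, the "scaler" and "clf" step names, and the choice of SelectKBest and LogisticRegression are assumptions made for illustration; only the "fs" step name and the Scaler + FS + Classifier order come from this page.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(use_scaler: bool, use_fs: bool) -> Pipeline:
    """Assemble steps in the order scaler -> feature selection -> estimator."""
    steps = []
    if use_scaler:
        steps.append(("scaler", StandardScaler()))
    if use_fs:
        steps.append(("fs", SelectKBest(score_func=f_classif, k=5)))
    # Stand-in for the estimator that the library would resolve from its config.
    steps.append(("clf", LogisticRegression(max_iter=1000)))
    return Pipeline(steps)


pipeline = build_pipeline(use_scaler=True, use_fs=True)
X, y = make_classification(n_samples=120, n_features=20, n_informative=5, random_state=0)
pipeline.fit(X, y)
step_names = [name for name, _ in pipeline.steps]
```

scikit-learn's Pipeline also accepts a `memory` argument that caches fitted transformers; whether the caching mentioned in step 4 relies on it is not stated here.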

_instantiate_model(name: str, config: Any) → sklearn.base.BaseEstimator[source]

Instantiate a raw estimator from its configuration object.

Logic

  1. Registry Lookup: Resolves class from config.method.

  2. Recursion: If config implies a meta-estimator (has base_estimator), recursively calls _prepare_estimator for the child.

  3. Parameter Injection: Passes config fields as kwargs to __init__. If a TypeError occurs, invalid parameters are filtered out automatically (robustness for mismatched config/class versions).

Returns:

The instantiated model (e.g., LogisticRegression instance) without pipeline wrappers.

Return type:

BaseEstimator
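The parameter-injection fallback in step 3 could look roughly like the sketch below. The `instantiate` helper and the `ToyModel` class are hypothetical, and filtering by constructor signature is only one plausible way to drop invalid parameters after a TypeError.

```python
import inspect
from typing import Any, Dict, Type


def instantiate(cls: Type, params: Dict[str, Any]) -> Any:
    """Try the full kwargs first; on TypeError keep only accepted names."""
    try:
        return cls(**params)
    except TypeError:
        accepted = inspect.signature(cls).parameters
        filtered = {k: v for k, v in params.items() if k in accepted}
        return cls(**filtered)


class ToyModel:
    """Stand-in estimator with two constructor parameters."""

    def __init__(self, alpha: float = 1.0, fit_intercept: bool = True):
        self.alpha = alpha
        self.fit_intercept = fit_intercept


# "method" and "legacy_flag" are not constructor arguments and get dropped.
model = instantiate(ToyModel, {"alpha": 0.5, "method": "ToyModel", "legacy_flag": True})
```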

_create_fs_step(estimator: sklearn.base.BaseEstimator) → tuple | None[source]

Create a Feature Selection step for the pipeline.

Logic

  • Filter (k_best): Fast. Features are selected before the classifier is trained, based on a statistical test. No inner CV loop required.

  • Wrapper (sfs): Slow but accurate. Wraps the estimator in a SequentialFeatureSelector. This runs an Inner CV Loop (size = config.feature_selection.cv) to validate feature subsets.

If used inside Hyperparameter Tuning, this step is part of the Pipeline, ensuring features are re-selected for every fold and every parameter combination (Nested Simplification).

Returns:

(“fs”, Transformer) step for sklearn Pipeline.

Return type:

tuple or None
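To make the filter/wrapper distinction concrete, here is a sketch that uses scikit-learn's SelectKBest and SequentialFeatureSelector directly. The number of selected features, the f_classif score function, and the LogisticRegression estimator are illustrative assumptions, not the library's defaults.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_classif
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=8, n_informative=3, random_state=0)

# Filter: ranks features with a univariate test; the classifier is never fitted.
filter_step = ("fs", SelectKBest(score_func=f_classif, k=3))

# Wrapper: refits the estimator on candidate subsets, using an inner 3-fold CV.
wrapper_step = (
    "fs",
    SequentialFeatureSelector(
        LogisticRegression(max_iter=1000), n_features_to_select=3, cv=3
    ),
)

X_filter = filter_step[1].fit_transform(X, y)
X_wrapper = wrapper_step[1].fit_transform(X, y)
```

Both steps reduce X to three columns, but the wrapper fits the estimator many times to do so, which is where the extra cost comes from.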

_wrap_with_tuning(estimator: sklearn.base.BaseEstimator, name: str) → sklearn.base.BaseEstimator[source]

Wrap the estimator (or pipeline) in a Hyperparameter Search object.

This implements Nested Cross-Validation (Middle Loop):

  1. Input: A Pipeline (Scaler + FS + Classifier).

  2. Search: Creates a GridSearchCV / RandomizedSearchCV.

  3. Process:

    • For each fold of the tuning CV (defined by config.cv), trains the Pipeline (including FS) on the tuning train set and evaluates it on the tuning validation set.

    • Finds the best (Hyperparameters + Features) combination.

    • Refits on the entire training set provided by the Outer Loop.

This ensures simultaneous optimization of Preprocessing (FS) and Modeling parameters.
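A sketch of the search wrapper using scikit-learn's GridSearchCV. The grid values, step names, and 3-fold tuning CV are assumptions chosen to keep the example small. Because the selector sits inside the Pipeline, its `k` is tuned together with the classifier's `C`.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("fs", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Step-prefixed names let one grid cover preprocessing and model parameters.
param_grid = {"fs__k": [3, 6], "clf__C": [0.1, 1.0]}
tuning_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=tuning_cv,
                      scoring="balanced_accuracy", refit=True)

# Stand-in for the training portion handed over by the outer loop.
X_train, y_train = make_classification(n_samples=150, n_features=12,
                                       n_informative=4, random_state=0)
search.fit(X_train, y_train)

best = search.best_params_
n_selected = int(search.best_estimator_.named_steps["fs"].get_support().sum())
```

With `refit=True`, `search.best_estimator_` is the winning Pipeline refitted on all of `X_train`, which mirrors the last bullet above.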

run(X: pandas.DataFrame | numpy.ndarray, y: pandas.Series | numpy.ndarray, groups: pandas.Series | numpy.ndarray | None = None) → ExperimentResult[source]

Execute the full experiment pipeline.

This is the main entry point. It orchestrates:

  1. Data Validation: Checks input shapes and types.

  2. Model Loop: Iterates through all configured models.

  3. Preparation: Instantiates models -> Builds Pipelines (Scaler/FS) -> Wraps in Tuning.

  4. Validation: Runs the Outer Cross-Validation loop (optionally parallelized).

  5. Aggregation: Collects scores, predictions, and importances.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data (2D) or Time-Series data (3D).

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target labels or values.

  • groups (array-like of shape (n_samples,), optional) – Group labels for splitting (e.g., subject-specific splits).

Returns:

Object containing results with methods to export to Tidy DataFrames.

Return type:

ExperimentResult

save_results(path: str | pathlib.Path | None = None)[source]

Serialize results, configuration, and metadata to disk.

Parameters:

path (str or Path, optional) – Path to save the results. If None, uses config.output_dir. If both are None, raises ValueError.

static load_results(path: str | pathlib.Path) → ExperimentResult[source]

Load a saved experiment payload and wrap it in ExperimentResult.

Returns:

The loaded results wrapper.

Return type:

ExperimentResult

_cross_validate(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, groups: numpy.ndarray | None) → Dict[str, Any][source]

Execute the Outer Cross-Validation Loop (Evaluation).

This is the Level 1 (top-level) split. The entire dataset is split into K folds (defined by config.cv), and for each fold:

  1. Training Data: 80% (if 5-fold). Passed to estimator.fit(). If the estimator is a GridSearch (tuning enabled), it internally splits this 80% again for validation (Level 2 split).

  2. Test Data: 20%. Used strictly for final estimator.predict() evaluation.

Parallelization

If config.n_jobs > 1, these folds run in parallel processes to speed up execution.
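The two split levels can be checked with a few lines of index arithmetic. This is a simplified sketch: `kfold_indices` is a hypothetical helper that cuts contiguous folds without shuffling, stratification, or groups, and the inner split is assumed to be 5-fold as well.

```python
from typing import Iterator, List, Tuple


def kfold_indices(indices: List[int], k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    n = len(indices)
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


samples = list(range(100))
sizes = []
for outer_train, outer_test in kfold_indices(samples, k=5):
    # Level 2: a tuning search would split only the outer training part again.
    inner_train, inner_val = next(kfold_indices(outer_train, k=5))
    sizes.append((len(outer_train), len(outer_test), len(inner_train), len(inner_val)))

print(sizes[0])  # (80, 20, 64, 16)
```

With 100 samples, each outer fold trains on 80 and tests on 20; the inner split then trains on 64 and validates on 16, and the 20 outer test samples never enter the inner loop.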

_fit_and_score_fold(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, train_idx: numpy.ndarray, test_idx: numpy.ndarray) → Dict[str, Any][source]

Execute a single Cross-Validation fold: Fit, Predict, and Score.

Optimized for:

  • Standard Estimators: (N, F) input -> (N,) output.

  • Sliding Estimators: (N, F, T) input -> (N, T) output (Diagonal Decoding).

Returns:

Contains ‘test_idx’, ‘preds’ (y_pred, y_true, y_proba), ‘scores’ (dict of metric values), and ‘importance’.

Return type:

dict

static _extract_metadata(estimator: sklearn.base.BaseEstimator) → Dict[str, Any][source]

Extract training metadata like best Hyperparameters and Selected Features.

static _compute_metric_safe(scorer, y_true, y_est, is_multiclass, is_proba=False)[source]

Compute metric handling standard and temporal (diagonal) shapes.

Shapes Handled

  • Standard: y_est is (N,) or (N, C)

  • Generalizing (Matrix):

    • y_pred: (N, T_train, T_test) -> Score each (T_train, T_test) pair.

    • y_proba: (N, C, T_train, T_test) -> Score each (T_train, T_test) pair.
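Scoring each (T_train, T_test) pair amounts to slicing the prediction array per pair and applying the metric to that slice. The sketch below uses nested lists and a hand-written accuracy in place of a real scorer; `score_matrix` and the toy data are hypothetical.

```python
from typing import Callable, List, Sequence

Metric = Callable[[Sequence[int], Sequence[int]], float]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of matching labels (stand-in for a real scorer)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def score_matrix(metric: Metric, y_true: Sequence[int],
                 y_pred: List[List[List[int]]]) -> List[List[float]]:
    """Score every (train time, test time) pair.

    y_pred[n][i][j] is the label predicted for sample n by the model trained
    at time i and evaluated at time j, i.e. shape (N, T_train, T_test).
    """
    n_train, n_test = len(y_pred[0]), len(y_pred[0][0])
    return [
        [metric(y_true, [sample[i][j] for sample in y_pred]) for j in range(n_test)]
        for i in range(n_train)
    ]


y_true = [0, 1, 0, 1]
y_pred = [
    [[0, 0], [1, 0]],  # sample 0
    [[1, 0], [1, 1]],  # sample 1
    [[0, 1], [0, 0]],  # sample 2
    [[1, 1], [0, 1]],  # sample 3
]
scores = score_matrix(accuracy, y_true, y_pred)
diagonal = [scores[t][t] for t in range(len(scores))]
print(scores)  # [[1.0, 0.5], [0.5, 1.0]]
```

The diagonal of the result holds the scores where training and testing time coincide.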

_force_serial_execution(estimator: sklearn.base.BaseEstimator) → sklearn.base.BaseEstimator[source]

Recursively set n_jobs=1 for the estimator and its sub-components. Used when the outer loop is already parallelized to avoid oversubscription.
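The recursion can be illustrated on toy objects. Everything below is hypothetical: the `Toy*` classes only mimic the nesting of a search object around a pipeline, and the attribute walk is one possible approach. For real scikit-learn estimators, get_params(deep=True) and set_params would be the more natural tools.

```python
from typing import Any, Optional, Set


def force_serial(obj: Any, _seen: Optional[Set[int]] = None) -> Any:
    """Set n_jobs=1 on obj and on every nested object reachable from it."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:  # guard against reference cycles
        return obj
    seen.add(id(obj))
    if isinstance(obj, (list, tuple)):
        for item in obj:
            force_serial(item, seen)
    elif hasattr(obj, "__dict__") and not isinstance(obj, type):
        if "n_jobs" in vars(obj):
            obj.n_jobs = 1
        for value in list(vars(obj).values()):
            force_serial(value, seen)
    return obj


class ToyModel:
    def __init__(self, n_jobs=None):
        self.n_jobs = n_jobs


class ToyPipeline:
    def __init__(self, steps):
        self.steps = steps


class ToySearch:
    def __init__(self, estimator, n_jobs=None):
        self.estimator = estimator
        self.n_jobs = n_jobs


search = ToySearch(ToyPipeline([("clf", ToyModel(n_jobs=-1))]), n_jobs=4)
force_serial(search)
```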

static _extract_feature_importances(estimator: sklearn.base.BaseEstimator) → numpy.ndarray | None[source]

Extract feature importances or coefficients from a fitted estimator. Handles Pipelines and Feature Selection.

coco_pipe.decoding.get_estimator_cls(name: str) → Type[source]

Retrieve an estimator class by name.

Parameters:

name (str) – Name of the estimator.

Returns:

The class object.

Return type:

Type

Raises:

ValueError – If name is not found.

coco_pipe.decoding.register_estimator(name: str) → Callable[[Type], Type][source]

Decorator to register an estimator class under a specific name.

Parameters:

name (str) – The unique alias for the estimator (e.g., “RandomForestClassifier”).

coco_pipe.decoding.cross_validate_score(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: Sequence, *, groups: Sequence | None = None, cv_config: coco_pipe.decoding.configs.CVConfig | None = None, metric: str = 'balanced_accuracy', use_scaler: bool = False) → float[source]

Compute one mean cross-validated score for an estimator.

Parameters:
  • estimator (BaseEstimator) – Estimator to fit inside each fold.

  • X (np.ndarray) – Input features with shape (n_samples, n_features).

  • y (sequence) – Target labels aligned with X.

  • groups (sequence, optional) – Group labels aligned with X.

  • cv_config (CVConfig, optional) – Cross-validation configuration. Defaults to a 5-fold stratified strategy, or 5-fold stratified-group strategy when groups are provided.

  • metric (str, default="balanced_accuracy") – Metric name resolved through get_scorer().

  • use_scaler (bool, default=False) – When True, wraps the estimator in a StandardScaler pipeline.

Returns:

Mean cross-validated score.

Return type:

float