coco_pipe.decoding¶

Classes¶

`ExperimentConfig`	Master configuration for a Decoding Experiment.
`Experiment`	Main executor for decoding experiments.

Functions¶

`get_estimator_cls`(→ Type)	Retrieve an estimator class by name.
`register_estimator`(→ Callable[[Type], Type])	Decorator to register an estimator class under a specific name.
`cross_validate_score`(→ float)	Compute one mean cross-validated score for an estimator.

Package Contents¶

class coco_pipe.decoding.ExperimentConfig(/, **data: Any)[source]¶

Bases: pydantic.BaseModel

Master configuration for a Decoding Experiment.

task: Literal['classification', 'regression'] = 'classification'¶

output_dir: pathlib.Path | None = None¶

tag: str = 'experiment'¶

models: Dict[str, EstimatorConfigType]¶

grids: Dict[str, Dict[str, List[Any]]] | None = None¶

cv: CVConfig = None¶

tuning: TuningConfig = None¶

feature_selection: FeatureSelectionConfig = None¶

metrics: List[str] = None¶

temporal: TemporalConfig = None¶

use_scaler: bool = None¶

n_jobs: int = -1¶

verbose: bool = True¶

class coco_pipe.decoding.Experiment(config: coco_pipe.decoding.configs.ExperimentConfig)[source]¶

Main executor for decoding experiments.

Parameters: config (ExperimentConfig) – The complete configuration for the experiment.

config¶

results: Dict[str, Any]¶

_validate_config()[source]¶

Perform comprehensive runtime validation of the configuration.

Logic¶

Tuning Consistency: Warns if tuning is enabled (tuning.enabled) but no grids are provided.
Task vs Metrics: Checks if metrics match the task (e.g. no ‘accuracy’ for regression). Raises ValueError if incompatible.
Task vs CV: Checks if CV strategy matches task (e.g. no ‘stratified’ for regression). Raises ValueError if incompatible.
Task vs Model: Heuristic check for model type (e.g. no Regressor for Classification). Raises ValueError if incompatible.

Raises: ValueError – If the configuration contains incompatible settings.
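The task checks listed above can be pictured with a minimal sketch. The helper name `validate_task` and the contents of the two lookup sets are hypothetical; the actual compatibility rules are defined inside coco_pipe and may differ.

```python
# Hypothetical lookup tables; the real compatibility rules live inside coco_pipe.
CLASSIFICATION_ONLY_METRICS = {"accuracy", "balanced_accuracy", "roc_auc", "f1"}
CLASSIFICATION_ONLY_CV = {"stratified", "stratified_group"}


def validate_task(task, metrics, cv_strategy):
    """Raise ValueError when the metrics or the CV strategy do not fit the task."""
    if task == "regression":
        bad_metrics = sorted(set(metrics) & CLASSIFICATION_ONLY_METRICS)
        if bad_metrics:
            raise ValueError(f"Metrics {bad_metrics} are not valid for regression.")
        if cv_strategy in CLASSIFICATION_ONLY_CV:
            raise ValueError(f"CV strategy '{cv_strategy}' is not valid for regression.")


validate_task("classification", ["accuracy"], "stratified")  # passes silently
```

Failing early like this keeps an incompatible metric from surfacing only after the first model has been fitted.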

_prepare_estimator(model_name: str, model_config: Any) → sklearn.base.BaseEstimator[source]¶

Orchestrate the creation of the full Estimator Pipeline.

Steps¶

Instantiation: Calls _instantiate_model to get the base estimator (handling recursion).
Scaling: If use_scaler=True, prepends a StandardScaler.
Feature Selection: If enabled, prepends the FS step (Filter or Wrapper).
Pipeline Construction: Wraps the steps in sklearn.pipeline.Pipeline. Enables caching if FS and Tuning are both active.
Tuning Wrapper: If tuning is enabled for this model, wraps the Pipeline in GridSearchCV/RandomizedSearchCV via _wrap_with_tuning.

Parameters:

model_name (str) – Friendly name from config (used for grid lookup).
model_config (Any) – Pydantic configuration object for the model.

Returns:

Final ready-to-run estimator (Pipeline or SearchCV).

Return type:

BaseEstimator
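A minimal sketch of the assembly steps, written directly against scikit-learn. The helper `build_pipeline`, the step names "scaler" and "clf", and the scaler-then-FS ordering are assumptions made for illustration; only the "fs" step name appears in this reference.

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(model, use_scaler=True, k_best=None):
    """Assemble [scaler] -> [fs] -> clf (hypothetical helper, step names assumed)."""
    steps = []
    if use_scaler:
        steps.append(("scaler", StandardScaler()))
    if k_best is not None:
        # Filter-style selection; f_classif assumes a classification task.
        steps.append(("fs", SelectKBest(f_classif, k=k_best)))
    steps.append(("clf", model))
    return Pipeline(steps)


pipe = build_pipeline(LogisticRegression(max_iter=1000), use_scaler=True, k_best=5)
print([name for name, _ in pipe.steps])  # ['scaler', 'fs', 'clf']
```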

_instantiate_model(name: str, config: Any) → sklearn.base.BaseEstimator[source]¶

Instantiate a raw estimator from its configuration object.

Logic¶

Registry Lookup: Resolves class from config.method.
Recursion: If config implies a meta-estimator (has base_estimator), recursively calls _prepare_estimator for the child.
Parameter Injection: Passes the config fields as kwargs to __init__. If a TypeError occurs, invalid parameters are filtered out automatically (robustness against mismatched config/class versions).

Returns: The instantiated model (e.g., LogisticRegression instance) without pipeline wrappers.
Return type: BaseEstimator
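The parameter-filtering fallback might look roughly like the sketch below. `ToyEstimator` and `instantiate_with_fallback` are hypothetical names, and filtering by constructor signature is one plausible way to decide which parameters are invalid; the library may use a different rule.

```python
import inspect


class ToyEstimator:
    """Stand-in for a registered estimator class (hypothetical)."""

    def __init__(self, C=1.0, max_iter=100):
        self.C = C
        self.max_iter = max_iter


def instantiate_with_fallback(cls, params):
    """Try cls(**params); on TypeError, retry with only the accepted keywords."""
    try:
        return cls(**params)
    except TypeError:
        accepted = inspect.signature(cls.__init__).parameters
        filtered = {k: v for k, v in params.items() if k in accepted}
        return cls(**filtered)


# "legacy_option" is not a constructor argument, so it is dropped on retry.
model = instantiate_with_fallback(ToyEstimator, {"C": 0.5, "legacy_option": True})
print(model.C, model.max_iter)  # 0.5 100
```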

_create_fs_step(estimator: sklearn.base.BaseEstimator) → tuple | None[source]¶

Create a Feature Selection step for the pipeline.

Logic¶

Filter (k_best): Fast. Features are selected before training the classifier, based on a statistical test. No inner CV loop required.
Wrapper (sfs): Slow but accurate. Wraps the estimator in a SequentialFeatureSelector. This runs an Inner CV Loop (size = config.feature_selection.cv) to validate feature subsets.

If used inside Hyperparameter Tuning, this step is part of the Pipeline, ensuring features are re-selected for every fold and every parameter combination (Nested Simplification).

Returns: (“fs”, Transformer) step for sklearn Pipeline.
Return type: tuple or None
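The two selection modes correspond to two scikit-learn transformers. In this sketch the helper `create_fs_step`, its arguments, and the choice of `f_classif` as the statistical test are illustrative assumptions, not the library's actual signature.

```python
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_classif
from sklearn.linear_model import LogisticRegression


def create_fs_step(method, estimator, n_features, inner_cv=3):
    """Return a ("fs", transformer) tuple, or None when selection is disabled."""
    if method == "k_best":
        # Filter: univariate statistical test, no inner CV loop.
        return ("fs", SelectKBest(score_func=f_classif, k=n_features))
    if method == "sfs":
        # Wrapper: scores candidate feature subsets with an inner CV of `inner_cv` folds.
        selector = SequentialFeatureSelector(
            estimator, n_features_to_select=n_features, cv=inner_cv
        )
        return ("fs", selector)
    return None


name, selector = create_fs_step("k_best", LogisticRegression(), n_features=5)
print(name, type(selector).__name__)  # fs SelectKBest
```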

_wrap_with_tuning(estimator: sklearn.base.BaseEstimator, name: str) → sklearn.base.BaseEstimator[source]¶

Wrap the estimator (or pipeline) in a Hyperparameter Search object.

This implements Nested Cross-Validation (Middle Loop):

1. Input: A Pipeline (Scaler + FS + Classifier).
2. Search: Creates a GridSearchCV / RandomizedSearchCV.
3. Process:
- For each fold of the tuning CV (defined by config.cv): trains the Pipeline (including FS!) on the tuning train set and evaluates it on the tuning validation set.
- Finds the best (Hyperparameters + Features) combination.
- Refits on the entire training set provided by the Outer Loop.

This ensures simultaneous optimization of Preprocessing (FS) and Modeling parameters.
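As a rough illustration of joint tuning, the sketch below wraps a scikit-learn Pipeline in GridSearchCV. The step names, the grid values, the 3-fold tuning CV, and the synthetic data are all assumptions; in the library the grid would come from the configured grids for the model.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the training split handed over by the outer loop.
X, y = make_classification(n_samples=120, n_features=10, n_informative=4, random_state=0)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("fs", SelectKBest(f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid keys follow sklearn's "<step>__<param>" convention, so the number of
# selected features and the regularization strength are searched together.
param_grid = {"fs__k": [3, 5], "clf__C": [0.1, 1.0]}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="balanced_accuracy", refit=True)
search.fit(X, y)  # refit=True retrains the best pipeline on all of X, y

print(search.best_params_)
```

Because the selector sits inside the Pipeline, it is refitted on each tuning train split, so the validation split never influences which features are kept.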

run(X: pandas.DataFrame | numpy.ndarray, y: pandas.Series | numpy.ndarray, groups: pandas.Series | numpy.ndarray | None = None) → ExperimentResult[source]¶

Execute the full experiment pipeline.

This is the main entry point. It orchestrates:

1. Data Validation: Checks input shapes and types.
2. Model Loop: Iterates through all configured models.
3. Preparation: Instantiates models -> Builds Pipelines (Scaler/FS) -> Wraps in Tuning.
4. Validation: Runs the Outer Cross-Validation loop (optionally parallelized).
5. Aggregation: Collects scores, predictions, and importances.

Parameters:

X (array-like of shape (n_samples, n_features) or (n_samples, n_features, n_times)) – Training data (2D) or Time-Series data (3D).
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target labels or values.
groups (array-like of shape (n_samples,), optional) – Group labels for splitting (e.g., subject-specific splits).

Returns:

Object containing results with methods to export to Tidy DataFrames.

Return type:

ExperimentResult

save_results(path: str | pathlib.Path | None = None)[source]¶

Serialize results, configuration, and metadata to disk.

Parameters: path (str or Path, optional) – Path to save the results. If None, uses config.output_dir. If both are None, raises ValueError.

static load_results(path: str | pathlib.Path) → ExperimentResult[source]¶

Load a saved experiment payload and wrap it in ExperimentResult.

Returns: The loaded results wrapper.
Return type: ExperimentResult

_cross_validate(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, groups: numpy.ndarray | None) → Dict[str, Any][source]¶

Execute the Outer Cross-Validation Loop (Evaluation).

This is the Level 1 (top-level) split:

- Splits the entire dataset into K folds (defined by config.cv).
- For each fold:
  - Training Data: 80% (if 5-fold). Passed to estimator.fit(). If the estimator is a GridSearch (Tuning Enabled), it internally splits this 80% again for validation (Level 2 Split).
  - Test Data: 20%. Used strictly for final estimator.predict() evaluation.

Parallelization¶

If config.n_jobs > 1, these folds run in parallel processes to speed up execution.
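The outer loop described above can be sketched with scikit-learn primitives. The plain `KFold` splitter, the metric, and the synthetic data are assumptions for illustration, and the folds run serially here; the library resolves the splitter from config.cv.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=100, n_features=8, random_state=0)
estimator = LogisticRegression(max_iter=1000)  # could equally be a GridSearchCV

outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes, scores = [], []
for train_idx, test_idx in outer_cv.split(X, y):
    model = clone(estimator)               # fresh, unfitted copy per fold
    model.fit(X[train_idx], y[train_idx])  # any inner tuning split happens in here
    y_pred = model.predict(X[test_idx])    # held-out data is used only for evaluation
    fold_sizes.append((len(train_idx), len(test_idx)))
    scores.append(balanced_accuracy_score(y[test_idx], y_pred))

print(fold_sizes[0])  # (80, 20)
print(float(np.mean(scores)))
```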

_fit_and_score_fold(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: numpy.ndarray, train_idx: numpy.ndarray, test_idx: numpy.ndarray) → Dict[str, Any][source]¶

Execute a single Cross-Validation fold: Fit, Predict, and Score.

Optimized for:

- Standard Estimators: (N, F) input -> (N,) output.
- Sliding Estimators: (N, F, T) input -> (N, T) output (Diagonal Decoding).

Returns: A dict containing ‘test_idx’, ‘preds’ (y_pred, y_true, y_proba), ‘scores’ (dict of metric values), and ‘importance’.
Return type: dict
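To make the sliding (diagonal) case concrete, here is a simplified sketch that fits one clone per time point. The helper `fit_and_score_sliding` is hypothetical and returns only a subset of the documented keys, with accuracy hard-coded as the score; the random data exists only to show shapes.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6, 5))   # (N, F, T): samples, features, time points
y = np.tile([0, 1], 20)           # (N,)
train_idx, test_idx = np.arange(30), np.arange(30, 40)


def fit_and_score_sliding(estimator, X, y, train_idx, test_idx):
    """Fit one clone per time point; predictions have shape (N_test, T)."""
    n_times = X.shape[2]
    preds = np.empty((len(test_idx), n_times), dtype=y.dtype)
    for t in range(n_times):
        model = clone(estimator).fit(X[train_idx, :, t], y[train_idx])
        preds[:, t] = model.predict(X[test_idx, :, t])
    # Accuracy per time point: train and test on the same time index (the diagonal).
    scores = (preds == y[test_idx][:, None]).mean(axis=0)
    return {"test_idx": test_idx, "preds": preds, "scores": scores}


fold = fit_and_score_sliding(LogisticRegression(), X, y, train_idx, test_idx)
print(fold["preds"].shape, fold["scores"].shape)  # (10, 5) (5,)
```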

static _extract_metadata(estimator: sklearn.base.BaseEstimator) → Dict[str, Any][source]¶

Extract training metadata such as the best hyperparameters and the selected features.

static _compute_metric_safe(scorer, y_true, y_est, is_multiclass, is_proba=False)[source]¶

Compute metric handling standard and temporal (diagonal) shapes.

Shapes Handled¶

Standard: y_est is (N,) or (N, C)
Generalizing (Matrix):
- y_pred: (N, T_train, T_test) -> Score each (T_train, T_test) pair.
- y_proba: (N, C, T_train, T_test) -> Score each (T_train, T_test) pair.
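For the generalizing case, scoring each (T_train, T_test) pair amounts to a double loop over the last two axes. The sketch below covers only the y_pred shape; `accuracy` and `score_generalization` are hypothetical stand-ins for the configured scorer and the internal logic.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Plain accuracy, used as a stand-in for a configured scorer."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def score_generalization(metric, y_true, y_pred):
    """Score y_pred of shape (N, T_train, T_test) into a (T_train, T_test) matrix."""
    _, n_train, n_test = y_pred.shape
    out = np.empty((n_train, n_test))
    for i in range(n_train):
        for j in range(n_test):
            out[i, j] = metric(y_true, y_pred[:, i, j])
    return out


y_true = np.array([0, 1, 0, 1])
y_pred = np.zeros((4, 2, 3), dtype=int)  # every prediction is class 0
y_pred[:, 1, 2] = y_true                 # perfect predictions for one (train, test) pair
scores = score_generalization(accuracy, y_true, y_pred)
print(scores.shape)  # (2, 3)
print(scores[1, 2])  # 1.0
print(scores[0, 0])  # 0.5
```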

_force_serial_execution(estimator: sklearn.base.BaseEstimator) → sklearn.base.BaseEstimator[source]¶

Recursively set n_jobs=1 for the estimator and its sub-components. Used when the outer loop is already parallelized to avoid oversubscription.
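One way to reach every nested n_jobs is through scikit-learn's `get_params(deep=True)` and `set_params`. The helper `force_serial` below is a hypothetical sketch of that idea and works on a clone; whether the library mutates the estimator in place is not stated here.

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def force_serial(estimator):
    """Return a clone with every (nested) n_jobs parameter set to 1."""
    serial = clone(estimator)
    overrides = {
        key: 1
        for key in serial.get_params(deep=True)
        if key == "n_jobs" or key.endswith("__n_jobs")
    }
    return serial.set_params(**overrides)


pipe = Pipeline([("scaler", StandardScaler()), ("clf", RandomForestClassifier(n_jobs=-1))])
search = GridSearchCV(pipe, {"clf__n_estimators": [10, 20]}, n_jobs=-1)

serial = force_serial(search)
print(serial.n_jobs, serial.estimator.named_steps["clf"].n_jobs)  # 1 1
```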

static _extract_feature_importances(estimator: sklearn.base.BaseEstimator) → numpy.ndarray | None[source]¶

Extract feature importances or coefficients from a fitted estimator. Handles Pipelines and Feature Selection.
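A sketch of how importances might be mapped back to the original feature space when a selection step is present. The helper `extract_importances`, the averaging of absolute coefficients across classes, and the zero-fill for dropped features are assumptions, not the library's documented behavior.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def extract_importances(fitted):
    """Return importances in the original feature space, or None if unavailable."""
    selector, model = None, fitted
    if isinstance(fitted, Pipeline):
        selector = fitted.named_steps.get("fs")  # "fs" is the step name used in this reference
        model = fitted.steps[-1][1]
    if hasattr(model, "feature_importances_"):
        values = np.asarray(model.feature_importances_, dtype=float)
    elif hasattr(model, "coef_"):
        # Assumption: average absolute coefficients over classes.
        values = np.abs(np.atleast_2d(model.coef_)).mean(axis=0)
    else:
        return None
    if selector is None:
        return values
    support = selector.get_support()
    full = np.zeros(support.shape[0])
    full[support] = values  # features dropped by the selector keep importance 0
    return full


X, y = make_classification(n_samples=100, n_features=8, n_informative=3, random_state=0)
pipe = Pipeline([
    ("fs", SelectKBest(f_classif, k=3)),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)
importances = extract_importances(pipe)
print(importances.shape)  # (8,)
```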

coco_pipe.decoding.get_estimator_cls(name: str) → Type[source]¶

Retrieve an estimator class by name.

Parameters: name (str) – Name of the estimator.
Returns: The class object.
Return type: Type
Raises: ValueError – If name is not found.

coco_pipe.decoding.register_estimator(name: str) → Callable[[Type], Type][source]¶

Decorator to register an estimator class under a specific name.

Parameters: name (str) – The unique alias for the estimator (e.g., “RandomForestClassifier”).
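The registry pattern behind these two functions can be sketched in a few lines. This is a standalone illustration that reuses the documented names and signatures; the module-level `_REGISTRY` dict and the error message are hypothetical, and the library's internals may differ.

```python
from typing import Callable, Dict, Type

_REGISTRY: Dict[str, Type] = {}  # hypothetical module-level store


def register_estimator(name: str) -> Callable[[Type], Type]:
    """Return a decorator that stores the class under `name`."""
    def decorator(cls: Type) -> Type:
        _REGISTRY[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


def get_estimator_cls(name: str) -> Type:
    """Look up a registered class, raising ValueError for unknown names."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unknown estimator: {name!r}") from None


@register_estimator("MyClassifier")
class MyClassifier:
    pass


print(get_estimator_cls("MyClassifier") is MyClassifier)  # True
```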

coco_pipe.decoding.cross_validate_score(estimator: sklearn.base.BaseEstimator, X: numpy.ndarray, y: Sequence, *, groups: Sequence | None = None, cv_config: coco_pipe.decoding.configs.CVConfig | None = None, metric: str = 'balanced_accuracy', use_scaler: bool = False) → float[source]¶

Compute one mean cross-validated score for an estimator.

Parameters:

estimator (BaseEstimator) – Estimator to fit inside each fold.
X (np.ndarray) – Input features with shape (n_samples, n_features).
y (sequence) – Target labels aligned with X.
groups (sequence, optional) – Group labels aligned with X.
cv_config (CVConfig, optional) – Cross-validation configuration. Defaults to a 5-fold stratified strategy, or a 5-fold stratified-group strategy when groups are provided.
metric (str, default="balanced_accuracy") – Metric name resolved through get_scorer().
use_scaler (bool, default=False) – When True, wraps the estimator in a StandardScaler pipeline.

Returns:

Mean cross-validated score.

Return type:

float
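For orientation, the documented defaults resemble the following plain scikit-learn computation. This is an approximation under stated assumptions (no groups, use_scaler=True, synthetic data) and does not call coco_pipe; the library may differ in shuffling, seeding, and scorer resolution.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=100, n_features=8, random_state=0)

# Mirrors use_scaler=True: the scaler is fitted inside each training fold only.
estimator = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
cv = StratifiedKFold(n_splits=5)  # documented default when no groups are given
fold_scores = cross_val_score(estimator, X, y, cv=cv, scoring="balanced_accuracy")

mean_score = float(fold_scores.mean())  # one mean score, as the function returns
print(mean_score)
```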