coco_pipe.descriptors.core

Descriptor extraction planner and execution pipeline.

This module owns the config-bound runtime orchestration for descriptor extraction. It does not implement family-specific descriptor math; instead it:

  • validates the explicit runtime inputs accepted by the module

  • instantiates enabled descriptor families from typed config

  • plans shared PSD computation for compatible PSD consumers

  • executes one observation batch at a time with controlled parallelism

  • merges aligned family outputs into one flat descriptor matrix
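The shared-PSD planning step above can be pictured with a minimal, standalone sketch. It does not use coco_pipe; `PSDSpec`, `plan_psd_groups`, and the family names are hypothetical, and "compatible" is assumed here to mean "identical PSD settings", which may be stricter or looser than the module's actual rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PSDSpec:
    """Hypothetical PSD settings requested by one family (illustrative only)."""

    method: str
    fmin: float
    fmax: float


def plan_psd_groups(requests):
    """Group family names that request identical PSD settings.

    Groups keep first-seen order, so the plan is deterministic.
    """
    groups = defaultdict(list)
    for family, spec in requests:
        groups[spec].append(family)
    return [(spec, families) for spec, families in groups.items()]


# Hypothetical families: two share one PSD, the third needs its own.
requests = [
    ("bands", PSDSpec("welch", 1.0, 45.0)),
    ("parametric", PSDSpec("welch", 1.0, 45.0)),
    ("spectral_shape", PSDSpec("multitaper", 1.0, 45.0)),
]
plan = plan_psd_groups(requests)
for spec, families in plan:
    print(spec.method, families)
```

With this plan, one PSD would be computed per group rather than one per family.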

Author: Hamza Abdelhedi (hamza.abdelhedi@umontreal.ca)

Classes

DescriptorPipeline

Run config-driven descriptor extraction on explicit arrays.

Module Contents

class coco_pipe.descriptors.core.DescriptorPipeline(config: coco_pipe.descriptors.configs.DescriptorConfig | collections.abc.Mapping[str, Any])

Run config-driven descriptor extraction on explicit arrays.

Parameters:

config (DescriptorConfig or Mapping[str, Any]) – Typed descriptors configuration or a mapping accepted by DescriptorConfig.

config

Parsed descriptors configuration.

Type:

DescriptorConfig

extractors

Enabled family extractors in deterministic family order.

Type:

list of BaseDescriptorExtractor

signal_extractors

Enabled non-PSD extractors that consume raw signal batches directly.

Type:

list of BaseDescriptorExtractor

psd_groups

Planned PSD reuse groups derived once from the enabled extractors.

Type:

list of _PSDGroup

family_order

Deterministic family order used when merging batch-local outputs.

Type:

list of str

Notes

The pipeline is config-bound but runtime-stateless. Construction performs config parsing, corrected-band compatibility checks, and planner setup once. Each call to extract() then validates the explicit runtime inputs, executes the planned families, and returns one flat descriptor matrix plus any collected failures.
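As a rough illustration of how aligned family outputs might be merged into one flat matrix in a fixed family order, consider the standalone sketch below. It is not the module's implementation: `merge_family_outputs` is a hypothetical helper, the family and descriptor names are invented, and nested lists stand in for NumPy arrays.

```python
def merge_family_outputs(family_order, outputs):
    """Concatenate per-family columns row by row, in family_order.

    outputs maps a family name to (rows, names), where rows has one
    list of values per observation and names labels the columns.
    """
    merged_rows = None
    merged_names = []
    for family in family_order:
        if family not in outputs:
            continue  # family disabled or produced no output
        rows, names = outputs[family]
        if merged_rows is None:
            merged_rows = [[] for _ in rows]
        if len(rows) != len(merged_rows):
            raise ValueError(f"{family!r} is not aligned with the other families")
        for merged, row in zip(merged_rows, rows):
            if len(row) != len(names):
                raise ValueError(f"{family!r} has mismatched names and columns")
            merged.extend(row)
        merged_names.extend(names)
    return {"X": merged_rows or [], "descriptor_names": merged_names}


# The dict order of outputs does not matter; family_order decides the layout.
outputs = {
    "spectral": ([[0.5], [0.7]], ["spectral_entropy"]),
    "time": ([[1.0, 2.0], [3.0, 4.0]], ["mean", "var"]),
}
merged = merge_family_outputs(["time", "spectral"], outputs)
print(merged["descriptor_names"])
print(merged["X"])
```

Because the column layout depends only on the family order and not on execution order, families can run in parallel without changing the result.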

extract(X: numpy.ndarray, ids: collections.abc.Sequence[Any] | numpy.ndarray | None = None, sfreq: float | None = None, channel_names: collections.abc.Sequence[str] | numpy.ndarray | None = None) → dict[str, Any]

Extract descriptors from explicit NumPy inputs.

Parameters:
  • X (np.ndarray) – Signal array with shape (n_obs, n_channels, n_times).

  • ids (sequence or np.ndarray, optional) – Observation identifiers aligned with X.

  • sfreq (float, optional) – Sampling frequency in Hertz. Required when enabled families depend on spectral estimates or spectral entropy.

  • channel_names (sequence of str or np.ndarray, optional) – Channel labels. Required for channel-resolved outputs.

Returns:

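Dictionary with keys X (the flat descriptor matrix, one row per input observation), descriptor_names (the column names of X), and failures (any failures collected during extraction).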

Return type:

dict[str, Any]

Raises:
  • ValueError – If the explicit input contract is not satisfied.

  • ImportError – If an optional backend required by the enabled families is missing.

Notes

When runtime.on_error="warn", extraction still completes and stores failures in result["failures"] before emitting one aggregate warning at the pipeline level.

The returned row order always matches the input observation order.
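The "warn" behaviour described in these notes can be sketched as follows. This is a standalone illustration under assumptions, not the pipeline's code: `run_families`, the failure record layout, and the warning class are all hypothetical.

```python
import warnings


def run_families(families, batch, on_error="raise"):
    """Run each family; in "warn" mode collect failures and warn once."""
    outputs, failures = {}, []
    for name, func in families:
        try:
            outputs[name] = func(batch)
        except Exception as exc:
            if on_error != "warn":
                raise
            # Assumed record layout; the real failure entries may differ.
            failures.append({"family": name, "error": repr(exc)})
    if failures:
        # One aggregate warning instead of one warning per failure.
        warnings.warn(
            f"{len(failures)} descriptor family failure(s); see result['failures']",
            RuntimeWarning,
            stacklevel=2,
        )
    return {"outputs": outputs, "failures": failures}


def row_sums(batch):
    return [sum(row) for row in batch]


def broken(batch):
    raise ValueError("bad input")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = run_families(
        [("ok", row_sums), ("broken", broken), ("also_broken", broken)],
        [[1, 2], [3, 4]],
        on_error="warn",
    )
print(len(caught), [f["family"] for f in result["failures"]])
```

Two families fail here, yet only a single warning is emitted, and the successful output is still returned alongside the failure records.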

pool_channels(result: collections.abc.Mapping[str, Any], channel_groups: collections.abc.Mapping[str, collections.abc.Sequence[str]]) → dict[str, Any]

Pool sensor-level descriptor columns into grouped channel outputs.

Parameters:
  • result (mapping) – Standard descriptor result produced by extract().

  • channel_groups (mapping of str to sequence of str) – Channel groups used to replace sensor-level descriptor columns with grouped "chgrp-..." outputs.

Returns:

Descriptor result with grouped channel features and unchanged failures.

Return type:

dict[str, Any]

Raises:

ValueError – If the provided result is malformed or if any requested group cannot be formed from the sensor-level descriptor columns.
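To make the pooling idea concrete, here is a minimal standalone sketch. Several details are assumptions rather than documented behaviour: the sensor-level column pattern `<descriptor>_ch-<channel>`, the use of the mean as the pooling function, and the helper name `pool_columns`. Only the "chgrp-" prefix comes from the description above.

```python
from statistics import fmean


def pool_columns(rows, names, channel_groups):
    """Replace sensor-level columns with one mean column per channel group."""
    # Index sensor-level columns by (descriptor stem, channel).
    by_key = {}
    for idx, name in enumerate(names):
        stem, sep, channel = name.rpartition("_ch-")
        if sep:
            by_key[(stem, channel)] = idx
    stems = list(dict.fromkeys(stem for stem, _ in by_key))
    sensor_idx = set(by_key.values())

    # Columns that are not sensor-level pass through unchanged.
    keep = [i for i in range(len(names)) if i not in sensor_idx]
    out_names = [names[i] for i in keep]
    out_rows = [[row[i] for i in keep] for row in rows]

    for stem in stems:
        for group, channels in channel_groups.items():
            missing = [c for c in channels if (stem, c) not in by_key]
            if missing or not channels:
                raise ValueError(f"cannot form group {group!r} for {stem!r}")
            idxs = [by_key[(stem, c)] for c in channels]
            out_names.append(f"{stem}_chgrp-{group}")
            for out_row, row in zip(out_rows, rows):
                out_row.append(fmean([row[i] for i in idxs]))
    return out_rows, out_names


names = ["power_ch-Fz", "power_ch-Cz", "power_ch-Pz", "global_entropy"]
rows = [[1.0, 3.0, 5.0, 0.2], [2.0, 4.0, 6.0, 0.4]]
pooled_rows, pooled_names = pool_columns(
    rows, names, {"front": ["Fz", "Cz"], "back": ["Pz"]}
)
print(pooled_names)
print(pooled_rows)
```

A group that names a channel absent from the sensor-level columns raises ValueError in this sketch, mirroring the error condition documented above.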