coco_pipe.io.load
=================

.. py:module:: coco_pipe.io.load

.. autoapi-nested-parse::

   coco_pipe/io/load.py
   --------------------

   High-level data loading factory.

   Author: Hamza Abdelhedi


Functions
---------

.. autoapisummary::

   coco_pipe.io.load.load_data


Module Contents
---------------

.. py:function:: load_data(path: Union[str, pathlib.Path], mode: str = 'auto', target_col: Optional[str] = None, index_col: Optional[Union[str, int]] = None, sep: str = '\t', header: Optional[Union[int, List[int]]] = 0, sheet_name: Optional[Union[str, int]] = 0, columns_to_dims: Optional[List[str]] = None, col_sep: str = '_', meta_columns: Optional[List[str]] = None, clean: bool = False, clean_kwargs: Optional[Dict[str, Any]] = None, task: Optional[str] = None, session: Optional[Union[str, List[str]]] = None, datatype: str = 'eeg', suffix: Optional[str] = None, loading_mode: str = 'epochs', window_length: Optional[float] = None, stride: Optional[float] = None, subject_metadata_df: Optional[Any] = None, subject_key: Optional[str] = None, pattern: str = '*.pkl', dims: Tuple[str, Ellipsis] = ('obs', 'feature'), coords: Optional[Dict[str, Union[List, numpy.ndarray]]] = None, reader: Optional[Any] = None, id_fn: Optional[Any] = None, subjects: Optional[Union[str, List[str], int, List[int]]] = None, **kwargs) -> coco_pipe.io.structures.DataContainer

   Universal data loader factory.

   Dispatches to `BIDSDataset`, `TabularDataset`, or `EmbeddingDataset` based on `mode`.

   :param path: Path to data source (file or directory).
   :type path: str or Path
   :param mode: Type of data to load.

                - "auto": Infers type from file extension or directory structure.
                - "tabular": uses `TabularDataset` (CSV, TSV, Excel, TXT).
                - "bids": uses `BIDSDataset` (BIDS-compliant directories).
                - "embedding": uses `EmbeddingDataset` (NPY, PKL, H5, JSON).
   :type mode: {"auto", "tabular", "bids", "embedding"}, default="auto"

   **Tabular arguments (mode="tabular")**

   :param target_col: Name of the column to extract as target `y`. Removed from features `X`.
   :type target_col: str, optional
   :param index_col: Column to use as index (observation IDs).
   :type index_col: str or int, optional
   :param sep: Separator for text files (e.g. ',' for CSV).
   :type sep: str, default='\t'
   :param header: Row number(s) to use as column names.
   :type header: int or list of int, default=0
   :param sheet_name: Sheet name or index for Excel files.
   :type sheet_name: str or int, default=0
   :param columns_to_dims: If provided, attempts to reshape 2D feature columns into N-D dimensions.
                           Column names must follow the pattern `dim1_dim2_..._feature`.
   :type columns_to_dims: list of str, optional
   :param col_sep: Separator used in column names for reshaping.
   :type col_sep: str, default='_'
   :param meta_columns: Columns to extract as metadata coordinates instead of features.
   :type meta_columns: list of str, optional
   :param clean: Whether to perform automated cleaning (drop NaNs/Infs).
   :type clean: bool, default=False
   :param clean_kwargs: Arguments passed to `TabularDataset.clean`.
   :type clean_kwargs: dict, optional

   **BIDS arguments (mode="bids")**

   :param task: BIDS task name (e.g., 'rest', 'audiovisual').
   :type task: str, optional
   :param session: Session ID(s) to load. Defaults to all available.
   :type session: str or List[str], optional
   :param datatype: Data type folder (e.g., 'eeg', 'meg', 'ieeg').
   :type datatype: str, default='eeg'
   :param suffix: File suffix to load (e.g., 'eeg', 'epo', 'ave').
   :type suffix: str, optional
   :param loading_mode: How to process the data. Passed as `mode` to `BIDSDataset`.

                        - 'epochs': Splits continuous data into fixed-length windows.
                        - 'continuous': Loads as single continuous segments.
                        - 'load_existing': Loads pre-computed epochs.
   :type loading_mode: str, default='epochs'
   :param window_length: Window length in seconds (for 'epochs' mode).
   :type window_length: float, optional
   :param stride: Stride in seconds (for 'epochs' mode).
   :type stride: float, optional
   :param subject_metadata_df: External subject-level metadata to merge by subject during BIDS loading.
   :type subject_metadata_df: DataFrame, optional
   :param subject_key: Column in `subject_metadata_df` containing the BIDS subject identifier.
   :type subject_key: str, optional
   :param subjects: Specific subject IDs to load (without 'sub-').
   :type subjects: str or List[str], optional

   **Embedding arguments (mode="embedding")**

   :param pattern: Glob pattern to match files.
   :type pattern: str, default='*.pkl'
   :param dims: Dimension labels for the data arrays.
   :type dims: tuple of str, default=('obs', 'feature')
   :param coords: Dictionary of coordinates for dimensions.
   :type coords: dict, optional
   :param reader: Custom file reader function.
   :type reader: callable, optional
   :param id_fn: Custom subject ID extraction function.
   :type id_fn: callable, optional
   :param subjects: If int, loads first N subjects. If list, filters by ID.
   :type subjects: int or list, optional

   :returns: Standardized data container with attributes:

             - X: (N_obs, ...) data array
             - y: Targets (if available)
             - ids: Observation identifiers
             - coords: Coordinate metadata
   :rtype: DataContainer
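
The ``mode="auto"`` description says the loader infers the data type from the file extension or directory structure, but it does not spell out the rules. The sketch below is one plausible reading and is not taken from the ``coco_pipe`` source: ``infer_mode`` is a hypothetical helper, the exact suffix spellings are assumptions derived from the formats listed for each mode, and the directory checks (a ``dataset_description.json`` file or ``sub-*`` folders for BIDS, with any other directory treated as a folder of embedding files) are assumptions as well.

```python
from pathlib import Path

# Extension groups follow the formats listed for each mode; the concrete
# suffix spellings (".xlsx", ".h5", ...) are assumptions for this sketch.
TABULAR_SUFFIXES = {".csv", ".tsv", ".txt", ".xls", ".xlsx"}
EMBEDDING_SUFFIXES = {".npy", ".pkl", ".h5", ".json"}


def infer_mode(path):
    """Hypothetical helper: guess a load_data mode from a path."""
    p = Path(path)

    if p.is_dir():
        # Assumption: a BIDS root is recognized by dataset_description.json
        # or by at least one "sub-*" subdirectory.
        has_description = (p / "dataset_description.json").is_file()
        has_subjects = any(child.is_dir() for child in p.glob("sub-*"))
        if has_description or has_subjects:
            return "bids"
        # Assumption: any other directory holds embedding files to be
        # matched by a glob pattern such as "*.pkl".
        return "embedding"

    suffix = p.suffix.lower()  # case-insensitive match on the extension
    if suffix in TABULAR_SUFFIXES:
        return "tabular"
    if suffix in EMBEDDING_SUFFIXES:
        return "embedding"

    # Unknown extension: fail and ask the caller for an explicit mode.
    raise ValueError(f"Cannot infer mode for {p}; pass mode explicitly.")
```

Under these assumptions, a path such as ``features.csv`` would resolve to ``"tabular"`` and a directory containing ``sub-01/`` would resolve to ``"bids"``. Passing ``mode`` explicitly sidesteps the guesswork whenever a path is ambiguous.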