coco_pipe.fm

Foundation model pipelines for CoCo Pipe.

Submodules

Classes

CBRAModRegressionPipeline

Regression pipeline that routes features through the CBRAMod foundation model before fitting a lightweight regressor.

FoundationRegressor

Thin sklearn-compatible wrapper around a foundation model embedder.

Package Contents

class coco_pipe.fm.CBRAModRegressionPipeline(X: pandas.DataFrame | numpy.ndarray, y: pandas.Series | numpy.ndarray, embed_fn: Callable[[Any], numpy.ndarray], metrics: str | Sequence[str] | None = None, base_regressor: sklearn.base.BaseEstimator | None = None, hp_search_params: Dict[str, Sequence[Any]] | None = None, use_scaler: bool = False, random_state: int = 42, n_jobs: int = -1, cv_kwargs: Dict[str, Any] | None = None, groups: pandas.Series | numpy.ndarray | None = None, verbose: bool = False)

Bases: coco_pipe.ml.base.BasePipeline

Regression pipeline that routes features through the CBRAMod foundation model before fitting a lightweight regressor.

The interface mirrors RegressionPipeline for analysis_type dispatch but exposes a required embed_fn to obtain embeddings from the foundation model.
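To make the embed_fn contract concrete, here is a minimal sketch of a callable that maps a feature matrix to a 2D array of embeddings. It does not load CBRAMod; a fixed random projection stands in for the foundation model, and the names make_embed_fn, embed_dim, and seed are hypothetical, introduced only for illustration.

```python
import numpy as np


def make_embed_fn(n_features, embed_dim=8, seed=0):
    """Build a stand-in embed_fn (hypothetical helper, not part of coco_pipe).

    A real embed_fn would call the foundation model; here a fixed random
    projection followed by tanh plays that role.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_features, embed_dim))

    def embed_fn(X):
        # np.asarray accepts both pandas.DataFrame and numpy.ndarray inputs.
        X = np.asarray(X, dtype=float)
        # Output shape is (n_samples, embed_dim), i.e. a 2D embedding array.
        return np.tanh(X @ W)

    return embed_fn


X = np.random.default_rng(1).standard_normal((20, 5))
embed_fn = make_embed_fn(n_features=5)
Z = embed_fn(X)
```

Any callable with this input and output shape should be usable wherever the signature above asks for embed_fn, assuming the pipeline only relies on the documented Callable[[Any], numpy.ndarray] type.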

embed_fn
verbose = False
model_name = 'CBRAMod'
run(analysis_type: str = 'baseline', n_features: int | None = None, direction: str = 'forward', search_type: str = 'grid', n_iter: int = 50, scoring: str | None = None) → Dict[str, Any]

class coco_pipe.fm.FoundationRegressor(embed_fn: Callable[[Any], numpy.ndarray], base_regressor: sklearn.base.BaseEstimator | None = None, multioutput: bool = False)

Bases: sklearn.base.BaseEstimator, sklearn.base.RegressorMixin

Thin sklearn-compatible wrapper around a foundation model embedder.

Parameters:
  • embed_fn (callable) – Callable that maps X to a 2D numpy array of embeddings. An object exposing either __call__ or a transform method is accepted.

  • base_regressor (BaseEstimator, optional) – Downstream regressor trained on the embeddings. Defaults to Ridge.

  • multioutput (bool, optional) – If True, wraps the base regressor in MultiOutputRegressor for multi-target regression.
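The parameters above suggest a simple embed-then-regress pattern. The following is a rough sketch of that pattern under stated assumptions, not the library's actual implementation: the class name SketchFoundationRegressor and the fitted attribute regressor_ are hypothetical, and the real class may validate inputs or resolve embed_fn differently.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor


class SketchFoundationRegressor(RegressorMixin, BaseEstimator):
    """Illustrative stand-in for the documented wrapper (hypothetical)."""

    def __init__(self, embed_fn, base_regressor=None, multioutput=False):
        self.embed_fn = embed_fn
        self.base_regressor = base_regressor
        self.multioutput = multioutput

    def fit(self, X, y):
        # Embed first, then train the downstream regressor on the embeddings.
        Z = np.asarray(self.embed_fn(X))
        # Fall back to Ridge when no base regressor is given, as documented.
        reg = Ridge() if self.base_regressor is None else clone(self.base_regressor)
        if self.multioutput:
            # One copy of the base regressor is fitted per target column.
            reg = MultiOutputRegressor(reg)
        self.regressor_ = reg.fit(Z, y)
        return self

    def predict(self, X):
        return self.regressor_.predict(np.asarray(self.embed_fn(X)))


rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))
Y = rng.standard_normal((30, 2))  # two regression targets

model = SketchFoundationRegressor(embed_fn=np.tanh, multioutput=True).fit(X, Y)
pred = model.predict(X)
```

Here np.tanh is only a placeholder embedder that happens to return a 2D array of the same shape as its input.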

embed_fn
base_regressor
multioutput = False
_embed(X: Any) → numpy.ndarray
fit(X: Any, y: Any)
predict(X: Any) → numpy.ndarray
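The parameter description says that either __call__ or a transform method of embed_fn will be used. One way such a dispatch could look is sketched below. The order of the checks (callable first, then transform) and the 2D validation are assumptions, and apply_embedder and MeanStdEmbedder are hypothetical names, not part of coco_pipe.

```python
import numpy as np


def apply_embedder(embed_fn, X):
    """Run an embedder that is either callable or exposes transform()."""
    if callable(embed_fn):
        Z = embed_fn(X)
    elif hasattr(embed_fn, "transform"):
        Z = embed_fn.transform(X)
    else:
        raise TypeError("embed_fn must be callable or provide a transform method")
    Z = np.asarray(Z)
    if Z.ndim != 2:
        # The documented contract is a 2D array of embeddings.
        raise ValueError(f"expected 2D embeddings, got {Z.ndim}D")
    return Z


class MeanStdEmbedder:
    """Toy transformer-style embedder: per-row mean and standard deviation."""

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        return np.column_stack([X.mean(axis=1), X.std(axis=1)])


X = np.arange(12, dtype=float).reshape(3, 4)
Z_call = apply_embedder(lambda A: np.asarray(A)[:, :2], X)  # plain callable
Z_tr = apply_embedder(MeanStdEmbedder(), X)                 # transform-style object
```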