Usage

To use SpectoPrep in a project:

import spectoprep

The PipelineOptimizer class has the following signature:

spectoprep.PipelineOptimizer(
    X_train: ndarray[tuple[int, ...], dtype[_ScalarType_co]],
    y_train: ndarray[tuple[int, ...], dtype[_ScalarType_co]],
    preprocessing_steps: List[str] | None = None,
    X_test: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None,
    y_test: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None,
    cv_method: str = 'group_shuffle_split',
    n_splits: int = 3,
    test_size: float = 0.3,
    n_groups_out: int = 2,
    random_state: int = 42,
    groups: ndarray[tuple[int, ...], dtype[_ScalarType_co]] | None = None,
    max_pipeline_length: int = 5,
    n_jobs: int = -1,
    allowed_preprocess_combinations: int | List[int] | Tuple[int, ...] | None = [1, 2],
    log_level: str = 'INFO',
)

A class for optimizing machine learning pipelines using Bayesian optimization. It precomputes possible pipeline configurations and then searches over both the pipeline configuration (encoded as an index) and the hyperparameters.
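
The idea of precomputing configurations and addressing them by index can be illustrated with a minimal sketch. This is not SpectoPrep's implementation: the helper `enumerate_pipelines`, the placeholder step names, and the reading of `allowed_preprocess_combinations=[1, 2]` as "pipelines of length 1 or 2, where order matters" are all assumptions made for illustration.

```python
from itertools import permutations


def enumerate_pipelines(steps, allowed_lengths=(1, 2)):
    """Return every ordered arrangement of `steps` whose length is allowed.

    Hypothetical helper: it only illustrates how a list of candidate
    pipelines could be built up front, before any optimization starts.
    """
    configs = []
    for length in allowed_lengths:
        # Order matters in a preprocessing pipeline, so use permutations.
        configs.extend(permutations(steps, length))
    return configs


# Placeholder step names, not SpectoPrep's real identifiers.
steps = ["step_a", "step_b", "step_c"]
configs = enumerate_pipelines(steps)

# A search procedure can now treat "which pipeline" as a single integer
# dimension alongside the numeric hyperparameters.
index = 4
print(len(configs), configs[index])
```

Encoding the pipeline choice as one integer keeps the search space flat: the optimizer proposes an index plus hyperparameter values, and the index is looked up in the precomputed list.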