tempor.automl.seeker module

Module containing the interface for hyperparameter seekers, and the implemented seekers.

tempor.automl.seeker.TunerType

Hyperparameter tuner to use.

Available options:

alias of Literal[bayesian, random, cmaes, qmc, grid]

tempor.automl.seeker.TUNER_OPTUNA_SAMPLER_MAP : dict[Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid], Any] = {'bayesian': <class 'optuna.samplers._tpe.sampler.TPESampler'>, 'cmaes': <class 'optuna.samplers._cmaes.CmaEsSampler'>, 'grid': <class 'optuna.samplers._grid.GridSampler'>, 'qmc': <class 'optuna.samplers._qmc.QMCSampler'>, 'random': <class 'optuna.samplers._random.RandomSampler'>}

A map from TunerType to the corresponding optuna sampler class
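As a rough illustration of how a map like this can be used, the sketch below validates a tuner type against the allowed literals and looks up the matching sampler class. The sampler classes here are empty placeholders standing in for the optuna classes listed above, and `SAMPLER_MAP` and `resolve_sampler` are hypothetical names, not part of tempor.

```python
from typing import Any, Dict, Literal, get_args

TunerType = Literal["bayesian", "random", "cmaes", "qmc", "grid"]


# Placeholder classes (assumption): stand-ins for the optuna sampler classes.
class TPESampler: ...
class RandomSampler: ...
class CmaEsSampler: ...
class QMCSampler: ...
class GridSampler: ...


# Hypothetical mirror of the tuner-type-to-sampler map.
SAMPLER_MAP: Dict[str, Any] = {
    "bayesian": TPESampler,
    "random": RandomSampler,
    "cmaes": CmaEsSampler,
    "qmc": QMCSampler,
    "grid": GridSampler,
}


def resolve_sampler(tuner_type: str) -> Any:
    """Hypothetical helper: return the sampler class for a tuner type."""
    if tuner_type not in get_args(TunerType):
        raise ValueError(f"Unknown tuner type: {tuner_type!r}")
    return SAMPLER_MAP[tuner_type]
```

Checking against `get_args(TunerType)` keeps the literal alias and the map keys in one place, so an unsupported name fails early with a clear error.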

tempor.automl.seeker.evaluation_callback_dispatch(estimator: type[BasePredictor], dataset: PredictiveDataset, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], metric: str, n_cv_folds: int, random_state: int, horizon: list[float] | list[int] | list[Timestamp] | None, raise_exceptions: bool, silence_warnings: bool, *args: Any, **kwargs: Any) -> float

Perform evaluation of estimator (of task type task_type) on dataset, using the appropriate evaluation function from the tempor.benchmarks.evaluation module.

Parameters:
estimator : Type[BasePredictor]

The predictor estimator class to use.

dataset : PredictiveDataset

The dataset to use.

task_type : PredictiveTaskType

The task type of the predictor.

metric : str

The metric to be used for evaluation.

n_cv_folds : int

Number of cross-validation folds to use.

random_state : int

Random state used for data splitting.

horizon : Optional[data_typing.TimeIndex]

The prediction horizon. Applicable to the “time_to_event” task case.

raise_exceptions : bool

If True, any exception raised during evaluation is re-raised and execution terminates. Otherwise, the exception is ignored and a dummy value is returned.

silence_warnings : bool, optional

Whether to silence warnings raised. Defaults to False.

*args : Any

Positional arguments to pass to the estimator constructor.

**kwargs : Any

Keyword arguments to pass to the estimator constructor.

Returns:

The mean evaluation metric across the cross-validation folds.

Return type:

float
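The return value and the raise_exceptions behaviour described above can be sketched as follows. This is a simplified stand-in, not the tempor implementation: `evaluate_fold` is a hypothetical callable representing "fit and score on one fold", and the NaN dummy value is an assumption, since the actual dummy value is not stated here.

```python
from statistics import mean
from typing import Callable

# Assumption: the real dummy value is not documented in this section.
DUMMY_SCORE = float("nan")


def cross_validated_score(
    evaluate_fold: Callable[[int], float],
    n_cv_folds: int,
    raise_exceptions: bool,
) -> float:
    """Hypothetical sketch: mean metric over folds, with optional error suppression."""
    try:
        scores = [evaluate_fold(fold) for fold in range(n_cv_folds)]
        return mean(scores)
    except Exception:
        if raise_exceptions:
            # Propagate the original error and stop the run.
            raise
        # Swallow the error and report a placeholder score instead.
        return DUMMY_SCORE
```

Returning a placeholder rather than raising lets a long search continue past a single failing trial, at the cost of hiding the failure unless it is logged elsewhere.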

class tempor.automl.seeker.BaseSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], estimator_defs: list[Any], metric: str, dataset: PredictiveDataset, *, return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)

Bases: ABC

The base class for an AutoML Seeker, to be derived from by concrete implementations. Provides an AutoML interface, in particular, the search method.

Parameters:
study_name : str

The name of the AutoML study (that is, the set of all individual AutoML trials).

task_type : PredictiveTaskType

The task type of the predictor estimators to be searched.

estimator_names : List[str]

Friendly names of estimators. Will be passed one-by-one to _init_estimator method calls.

estimator_defs : List[Any]

Definition of estimators. Will be passed one-by-one to _init_estimator method calls.

metric : str

The metric to use for evaluation.

dataset : PredictiveDataset

The dataset to use for evaluation.

return_top_k : int, optional

How many best estimators to return. Defaults to 3.

num_cv_folds : int, optional

How many cross-validation folds to use. Defaults to 5.

num_iter : int, optional

Number of AutoML iterations. Defaults to 100.

tuner_patience : int, optional

Patience of the AutoML tuner (for early-stopping). Defaults to 5.

tuner_type : TunerType, optional

The type of AutoML tuner to use. Defaults to "bayesian".

timeout : int, optional

Timeout for the AutoML optimization run, in seconds. Defaults to 360.

random_state : int, optional

Random state to use. Defaults to 0.

override_hp_space : Optional[Dict[str, List[Params]]], optional

A dictionary mapping each name in estimator_names to its hyperparameter space override, e.g. {"my_estimator_A": [IntegerParams("some_param", low=1, high=100), ...], "my_estimator_B": ....}. Defaults to None.

horizon : Optional[data_typing.TimeIndex], optional

The prediction horizon for evaluation. Applicable to the “time_to_event” task case. Defaults to None.

compute_baseline_score : bool, optional

(If supported by the Seeker implementation.) Whether to run a baseline trial and compute its score. A baseline trial is a trial with all the default parameters. Defaults to False.

grid : Optional[Dict[str, Dict[str, Any]]], optional

(If supported by the Seeker implementation; only relevant to the "grid" tuner type.) The grid for the grid search tuner. Keys are estimator_names, values are dictionaries of the form param_name: str -> List[param_value_candidate: Any]. Defaults to None.

custom_tuner : Optional[BaseTuner], optional

A custom tuner that overrides the default AutoML tuner for tuner_type. Defaults to None.

raise_exceptions : bool, optional

If True, any exception raised during the AutoML study run is re-raised and execution terminates. Otherwise, the exception is ignored. Defaults to True.

silence_warnings : bool, optional

Whether to silence warnings raised. Some dependencies (e.g. xgbse) may circumvent this and raise warnings regardless. Defaults to False.

**kwargs : Any

Currently unused.

Raises:

ValueError – If incompatible / invalid input arguments have been passed.
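To make the shape of the grid argument concrete, the sketch below builds a nested dictionary of the documented form and expands one estimator's grid into individual parameter combinations. The estimator and parameter names are hypothetical, and the Cartesian-product expansion shows how grid search generally works rather than how tempor or optuna enumerates points internally.

```python
from itertools import product
from typing import Any, Dict, Iterator, List

# Illustrative grid: estimator name -> (parameter name -> candidate values).
# All names and values here are hypothetical.
grid: Dict[str, Dict[str, List[Any]]] = {
    "my_estimator_A": {"n_layers": [1, 2], "lr": [0.01, 0.001]},
    "my_estimator_B": {"depth": [3, 5, 7]},
}


def iter_grid_points(param_grid: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every combination of candidate values for one estimator."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# Number of grid points per estimator (product of the candidate list lengths).
points_per_estimator = {
    name: len(list(iter_grid_points(params))) for name, params in grid.items()
}
```

Because the number of points is the product of the candidate list lengths, it is worth checking it against num_iter and timeout before starting a grid search.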

search() -> tuple[list[BasePredictor], list[float]]

Perform AutoML search.

Returns:

(best_estimators, best_scores): the best estimators and their corresponding scores.

Return type:

Tuple[List[BasePredictor], List[float]]
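The return value is a pair of parallel lists, so the i-th score belongs to the i-th estimator. The sketch below shows one way such a top-k selection could be produced. `select_top_k` is a hypothetical helper, and treating higher scores as better is an assumption: the right direction depends on the metric, and how tempor ranks trials is not stated here.

```python
from typing import Any, List, Sequence, Tuple


def select_top_k(
    estimators: Sequence[Any],
    scores: Sequence[float],
    k: int,
    higher_is_better: bool = True,
) -> Tuple[List[Any], List[float]]:
    """Hypothetical helper: keep the k best (estimator, score) pairs, best first."""
    # Rank indices by score, then keep the first k.
    order = sorted(
        range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better
    )[:k]
    return [estimators[i] for i in order], [scores[i] for i in order]


best_estimators, best_scores = select_top_k(
    ["model_a", "model_b", "model_c", "model_d"], [0.6, 0.9, 0.7, 0.8], k=3
)
```

Iterating with `zip(best_estimators, best_scores)` is a convenient way to report each returned estimator next to its score.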

class tempor.automl.seeker.MethodSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], metric: str, dataset: PredictiveDataset, *, return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)

Bases: BaseSeeker

An AutoML seeker which will search the hyperparameter space of each of the predictor estimators defined in estimator_names for the task_type task setting.

Parameters:
study_name : str

See BaseSeeker.

task_type : PredictiveTaskType

See BaseSeeker.

estimator_names : List[str]

The candidate predictors. Provide plugin names (without category qualification), e.g. ["nn_classifier", "cde_classifier"].

metric : str

See BaseSeeker.

dataset : PredictiveDataset

See BaseSeeker.

return_top_k : int, optional

See BaseSeeker.

num_cv_folds : int, optional

See BaseSeeker.

num_iter : int, optional

See BaseSeeker.

tuner_patience : int, optional

See BaseSeeker.

tuner_type : TunerType, optional

See BaseSeeker.

timeout : int, optional

See BaseSeeker.

random_state : int, optional

See BaseSeeker.

override_hp_space : Optional[Dict[str, List[Params]]], optional

See BaseSeeker.

horizon : Optional[data_typing.TimeIndex], optional

See BaseSeeker.

compute_baseline_score : bool, optional

See BaseSeeker.

grid : Optional[Dict[str, Dict[str, Any]]], optional

See BaseSeeker.

custom_tuner : Optional[BaseTuner], optional

See BaseSeeker.

raise_exceptions : bool, optional

See BaseSeeker.

silence_warnings : bool, optional

See BaseSeeker.

**kwargs : Any

See BaseSeeker.
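A minimal usage sketch, built only from the constructor signature and the search method documented on this page. The argument values are illustrative, the metric name "aucroc" is an assumption about what is valid for this task type, and `run_method_search` is a hypothetical wrapper that expects the caller to supply an already loaded PredictiveDataset.

```python
from typing import Any, Dict, List, Tuple

# Keyword arguments taken from the MethodSeeker signature; values are illustrative.
SEARCH_KWARGS: Dict[str, Any] = {
    "study_name": "my_automl_study",
    "task_type": "prediction.one_off.classification",
    "estimator_names": ["nn_classifier", "cde_classifier"],
    "metric": "aucroc",  # assumption: a metric name valid for this task type
    "return_top_k": 2,
    "num_cv_folds": 3,
    "num_iter": 10,
    "tuner_type": "random",
    "timeout": 120,
}


def run_method_search(dataset: Any) -> Tuple[List[Any], List[float]]:
    """Hypothetical wrapper: run a MethodSeeker search on the given dataset."""
    # Imported lazily so this snippet can be loaded without tempor installed.
    from tempor.automl.seeker import MethodSeeker

    seeker = MethodSeeker(dataset=dataset, **SEARCH_KWARGS)
    return seeker.search()
```

Small values for num_iter, num_cv_folds and timeout keep a first trial run short; the defaults listed under BaseSeeker are considerably larger.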

class tempor.automl.seeker.PipelineSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], metric: str, dataset: PredictiveDataset, *, static_imputers: list[str] = ['static_tabular_imputer'], static_scalers: list[str] = ['static_standard_scaler', 'static_minmax_scaler'], temporal_imputers: list[str] = ['bfill', 'ts_tabular_imputer', 'ffill'], temporal_scalers: list[str] = ['ts_minmax_scaler', 'ts_standard_scaler'], return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)

Bases: BaseSeeker

An AutoML seeker which will sample pipelines comprised of:
  • A static imputer (if at least one candidate in static_imputers is provided)

  • A static scaler (if at least one candidate in static_scalers is provided)

  • A temporal imputer (if at least one candidate in temporal_imputers is provided)

  • A temporal scaler (if at least one candidate in temporal_scalers is provided)

  • The final predictor, from the estimator_names options.

The imputer/scaler candidate for each step will be sampled as a categorical hyperparameter. The hyperparameter spaces of the chosen imputers/scalers, and of the final predictor, will also be sampled.
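The pipeline structure described above can be sketched as a categorical choice per stage, with empty stages skipped. This uses uniform random choice as a stand-in for the tuner's sampler and only picks step names; it does not sample each step's own hyperparameters. `sample_pipeline` and the stage dictionary are hypothetical, not tempor internals.

```python
import random
from typing import Dict, List


def sample_pipeline(
    stage_candidates: Dict[str, List[str]],
    predictors: List[str],
    rng: random.Random,
) -> List[str]:
    """Hypothetical sketch: pick one candidate per non-empty stage, then a predictor."""
    steps: List[str] = []
    for candidates in stage_candidates.values():
        if candidates:  # a stage with no candidates is left out of the pipeline
            steps.append(rng.choice(candidates))
    steps.append(rng.choice(predictors))  # the predictor is always the last step
    return steps


stages = {
    "static_imputers": ["static_tabular_imputer"],
    "static_scalers": [],  # empty: no static scaler step will be added
    "temporal_imputers": ["bfill", "ts_tabular_imputer", "ffill"],
    "temporal_scalers": ["ts_minmax_scaler", "ts_standard_scaler"],
}
pipeline = sample_pipeline(stages, ["nn_classifier", "cde_classifier"], random.Random(0))
```

Passing an empty list for a stage is therefore a way to exclude that preprocessing step from every sampled pipeline.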

Note

  • compute_baseline_score=True is not supported, as the pipeline is dynamic and there is no defined baseline case.

  • tuner_type="grid" is not currently supported.

Parameters:
study_name : str

See BaseSeeker.

task_type : PredictiveTaskType

See BaseSeeker.

estimator_names : List[str]

The candidate predictors that will be the last step of the pipeline. Provide plugin names (without category qualification), e.g. ["nn_classifier", "cde_classifier"].

metric : str

See BaseSeeker.

dataset : PredictiveDataset

See BaseSeeker.

static_imputers : List[str], optional

A list of candidate static imputers. Defaults to DEFAULT_STATIC_IMPUTERS.

static_scalers : List[str], optional

A list of candidate static scalers. Defaults to DEFAULT_STATIC_SCALERS.

temporal_imputers : List[str], optional

A list of candidate temporal imputers. Defaults to DEFAULT_TEMPORAL_IMPUTERS.

temporal_scalers : List[str], optional

A list of candidate temporal scalers. Defaults to DEFAULT_TEMPORAL_SCALERS.

return_top_k : int, optional

See BaseSeeker.

num_cv_folds : int, optional

See BaseSeeker.

num_iter : int, optional

See BaseSeeker.

tuner_patience : int, optional

See BaseSeeker.

tuner_type : TunerType, optional

See BaseSeeker.

timeout : int, optional

See BaseSeeker.

random_state : int, optional

See BaseSeeker.

override_hp_space : Optional[Dict[str, List[Params]]], optional

See BaseSeeker. Note that currently the hyperparameter space override in this case can only be specified for the last pipeline step (the predictive estimator), not the preceding data transformer steps. The default hyperparameter space will always be used for those.

horizon : Optional[data_typing.TimeIndex], optional

See BaseSeeker.

compute_baseline_score : bool, optional

See BaseSeeker.

grid : Optional[Dict[str, Dict[str, Any]]], optional

See BaseSeeker.

custom_tuner : Optional[BaseTuner], optional

See BaseSeeker.

raise_exceptions : bool, optional

See BaseSeeker.

silence_warnings : bool, optional

See BaseSeeker.

**kwargs : Any

See BaseSeeker.