tempor.automl.seeker module¶

Module containing the interface for, and the implementations of, the hyperparameter seekers.

tempor.automl.seeker.TunerType¶

Hyperparameter tuner to use.

Available options:

"bayesian": Use a tuner based on optuna.samplers.TPESampler.
"random": Use a tuner based on optuna.samplers.RandomSampler.
"cmaes": Use a tuner based on optuna.samplers.CmaEsSampler.
"qmc": Use a tuner based on optuna.samplers.QMCSampler.
"grid": Use a tuner based on optuna.samplers.GridSampler.

alias of Literal[bayesian, random, cmaes, qmc, grid]

tempor.automl.seeker.TUNER_OPTUNA_SAMPLER_MAP : dict[Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid], Any] = {'bayesian': <class 'optuna.samplers._tpe.sampler.TPESampler'>, 'cmaes': <class 'optuna.samplers._cmaes.CmaEsSampler'>, 'grid': <class 'optuna.samplers._grid.GridSampler'>, 'qmc': <class 'optuna.samplers._qmc.QMCSampler'>, 'random': <class 'optuna.samplers._random.RandomSampler'>}¶: A map from TunerType to the corresponding optuna sampler class
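
The lookup that this map supports can be sketched without importing optuna. The snippet below is an illustration only, not the module's code: SAMPLER_NAMES and resolve_sampler_name are hypothetical names, and the map stores the public sampler class paths as strings instead of the classes themselves.

```python
from typing import Dict, Literal, get_args

# Mirrors the TunerType alias documented above.
TunerType = Literal["bayesian", "random", "cmaes", "qmc", "grid"]

# Hypothetical stand-in for TUNER_OPTUNA_SAMPLER_MAP: class paths as strings,
# so that the example runs without optuna installed.
SAMPLER_NAMES: Dict[str, str] = {
    "bayesian": "optuna.samplers.TPESampler",
    "random": "optuna.samplers.RandomSampler",
    "cmaes": "optuna.samplers.CmaEsSampler",
    "qmc": "optuna.samplers.QMCSampler",
    "grid": "optuna.samplers.GridSampler",
}


def resolve_sampler_name(tuner_type: str) -> str:
    """Return the sampler path for a tuner type, rejecting unknown values."""
    valid = get_args(TunerType)
    if tuner_type not in valid:
        raise ValueError(f"Unknown tuner type {tuner_type!r}; expected one of {valid}")
    return SAMPLER_NAMES[tuner_type]


print(resolve_sampler_name("bayesian"))
```

Validating against the Literal values before the lookup gives a clearer error than a bare KeyError when an unsupported tuner type is passed.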

tempor.automl.seeker.evaluation_callback_dispatch(estimator: type[BasePredictor], dataset: PredictiveDataset, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], metric: str, n_cv_folds: int, random_state: int, horizon: list[float] | list[int] | list[Timestamp] | None, raise_exceptions: bool, silence_warnings: bool, *args: Any, **kwargs: Any) → float[source]¶

Perform evaluation of estimator (of task type task_type) on dataset, using the appropriate evaluation function from the tempor.benchmarks.evaluation module.

Parameters:¶

estimator : Type[BasePredictor]¶: The predictor estimator class to use.
dataset : PredictiveDataset¶: The dataset to use.
task_type : PredictiveTaskType¶: The task type of the predictor.
metric : str¶: The metric to be used for evaluation.
n_cv_folds : int¶: Number of cross-validation folds to use.
random_state : int¶: Random state used for data splitting.
horizon : Optional[data_typing.TimeIndex]¶: The prediction horizon. Applicable to the “time_to_event” task case.
raise_exceptions : bool¶: If True, any exception raised during evaluation is re-raised and execution terminates. If False, the exception is ignored and a dummy value is returned.
silence_warnings : bool¶: Whether to silence warnings raised.
*args : Any: Positional arguments to pass to the estimator constructor.
**kwargs : Any: Keyword arguments to pass to the estimator constructor.

Returns:¶

The mean evaluation metric across the cross-validation folds.

Return type:¶

float
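
The dispatch-and-average behaviour described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the function's implementation: the evaluator functions, the EVALUATORS table, and the NaN dummy score are all hypothetical stand-ins, and no real cross-validation is performed.

```python
import math
from statistics import mean
from typing import Callable, Dict, List


# Hypothetical per-task evaluators: each returns one score per CV fold.
def evaluate_classifier(n_cv_folds: int) -> List[float]:
    return [0.80 + 0.01 * fold for fold in range(n_cv_folds)]


def evaluate_time_to_event(n_cv_folds: int) -> List[float]:
    raise RuntimeError("simulated failure during evaluation")


# Hypothetical dispatch table keyed by task type.
EVALUATORS: Dict[str, Callable[[int], List[float]]] = {
    "prediction.one_off.classification": evaluate_classifier,
    "time_to_event": evaluate_time_to_event,
}

# Assumed placeholder; the actual dummy value is not documented here.
DUMMY_SCORE = math.nan


def dispatch(task_type: str, n_cv_folds: int, raise_exceptions: bool) -> float:
    """Pick the evaluator for task_type and return the mean fold score."""
    if task_type not in EVALUATORS:
        raise ValueError(f"Unsupported task type: {task_type!r}")
    try:
        return mean(EVALUATORS[task_type](n_cv_folds))
    except Exception:
        if raise_exceptions:
            raise  # propagate the failure and stop
        return DUMMY_SCORE  # swallow the failure and report a dummy score


print(dispatch("prediction.one_off.classification", 3, raise_exceptions=True))
print(dispatch("time_to_event", 3, raise_exceptions=False))
```

Returning a dummy score instead of raising lets a long search continue past a single failing trial, at the cost of hiding the underlying error.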

class tempor.automl.seeker.BaseSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], estimator_defs: list[Any], metric: str, dataset: PredictiveDataset, *, return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)[source]¶

Bases: ABC

The base class for an AutoML Seeker, to be derived from by concrete implementations. Provides an AutoML interface, in particular, the search method.

Parameters:¶

study_name : str¶: The name of the AutoML study (that is, the set of all individual AutoML trials).
task_type : PredictiveTaskType¶: The task type of the predictor estimators to be searched.
estimator_names : List[str]¶: Friendly names of estimators. Will be passed one-by-one to _init_estimator method calls.
estimator_defs : List[Any]¶: Definition of estimators. Will be passed one-by-one to _init_estimator method calls.
metric : str¶: The metric to use for evaluation.
dataset : PredictiveDataset¶: The dataset to use for evaluation.
return_top_k : int, optional¶: How many best estimators to return. Defaults to 3.
num_cv_folds : int, optional¶: How many cross-validation folds to use. Defaults to 5.
num_iter : int, optional¶: Number of AutoML iterations. Defaults to 100.
tuner_patience : int, optional¶: Patience of the AutoML tuner (for early-stopping). Defaults to 5.
tuner_type : TunerType, optional¶: The type of AutoML tuner to use. Defaults to "bayesian".
timeout : int, optional¶: Timeout for the AutoML optimization run, in seconds. Defaults to 360.
random_state : int, optional¶: Random state to use. Defaults to 0.
override_hp_space : Optional[Dict[str, List[Params]]], optional¶: A dictionary with estimator_names keys and the hyperparameter space overrides values, e.g. {"my_estimator_A": [IntegerParams("some_param", low=1, high=100), ...], "my_estimator_B": ....}. Defaults to None.
horizon : Optional[data_typing.TimeIndex], optional¶: The prediction horizon for evaluation. Applicable to the “time_to_event” task case. Defaults to None.
compute_baseline_score : bool, optional¶: (If supported by the Seeker implementation.) Whether to run a baseline trial and compute its score. A baseline trial is a trial with all the default parameters. Defaults to False.
grid : Optional[Dict[str, Dict[str, Any]]], optional¶: (If supported by the Seeker implementation; only relevant to "grid" tuner type) The grid for the grid search tuner type. Keys are estimator_names, values are (param_name: str -> List[param_value_candidate: Any]). Defaults to None.
custom_tuner : Optional[BaseTuner], optional¶: Pass a custom_tuner to override the default AutoML tuner for tuner_type. Defaults to None.
raise_exceptions : bool, optional¶: If True, any exception raised during the AutoML study run is re-raised and execution terminates. If False, the exception is ignored. Defaults to True.
silence_warnings : bool, optional¶: Whether to silence warnings raised. Some dependencies (e.g. xgbse) may circumvent this and raise warnings regardless. Defaults to False.
**kwargs : Any: Currently unused.

Raises:¶

ValueError – If incompatible / invalid input arguments have been passed.
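
The nested shape expected by the grid parameter can be illustrated with a small, self-contained example. The estimator name, the parameter names, and the iter_grid_points helper are made up for illustration; how a seeker actually consumes the grid is not shown here.

```python
from itertools import product
from typing import Any, Dict, Iterator, List

# Outer keys are estimator names; each inner dict maps a parameter name
# to its list of candidate values. All names here are hypothetical.
grid: Dict[str, Dict[str, List[Any]]] = {
    "my_estimator_A": {"n_layers": [1, 2], "lr": [0.01, 0.001]},
}


def iter_grid_points(param_grid: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every combination of candidate values as a parameter dict."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


points = list(iter_grid_points(grid["my_estimator_A"]))
print(len(points))  # 4 combinations: 2 values for n_layers x 2 values for lr
```

The number of grid points grows multiplicatively with each parameter, which is worth keeping in mind alongside num_iter and timeout.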

search() → tuple[list[BasePredictor], list[float]][source]¶

Perform AutoML search.

Returns:¶: (best_estimators, best_scores), the best estimators and their corresponding best scores.
Return type:¶: Tuple[List[BasePredictor], List[float]]
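
The shape of the returned pair, and the role of return_top_k, can be sketched with a small ranking helper. This is a hypothetical illustration: top_k is not part of the module, estimators are represented by name strings, and treating higher scores as better is an assumption that depends on the metric in use.

```python
from typing import List, Tuple


def top_k(
    names: List[str], scores: List[float], k: int, maximize: bool = True
) -> Tuple[List[str], List[float]]:
    """Return the k best entries as two aligned lists, best first."""
    if len(names) != len(scores):
        raise ValueError("names and scores must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=maximize)[:k]
    return [names[i] for i in order], [scores[i] for i in order]


best_estimators, best_scores = top_k(
    ["a", "b", "c", "d"], [0.71, 0.93, 0.88, 0.65], k=3
)
print(best_estimators)  # ['b', 'c', 'a']
print(best_scores)      # [0.93, 0.88, 0.71]
```

The two lists are aligned by position, so best_scores[i] belongs to best_estimators[i].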

class tempor.automl.seeker.MethodSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], metric: str, dataset: PredictiveDataset, *, return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)[source]¶

Bases: BaseSeeker

An AutoML seeker which will search the hyperparameter space of each of the predictor estimators defined in estimator_names for the task_type task setting.

Parameters:¶

study_name : str¶: See BaseSeeker.
task_type : PredictiveTaskType¶: See BaseSeeker.
estimator_names : List[str]¶: The candidate predictors. Provide plugin names (without category qualification), e.g. ["nn_classifier", "cde_classifier"].
metric : str¶: See BaseSeeker.
dataset : PredictiveDataset¶: See BaseSeeker.
return_top_k : int, optional¶: See BaseSeeker.
num_cv_folds : int, optional¶: See BaseSeeker.
num_iter : int, optional¶: See BaseSeeker.
tuner_patience : int, optional¶: See BaseSeeker.
tuner_type : TunerType, optional¶: See BaseSeeker.
timeout : int, optional¶: See BaseSeeker.
random_state : int, optional¶: See BaseSeeker.
override_hp_space : Optional[Dict[str, List[Params]]], optional¶: See BaseSeeker.
horizon : Optional[data_typing.TimeIndex], optional¶: See BaseSeeker.
compute_baseline_score : bool, optional¶: See BaseSeeker.
grid : Optional[Dict[str, Dict[str, Any]]], optional¶: See BaseSeeker.
custom_tuner : Optional[BaseTuner], optional¶: See BaseSeeker.
raise_exceptions : bool, optional¶: See BaseSeeker.
silence_warnings : bool, optional¶: See BaseSeeker.
**kwargs : Any: See BaseSeeker.

class tempor.automl.seeker.PipelineSeeker(study_name: str, task_type: Literal[prediction.one_off.classification] | Literal[prediction.one_off.regression] | Literal[prediction.temporal.classification] | Literal[prediction.temporal.regression] | Literal[time_to_event] | Literal[treatments.one_off.classification] | Literal[treatments.one_off.regression] | Literal[treatments.temporal.classification] | Literal[treatments.temporal.regression], estimator_names: list[str], metric: str, dataset: PredictiveDataset, *, static_imputers: list[str] = ['static_tabular_imputer'], static_scalers: list[str] = ['static_standard_scaler', 'static_minmax_scaler'], temporal_imputers: list[str] = ['bfill', 'ts_tabular_imputer', 'ffill'], temporal_scalers: list[str] = ['ts_minmax_scaler', 'ts_standard_scaler'], return_top_k: int = 3, num_cv_folds: int = 5, num_iter: int = 100, tuner_patience: int = 5, tuner_type: Literal[bayesian] | Literal[random] | Literal[cmaes] | Literal[qmc] | Literal[grid] = 'bayesian', timeout: int = 360, random_state: int = 0, override_hp_space: dict[str, list[Params]] | None = None, horizon: list[float] | list[int] | list[Timestamp] | None = None, compute_baseline_score: bool = False, grid: dict[str, dict[str, Any]] | None = None, custom_tuner: BaseTuner | None = None, raise_exceptions: bool = True, silence_warnings: bool = False, **kwargs: Any)[source]¶

Bases: BaseSeeker

An AutoML seeker which will sample pipelines comprised of:

A static imputer (if at least one candidate in static_imputers is provided)
A static scaler (if at least one candidate in static_scalers is provided)
A temporal imputer (if at least one candidate in temporal_imputers is provided)
A temporal scaler (if at least one candidate in temporal_scalers is provided)
The final predictor, from the estimator_names options.

The choice of imputer/scaler at each step is sampled as a categorical hyperparameter. The hyperparameter spaces of the chosen transformers, and of the final predictor, are sampled as well.
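
The step-selection part of this process can be sketched as below. This is a simplified illustration, not the seeker's implementation: sample_pipeline is a hypothetical helper, it draws uniformly with the standard library's random module in place of an optuna sampler, and it does not sample the hyperparameters of the chosen steps.

```python
import random
from typing import Dict, List

# Fixed order of the optional transformer stages.
STAGES = ("static_imputers", "static_scalers", "temporal_imputers", "temporal_scalers")


def sample_pipeline(
    rng: random.Random, candidates: Dict[str, List[str]], estimator_names: List[str]
) -> List[str]:
    """Pick one candidate per non-empty stage, then one final predictor."""
    steps: List[str] = []
    for stage in STAGES:
        options = candidates.get(stage, [])
        if options:  # a stage with no candidates is left out of the pipeline
            steps.append(rng.choice(options))
    steps.append(rng.choice(estimator_names))
    return steps


rng = random.Random(0)
candidates = {
    "static_imputers": ["static_tabular_imputer"],
    "static_scalers": [],  # empty: no static scaler step will be added
    "temporal_imputers": ["bfill", "ts_tabular_imputer", "ffill"],
    "temporal_scalers": ["ts_minmax_scaler", "ts_standard_scaler"],
}
pipeline = sample_pipeline(rng, candidates, ["nn_classifier", "cde_classifier"])
print(pipeline)
```

Passing an empty list for a stage is how that stage is dropped from every sampled pipeline in this sketch.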

Note

compute_baseline_score=True is not supported, as the pipeline is dynamic and there is no defined baseline case.
tuner_type="grid" is not currently supported.

Parameters:¶

study_name : str¶: See BaseSeeker.
task_type : PredictiveTaskType¶: See BaseSeeker.
estimator_names : List[str]¶: The candidate predictors that will be the last step of the pipeline. Provide plugin names (without category qualification), e.g. ["nn_classifier", "cde_classifier"].
metric : str¶: See BaseSeeker.
dataset : PredictiveDataset¶: See BaseSeeker.
static_imputers : List[str], optional¶: A list of candidate static imputers. Defaults to DEFAULT_STATIC_IMPUTERS.
static_scalers : List[str], optional¶: A list of candidate static scalers. Defaults to DEFAULT_STATIC_SCALERS.
temporal_imputers : List[str], optional¶: A list of candidate temporal imputers. Defaults to DEFAULT_TEMPORAL_IMPUTERS.
temporal_scalers : List[str], optional¶: A list of candidate temporal scalers. Defaults to DEFAULT_TEMPORAL_SCALERS.
return_top_k : int, optional¶: See BaseSeeker.
num_cv_folds : int, optional¶: See BaseSeeker.
num_iter : int, optional¶: See BaseSeeker.
tuner_patience : int, optional¶: See BaseSeeker.
tuner_type : TunerType, optional¶: See BaseSeeker.
timeout : int, optional¶: See BaseSeeker.
random_state : int, optional¶: See BaseSeeker.
override_hp_space : Optional[Dict[str, List[Params]]], optional¶: See BaseSeeker. Note that currently the hyperparameter space override in this case can only be specified for the last pipeline step (the predictive estimator), not the preceding data transformer steps. The default hyperparameter space will always be used for those.
horizon : Optional[data_typing.TimeIndex], optional¶: See BaseSeeker.
compute_baseline_score : bool, optional¶: See BaseSeeker.
grid : Optional[Dict[str, Dict[str, Any]]], optional¶: See BaseSeeker.
custom_tuner : Optional[BaseTuner], optional¶: See BaseSeeker.
raise_exceptions : bool, optional¶: See BaseSeeker.
silence_warnings : bool, optional¶: See BaseSeeker.
**kwargs : Any: See BaseSeeker.