tempor.data.dataset module

Module defining the TemporAI dataset concept in BaseDataset and its derived classes.

tempor.data.dataset.EXCEPTION_MESSAGES = _ExceptionMessages()

Reusable error messages for the module.

class tempor.data.dataset.BaseDataset(time_series: DataFrame | ndarray, *, static: DataFrame | ndarray | None = None, targets: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: ABC

Abstract base class representing a dataset used by TemporAI.

Initialize one of its derived classes (e.g. OneOffPredictionDataset, TimeToEventAnalysisDataset etc.) depending on the type of task.

See also tutorial tutorials/tutorial01_data_format.ipynb for examples of use.

Parameters:
time_series : data_typing.DataContainer

Data representing time series covariates of the samples. Will be initialized as TimeSeriesSamples.

static : Optional[data_typing.DataContainer], optional

Data representing static covariates of the samples. Will be initialized as StaticSamples. Defaults to None.

targets : Optional[data_typing.DataContainer], optional

Data representing target (outcome) feature(s) of the samples. Will be initialized as {TimeSeries,Static,Event}Samples depending on problem setting in the derived class. Defaults to None.

treatments : Optional[data_typing.DataContainer], optional

Data representing treatment (intervention) feature(s) of the samples. Will be initialized as {TimeSeries,Static,Event}Samples depending on problem setting in the derived class. Defaults to None.

**kwargs : Any

Additional keyword arguments to be passed to the derived class’s _init_predictive method.

predictive : PredictiveTaskData | None

property has_static : bool

A property returning whether the dataset has static data.

Returns:

Whether the dataset has static data.

Return type:

bool

property has_predictive_data : bool

A property returning whether the dataset has predictive data (targets or treatments).

Returns:

Whether the dataset has predictive data.

Return type:

bool

property predictive_task : PredictiveTask | None

A property returning the predictive task of the dataset (or None).

Returns:

The predictive task of the dataset.

Return type:

Union[data_typing.PredictiveTask, None]

validate() → None

Validate integrity of the dataset.

property time_series : TimeSeriesSamplesBase

The property containing the time series covariates of the dataset.

Returns:

The time series covariates of the dataset.

Return type:

samples.TimeSeriesSamplesBase

property static : StaticSamplesBase | None

The property containing the static covariates of the dataset.

Returns:

The static covariates of the dataset.

Return type:

Optional[samples.StaticSamplesBase]

abstract property fit_ready : bool

Returns whether the BaseDataset is in a state ready to be fit on.

train_test_split(*, test_size: float | None = None, train_size: float | None = None, random_state: int | RandomState | None = None, shuffle: bool = True, stratify: Any | None = None) → tuple[Self, Self]

Split Dataset into train and test sets.

The arguments test_size, train_size, random_state, shuffle, and stratify are passed to sklearn.model_selection.train_test_split to generate the split.

Parameters:
test_size : Optional[float], optional

Passed to sklearn.model_selection.train_test_split. Defaults to None.

train_size : Optional[float], optional

Passed to sklearn.model_selection.train_test_split. Defaults to None.

random_state : Union[int, np.random.RandomState, None], optional

Passed to sklearn.model_selection.train_test_split. Defaults to None.

shuffle : bool, optional

Passed to sklearn.model_selection.train_test_split. Defaults to True.

stratify : Any, optional

Passed to sklearn.model_selection.train_test_split. Defaults to None.

Returns:

The split tuple (dataset_train, dataset_test).

Return type:

Tuple[Self, Self]
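As a rough illustration of what a sample-level split implies, the sketch below splits sample positions once and applies the same positions to every component, so that covariates and targets stay aligned. It is a standard-library stand-in, not TemporAI's or scikit-learn's implementation; split_indices, take, and the dictionary-based dataset are hypothetical names introduced only for this example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_indices(
    n_samples: int,
    test_size: float = 0.25,
    seed: Optional[int] = None,
    shuffle: bool = True,
) -> Tuple[List[int], List[int]]:
    """Toy index-level train/test split (hypothetical, not sklearn's algorithm)."""
    indices = list(range(n_samples))
    if shuffle:
        # A seeded generator makes the split reproducible.
        random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]


def take(component: Optional[Sequence], indices: List[int]) -> Optional[list]:
    """Select the given sample positions from one component; None passes through."""
    if component is None:
        return None
    return [component[i] for i in indices]


# Hypothetical dataset: every component holds one entry per sample, aligned by position.
dataset = {
    "time_series": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    "static": [10, 20, 30, 40],
    "targets": [0, 1, 0, 1],
    "treatments": None,
}

train_idx, test_idx = split_indices(len(dataset["time_series"]), test_size=0.25, seed=0)

# The same positions are applied to every component, keeping samples aligned.
dataset_train = {name: take(comp, train_idx) for name, comp in dataset.items()}
dataset_test = {name: take(comp, test_idx) for name, comp in dataset.items()}
```

Splitting positions rather than each component separately is the design point here: a component that is absent (None) stays absent in both halves, and no sample can end up with covariates in one half and targets in the other.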

split(splitter: KFold | StratifiedKFold, **kwargs: Any) → Generator[tuple[Self, Self], None, None]

Generate dataset splits according to the scikit-learn splitter (Splitter). The kwargs are passed to the underlying splitter’s split method.

Example

>>> from sklearn.model_selection import KFold
>>> from tempor import plugin_loader
>>> data = plugin_loader.get("prediction.one_off.sine", plugin_type="datasource").load()
>>> kfold = KFold(n_splits=5)
>>> len([(data_train, data_test) for (data_train, data_test) in data.split(splitter=kfold)])
5

Parameters:
splitter : Splitter

A sklearn splitter.

**kwargs : Any

Additional keyword arguments to be passed to the splitter’s split method.

Yields:

Tuple[Self, Self]: (dataset_train, dataset_test) for each split.
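To make the generator contract concrete, here is a minimal standard-library sketch of a k-fold split that yields one (train, test) pair per fold. It is only an illustration: kfold_indices and the samples list are hypothetical, and the fold-size rule (the first n_samples % n_splits folds get one extra sample) is modeled on the documented behavior of an unshuffled scikit-learn KFold rather than taken from TemporAI's code.

```python
from typing import Iterator, List, Tuple


def kfold_indices(n_samples: int, n_splits: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for consecutive, unshuffled folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample each.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# Hypothetical sample identifiers standing in for a dataset's samples.
samples = ["s0", "s1", "s2", "s3", "s4", "s5", "s6"]

folds = []
for train_idx, test_idx in kfold_indices(len(samples), n_splits=3):
    data_train = [samples[i] for i in train_idx]
    data_test = [samples[i] for i in test_idx]
    folds.append((data_train, data_test))
```

Each sample lands in exactly one test fold, which is why the number of yielded pairs equals n_splits.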

class tempor.data.dataset.CovariatesDataset(time_series: DataFrame | ndarray, *, static: DataFrame | ndarray | None = None, targets: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: BaseDataset

A BaseDataset subclass for a dataset that does not contain any predictive data (targets or treatments).

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

predictive : PredictiveTaskData | None

class tempor.data.dataset.PredictiveDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, static: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: BaseDataset

A BaseDataset subclass for a dataset that can contain predictive data (targets or treatments).

This is an abstract class, to be subclassed by the Datasets specific to each predictive task.

predictive : PredictiveTaskData

abstract property predict_ready : bool

Returns whether the PredictiveDataset is in a state ready to be predicted on.

class tempor.data.dataset.OneOffPredictionDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, static: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: PredictiveDataset

A PredictiveDataset subclass for the one-off prediction problem setting, see BaseDataset docs.

In this setting: targets are required for fitting and will be initialized as StaticSamples.

predictive : OneOffPredictionTaskData

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

property predict_ready : bool

Check if the dataset is ready to be predicted on.

Returns:

Whether the dataset is ready to be predicted on.

Return type:

bool
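The readiness properties above follow a common abstract-property pattern, which the toy hierarchy below sketches using only the standard library. None of these classes are TemporAI's: ToyDataset, ToyPredictiveDataset, and ToyOneOffDataset are hypothetical, and the rule that prediction needs only covariates is an assumption made for the example (the text states only that targets are required for fitting).

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class ToyDataset(ABC):
    """Hypothetical stand-in for an abstract dataset base class."""

    def __init__(self, time_series: Any, *, targets: Optional[Any] = None) -> None:
        self.time_series = time_series
        self.targets = targets

    @property
    @abstractmethod
    def fit_ready(self) -> bool:
        """Whether the dataset can be fit on."""


class ToyPredictiveDataset(ToyDataset):
    """Still abstract: adds a second readiness check for prediction."""

    @property
    @abstractmethod
    def predict_ready(self) -> bool:
        """Whether the dataset can be predicted on."""


class ToyOneOffDataset(ToyPredictiveDataset):
    """Concrete subclass: targets are needed for fitting."""

    @property
    def fit_ready(self) -> bool:
        return self.targets is not None

    @property
    def predict_ready(self) -> bool:
        # Assumption for this sketch: covariates alone suffice for prediction.
        return True


unlabeled = ToyOneOffDataset([[0.1, 0.2]])
labeled = ToyOneOffDataset([[0.1, 0.2]], targets=[1])

print(unlabeled.fit_ready, unlabeled.predict_ready)  # False True
print(labeled.fit_ready, labeled.predict_ready)      # True True
```

Because both properties are abstract, Python refuses to instantiate ToyDataset or ToyPredictiveDataset directly, so each task-specific subclass has to state its own readiness rules.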

class tempor.data.dataset.TemporalPredictionDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, static: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: PredictiveDataset

A PredictiveDataset subclass for the temporal prediction problem setting, see BaseDataset docs.

In this setting: targets are required for fitting and will be initialized as TimeSeriesSamples.

predictive : TemporalPredictionTaskData

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

property predict_ready : bool

Check if the dataset is ready to be predicted on.

Returns:

Whether the dataset is ready to be predicted on.

Return type:

bool

class tempor.data.dataset.TimeToEventAnalysisDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, static: DataFrame | ndarray | None = None, treatments: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: PredictiveDataset

A PredictiveDataset subclass for the time-to-event analysis problem setting, see BaseDataset docs.

In this setting: targets are required for fitting and will be initialized as EventSamples.

predictive : TimeToEventAnalysisTaskData

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

property predict_ready : bool

Check if the dataset is ready to be predicted on.

Returns:

Whether the dataset is ready to be predicted on.

Return type:

bool

class tempor.data.dataset.OneOffTreatmentEffectsDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, treatments: DataFrame | ndarray, static: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: PredictiveDataset

A PredictiveDataset subclass for the one-off treatment effects problem setting, see BaseDataset docs.

In this setting: targets are required for fitting and will be initialized as TimeSeriesSamples; treatments are required for both fitting and prediction and will be initialized as EventSamples.

predictive : OneOffTreatmentEffectsTaskData

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

property predict_ready : bool

Check if the dataset is ready to be predicted on.

Returns:

Whether the dataset is ready to be predicted on.

Return type:

bool

class tempor.data.dataset.TemporalTreatmentEffectsDataset(time_series: DataFrame | ndarray, *, targets: DataFrame | ndarray | None, treatments: DataFrame | ndarray, static: DataFrame | ndarray | None = None, **kwargs: Any)

Bases: PredictiveDataset

A PredictiveDataset subclass for the temporal treatment effects problem setting, see BaseDataset docs.

In this setting: targets are required for fitting and will be initialized as TimeSeriesSamples; treatments are required for both fitting and prediction and will be initialized as TimeSeriesSamples.

predictive : TemporalTreatmentEffectsTaskData

property fit_ready : bool

Check if the dataset is ready to be fit on.

Returns:

Whether the dataset is ready to be fit on.

Return type:

bool

property predict_ready : bool

Check if the dataset is ready to be predicted on.

Returns:

Whether the dataset is ready to be predicted on.

Return type:

bool