tempor.data.samples_experimental module¶
Module with experimental samples implementations.
- class tempor.data.samples_experimental.StaticSamplesDask(data: DataFrame | ndarray, **kwargs: Any)[source]¶
Bases: StaticSamplesBase
Create a StaticSamplesDask object from the data.
- Parameters:¶
- data : data_typing.DataContainer¶
A container with the data.
- **kwargs : Any
Any additional keyword arguments to pass to the constructor.
- static from_dataframe(dataframe: DataFrame, **kwargs: Any) StaticSamplesDask[source]¶
Create StaticSamplesDask from pandas.DataFrame. The rows represent samples, the columns represent features.
- static from_numpy(array: ndarray, *, sample_index: list[int] | list[str] | None = None, feature_index: list[str] | None = None, **kwargs: Any) StaticSamplesDask[source]¶
Not implemented yet.
- numpy(**kwargs: Any) ndarray[source]¶
Return the data as a numpy.ndarray.
- Parameters:¶
- **kwargs : Any
Any additional keyword arguments. Currently unused.
- Returns:¶
The numpy.ndarray.
- Return type:¶
np.ndarray
- dataframe(**kwargs: Any) DataFrame[source]¶
Return the data as a pandas.DataFrame.
- category : ClassVar[plugin_typing.PluginCategory] = 'static_samples'¶
Plugin category, such as 'prediction.one_off.classification'. Must be set by the plugin class using @register_plugin.
- name : ClassVar[plugin_typing.PluginName] = 'static_samples_dask'¶
Plugin name, such as 'my_nn_classifier'. Must be set by the plugin class using @register_plugin.
- plugin_type : ClassVar[plugin_typing.PluginTypeArg] = 'dataformat'¶
Plugin type, such as 'method'. May optionally be set by the plugin class using @register_plugin; otherwise the default plugin type is used.
- tempor.data.samples_experimental.multiindex_df_to_compatible_ddf(df: DataFrame, **kwargs: Any) DataFrame[source]¶
Convert a multiindex dataframe to a dask dataframe with a single tuple index.
- tempor.data.samples_experimental.compatible_ddf_to_multiindex_df(ddf: DataFrame) DataFrame[source]¶
Convert a dask dataframe with a single tuple index to a multiindex dataframe.
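The two helpers above convert between a multiindex and a single index of tuples. The following is a minimal pandas-only sketch of that idea. It is not the library's implementation, it leaves out the dask conversion entirely, and the names `to_tuple_index` and `to_multiindex` as well as the index level names are hypothetical.

```python
import pandas as pd


def to_tuple_index(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: flatten a MultiIndex into a single index of tuples."""
    out = df.copy()
    out.index = df.index.to_flat_index()
    return out


def to_multiindex(df: pd.DataFrame, names: list) -> pd.DataFrame:
    """Hypothetical helper: rebuild a MultiIndex from an index of tuples."""
    out = df.copy()
    out.index = pd.MultiIndex.from_tuples(list(df.index), names=names)
    return out


# Level names are illustrative assumptions, not taken from the source.
original = pd.DataFrame(
    {"feat_a": [1.0, 2.0, 3.0]},
    index=pd.MultiIndex.from_tuples(
        [("s0", 0), ("s0", 1), ("s1", 0)], names=["sample_idx", "time_idx"]
    ),
)
flat = to_tuple_index(original)
restored = to_multiindex(flat, names=["sample_idx", "time_idx"])
```

A flat tuple index keeps the (sample, timestep) pairing in one index level, and the round trip recovers the original two levels.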
- class tempor.data.samples_experimental.TimeSeriesSamplesDask(data: DataFrame | ndarray, **kwargs: Any)[source]¶
Bases: TimeSeriesSamplesBase
Create a TimeSeriesSamplesDask object from the data.
- Parameters:¶
- data : data_typing.DataContainer¶
A container with the data.
- **kwargs : Any
Any additional keyword arguments to pass to the constructor.
- static from_dataframe(dataframe: DataFrame, **kwargs: Any) TimeSeriesSamplesDask[source]¶
Create TimeSeriesSamplesDask from pandas.DataFrame. The row index of the dataframe should be a 2-level multiindex (sample, timestep). The columns should be the features.
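As an illustration of the layout described above, the sketch below builds a small dataframe with a 2-level (sample, timestep) row index and feature columns. The level names, feature names, and values are made up for the example; the final call is left as a comment because it assumes tempor and dask are installed.

```python
import pandas as pd

# Two samples with different numbers of timesteps. The level names are
# illustrative assumptions, not names required by the source.
index = pd.MultiIndex.from_tuples(
    [("a", 0), ("a", 1), ("a", 2), ("b", 0), ("b", 1)],
    names=["sample_idx", "time_idx"],
)
ts_df = pd.DataFrame(
    {
        "heart_rate": [60.0, 62.0, 61.0, 75.0, 74.0],
        "temperature": [36.6, 36.7, 36.6, 37.1, 37.0],
    },
    index=index,
)

# With tempor installed, this frame would be passed as the `dataframe` argument:
# samples = TimeSeriesSamplesDask.from_dataframe(ts_df)
```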
- static from_numpy(array: ndarray, **kwargs: Any) TimeSeriesSamplesDask[source]¶
Not implemented yet.
- numpy(*, padding_indicator: Any = 999.0, **kwargs: Any) ndarray[source]¶
Return the data as a numpy.ndarray.
- Parameters:¶
- padding_indicator : Any, optional¶
Padding indicator value. Defaults to DATA_SETTINGS.default_padding_indicator.
- **kwargs : Any
Any additional keyword arguments. Currently unused.
- Returns:¶
The numpy.ndarray.
- Return type:¶
np.ndarray
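The padding_indicator parameter suggests that samples with fewer timesteps are filled up to a common length. A rough sketch of that idea follows, assuming an output layout of (samples, max timesteps, features) with padding at the end of each sample; that layout is an assumption and is not stated above. The helper `pad_to_3d` is hypothetical and is not the library's implementation.

```python
import numpy as np
import pandas as pd


def pad_to_3d(df: pd.DataFrame, padding_indicator: float = 999.0) -> np.ndarray:
    """Hypothetical helper: stack per-sample blocks, padding short ones at the end."""
    # Group rows by the first index level (the sample), keeping the original order.
    blocks = [block.to_numpy(dtype=float) for _, block in df.groupby(level=0, sort=False)]
    max_len = max(len(block) for block in blocks)
    # Start from an array filled with the padding value, then copy the real data in.
    out = np.full((len(blocks), max_len, df.shape[1]), padding_indicator, dtype=float)
    for i, block in enumerate(blocks):
        out[i, : len(block), :] = block
    return out


df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.MultiIndex.from_tuples([("a", 0), ("a", 1), ("a", 2), ("b", 0), ("b", 1)]),
)
padded = pad_to_3d(df)
```

Here sample "b" has two timesteps, so its third slot holds the padding value. A padding value should be chosen so that it cannot be confused with real data.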
- dataframe(**kwargs: Any) DataFrame[source]¶
Return the data as a pandas.DataFrame.
- Parameters:¶
- **kwargs : Any
Any additional keyword arguments. Currently unused.
- Returns:¶
The pandas.DataFrame.
- Return type:¶
pd.DataFrame
- time_indexes() list[list[float]] | list[list[int]] | list[list[Timestamp]][source]¶
Get a list containing time indexes for each sample. Each time index is represented as a list of time step elements.
- time_indexes_as_dict() dict[int, list[float] | list[int] | list[Timestamp]] | dict[str, list[float] | list[int] | list[Timestamp]][source]¶
Get a dictionary mapping each sample index to its time index. Time index is represented as a list of time step elements.
- time_indexes_float() list[ndarray][source]¶
Return the time indexes with their elements converted to float values. A date-time time index is converted using datetime_time_index_to_float.
- Returns:¶
List of 1D numpy.ndarrays of float values, corresponding to the time indexes.
- Return type:¶
List[np.ndarray]
- num_timesteps_as_dict() dict[int, int] | dict[str, int][source]¶
Get a dictionary mapping each sample index to its number of timesteps.
- num_timesteps_equal() bool[source]¶
Returns True if all samples share the same number of timesteps, False otherwise.
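The time-index and timestep-count accessors above can be pictured with plain pandas. The sketch below derives the same kinds of values from a (sample, timestep) multiindex; it mirrors the method names as local variables for readability only and is not the library's implementation. The level names and data are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.MultiIndex.from_tuples(
        [("a", 0), ("a", 1), ("a", 2), ("b", 0), ("b", 1)],
        names=["sample_idx", "time_idx"],  # illustrative level names
    ),
)

# Map each sample index to the list of its time steps.
time_indexes_as_dict = {}
for sample, timestep in df.index:
    time_indexes_as_dict.setdefault(sample, []).append(timestep)

# One list of time steps per sample, in order of first appearance.
time_indexes = list(time_indexes_as_dict.values())

# Number of timesteps per sample, and whether all samples agree.
num_timesteps_as_dict = {s: len(t) for s, t in time_indexes_as_dict.items()}
num_timesteps_equal = len(set(num_timesteps_as_dict.values())) <= 1
```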
- list_of_dataframes() list[DataFrame][source]¶
Returns a list of dataframes, one per sample, each containing the data for that sample.
- category : ClassVar[plugin_typing.PluginCategory] = 'time_series_samples'¶
Plugin category, such as 'prediction.one_off.classification'. Must be set by the plugin class using @register_plugin.
- name : ClassVar[plugin_typing.PluginName] = 'time_series_samples_dask'¶
Plugin name, such as 'my_nn_classifier'. Must be set by the plugin class using @register_plugin.
- plugin_type : ClassVar[plugin_typing.PluginTypeArg] = 'dataformat'¶
Plugin type, such as 'method'. May optionally be set by the plugin class using @register_plugin; otherwise the default plugin type is used.
- class tempor.data.samples_experimental.EventSamplesDask(data: DataFrame | ndarray, **kwargs: Any)[source]¶
Bases: EventSamplesBase
Create an EventSamplesDask object from the data.
- Parameters:¶
- data : data_typing.DataContainer¶
A container with the data.
- **kwargs : Any
Any additional keyword arguments to pass to the constructor.
- static from_dataframe(dataframe: DataFrame, **kwargs: Any) EventSamplesDask[source]¶
Create EventSamplesDask from pandas.DataFrame. The row index of the dataframe should be the sample indexes. The columns should be the features. Each feature should contain a tuple of (time, value) representing the event.
- Parameters:¶
- dataframe : pd.DataFrame¶
The dataframe that contains the data.
- **kwargs : Any
Any additional keyword arguments to pass to the constructor.
- Returns:¶
The EventSamplesDask object created from the dataframe.
- Return type:¶
EventSamplesDask
- static from_numpy(array: ndarray, **kwargs: Any) EventSamplesDask[source]¶
Not implemented yet.
- numpy(**kwargs: Any) ndarray[source]¶
Return the data as a numpy.ndarray.
- Parameters:¶
- **kwargs : Any
Any additional keyword arguments. Currently unused.
- Returns:¶
The numpy.ndarray.
- Return type:¶
np.ndarray
- dataframe(**kwargs: Any) DataFrame[source]¶
Return the data as a pandas.DataFrame.
- Parameters:¶
- **kwargs : Any
Any additional keyword arguments. Currently unused.
- Returns:¶
The pandas.DataFrame.
- Return type:¶
pd.DataFrame
- category : ClassVar[plugin_typing.PluginCategory] = 'event_samples'¶
Plugin category, such as 'prediction.one_off.classification'. Must be set by the plugin class using @register_plugin.
- name : ClassVar[plugin_typing.PluginName] = 'event_samples_dask'¶
Plugin name, such as 'my_nn_classifier'. Must be set by the plugin class using @register_plugin.
- plugin_type : ClassVar[plugin_typing.PluginTypeArg] = 'dataformat'¶
Plugin type, such as 'method'. May optionally be set by the plugin class using @register_plugin; otherwise the default plugin type is used.
- split(time_feature_suffix: str = '_time') DataFrame[source]¶
Return a pandas.DataFrame where the time component of each event feature has been split off to its own column. The new columns that contain the times will be named "<original column name><time_feature_suffix>" and will be inserted before each corresponding <original column name> column. The <original column name> columns will contain only the event value.
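The column layout described for split() can be sketched with plain pandas. The helper `split_events` below is hypothetical and only mimics the documented behaviour on an ordinary dataframe of (time, value) tuples; it is not the library's implementation, and the feature names and values are made up.

```python
import pandas as pd


def split_events(df: pd.DataFrame, time_feature_suffix: str = "_time") -> pd.DataFrame:
    """Hypothetical helper: split (time, value) cells into a time and a value column."""
    columns = {}
    for name in df.columns:
        # The time column comes first, then the value column under the original name.
        columns[f"{name}{time_feature_suffix}"] = [time for time, _ in df[name]]
        columns[name] = [value for _, value in df[name]]
    return pd.DataFrame(columns, index=df.index)


# Each cell holds a (time, value) tuple, one row per sample.
events = pd.DataFrame(
    {"death": [(5.0, True), (12.5, False)], "relapse": [(3.0, False), (7.0, True)]},
    index=["a", "b"],
)
split_df = split_events(events)
```

In this sketch each time column sits immediately before its value column, matching the ordering described above.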
- split_as_two_dataframes(time_feature_suffix: str = '_time') tuple[DataFrame, DataFrame][source]¶
Analogous to split() but returns two pandas.DataFrames: