tempor.data.pandera_utils module
Utilities for pandera validation.
- tempor.data.pandera_utils.update_schema(schema: DataFrameSchema, **kwargs: Any) -> DataFrameSchema
Update a pandera dataframe schema with kwargs.
- tempor.data.pandera_utils.update_index(index: Index, **kwargs: Any) -> Index
Update a pandera index with kwargs.
- tempor.data.pandera_utils.update_multiindex(multi_index: MultiIndex, **kwargs: Any) -> MultiIndex
Update a pandera multiindex with kwargs.
- tempor.data.pandera_utils.PA_DTYPE_MAP: dict[type | Literal[category] | Literal[datetime], DataType] = {<class 'bool'>: DataType(bool), <class 'int'>: DataType(int), <class 'float'>: DataType(float), <class 'str'>: DataType(string), 'category': DataType(category), 'datetime': DataType(timestamp)}
A mapping from a dtype specified as Dtype to a pandera.DataType.
- tempor.data.pandera_utils.get_pa_dtypes(dtypes: Iterable[type | Literal[category] | Literal[datetime] | DataType]) -> list[DataType]
Return a list of pandera.DataType corresponding to dtypes. Raises KeyError if a dtype is not found.
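The lookup-and-raise behaviour described above can be sketched without pandera. This is a minimal stand-in, not tempor's implementation: `DTYPE_MAP_SKETCH` and `get_dtypes_sketch` are hypothetical names, plain strings replace the pandera DataType values, and the case where an item is already a DataType is not modelled.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical stand-in for PA_DTYPE_MAP. The real map holds pandera
# DataType objects; strings are used here so the sketch has no dependencies.
DTYPE_MAP_SKETCH: Dict[Any, str] = {
    bool: "bool",
    int: "int",
    float: "float",
    str: "string",
    "category": "category",
    "datetime": "timestamp",
}


def get_dtypes_sketch(dtypes: Iterable[Any]) -> List[str]:
    """Look up each requested dtype; an unknown key raises KeyError."""
    return [DTYPE_MAP_SKETCH[d] for d in dtypes]


# Python types and the two string literals are both valid keys.
resolved = get_dtypes_sketch([int, "datetime", str])

# A dtype that is not in the map surfaces as a KeyError.
try:
    get_dtypes_sketch([bytes])
    unknown_raised = False
except KeyError:
    unknown_raised = True
```

Here `resolved` holds one entry per requested dtype, in the same order as the input.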
- class tempor.data.pandera_utils.UnionDtype(dtype: Any)
Bases: DataType
Extend pandera DataTypes with a custom UnionDtype, which will function similarly to Union.
See pandera DataType [guide](https://pandera.readthedocs.io/en/stable/dtypes.html) for details.
In this case, rather than wrapping the extension DataType with the register_dtype and immutable decorators, we apply these directly to the class returned by __class_getitem__, which dynamically creates the union specified with its dtypes. In this way, pandera's pandas engine correctly registers each new kind of union as a different dtype.
-
check(pandera_dtype: DataType, data_container: Any | None = None) -> bool | Iterable[bool]
Checks whether the pandera_dtype and optionally data_container satisfy at least one of the union's union_dtypes.
- coerce(data_container: Any) -> NoReturn
The coerce method is not supported and will throw a NotImplementedError.
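The `__class_getitem__` pattern described for UnionDtype can be illustrated in plain Python. This is only a sketch of the general technique, under the following assumptions: `UnionSketch` and its cache are hypothetical, the pandera register_dtype and immutable decorators are left out, and `check` tests Python values with isinstance, whereas the real method compares pandera dtypes.

```python
from typing import Any, Dict, Tuple


class UnionSketch:
    """Hypothetical stand-in for a union dtype built via __class_getitem__."""

    union_dtypes: Tuple[type, ...] = ()
    _cache: Dict[Tuple[type, ...], type] = {}

    def __class_getitem__(cls, item: Any) -> type:
        # UnionSketch[int, str] passes a tuple; UnionSketch[float] a single type.
        members = item if isinstance(item, tuple) else (item,)
        if members not in cls._cache:
            name = "Union[" + ", ".join(m.__name__ for m in members) + "]"
            # Each distinct combination of members becomes its own subclass,
            # so every kind of union is a separate class.
            cls._cache[members] = type(name, (cls,), {"union_dtypes": members})
        return cls._cache[members]

    def check(self, value: Any) -> bool:
        # Satisfied when the value matches at least one member of the union.
        return any(isinstance(value, m) for m in self.union_dtypes)

    def coerce(self, value: Any) -> Any:
        # Mirrors the documented behaviour: coercion is not supported.
        raise NotImplementedError("coerce is not supported for unions")


IntOrStr = UnionSketch[int, str]
FloatOnly = UnionSketch[float]

matches_member = IntOrStr().check("abc")
matches_none = IntOrStr().check(1.5)
```

Creating the class inside `__class_getitem__` is what lets each union be treated as a distinct type, which is the property the docstring relies on for registration with pandera's pandas engine.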
- tempor.data.pandera_utils.init_schema(data: DataFrame, **kwargs: Any) -> DataFrameSchema
Initialize a pandera.DataFrameSchema from data using pandera.infer_schema.
- tempor.data.pandera_utils.add_df_checks(schema: DataFrameSchema, *, checks_list: list[Check]) -> DataFrameSchema
Update schema with pandera checks specified in checks_list.
- tempor.data.pandera_utils.add_regex_column_checks(schema: DataFrameSchema, *, regex: str = '.*', dtype: Any, nullable: bool, checks_list: list[Check] | None = None) -> DataFrameSchema
Update schema with checks specified in checks_list, applied to all columns specified by regex. dtype and nullable can also be specified and will apply to all of those columns.
- tempor.data.pandera_utils.set_up_index(schema: DataFrameSchema, data: DataFrame, *, dtype: Any, name: str, nullable: bool, unique: bool, coerce: bool, checks_list: list[Check] | None = None) -> tuple[DataFrameSchema, DataFrame]
Update schema.index (pandera.Index) with dtype, name, nullable, … schema settings.
In addition, set the index name of data (pandas.DataFrame) to name.
Returns the schema and the dataframe.
- tempor.data.pandera_utils.set_up_2level_multiindex(schema: DataFrameSchema, data: DataFrame, *, dtypes: tuple[Any, Any], names: tuple[str, str], nullable: tuple[bool, bool], coerce: bool, unique: tuple[str, ...], checks_list: tuple[list[Check], list[Check]] | None = None) -> tuple[DataFrameSchema, DataFrame]
Update schema.index (pandera.MultiIndex), which is expected to have 2 levels, with dtypes, names, nullable, … schema settings.
In addition, set the index names of data (pandas.DataFrame) to names.
Returns the schema and the dataframe.
- class tempor.data.pandera_utils.checks
Bases: object
Namespace containing reusable pandera.Checks.
-
forbid_multiindex_index = <Check <lambda>: MultiIndex Index not allowed>
-
forbid_multiindex_columns = <Check <lambda>: MultiIndex Columns not allowed>
-
require_2level_multiindex_index = <Check <lambda>: Index must be a MultiIndex with 2 levels>
-
require_element_len_2 = <Check <lambda>: Each item must contain a sequence of length 2>
- class configurable
Bases: object
Namespace containing functions to get configurable pandera.Checks.