datawrangler.decorate
- datawrangler.decorate.list_generalizer(f)[source]
A decorator that makes a function work for either a single object or a list of objects by calling the function on each element
Parameters
Returns
- return:
A decorated function that supports lists of data objects (rather than only non-list data objects)
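The behavior described above can be illustrated with a minimal sketch. This is not the library's implementation; `list_generalizer_sketch` and `double` are hypothetical names used only for illustration, and the sketch assumes the data object is the first positional argument.

```python
import functools


def list_generalizer_sketch(f):
    """Hypothetical stand-in: call f on each element when given a list."""
    @functools.wraps(f)
    def wrapped(data, **kwargs):
        if isinstance(data, list):
            # Apply f to every element and collect the results in a list
            return [f(d, **kwargs) for d in data]
        # Non-list input: call f directly
        return f(data, **kwargs)
    return wrapped


@list_generalizer_sketch
def double(x, offset=0):
    return 2 * x + offset


single = double(3)                  # one object in, one result out
many = double([1, 2, 3], offset=1)  # list in, list of results out
```

Keyword arguments are forwarded unchanged to every per-element call, so the decorated function behaves the same whether it receives one object or many.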
- datawrangler.decorate.funnel(f)[source]
A decorator that coerces any data passed into the function into a DataFrame (pandas or Polars) or a list of DataFrames
Parameters
Returns
- return:
A decorated function that supports any wrangle-able data format. The decorated function accepts an optional ‘backend’ keyword argument (‘pandas’ or ‘polars’) to specify the DataFrame backend.
Notes
The decorated function can be called with:
- backend='pandas': Convert inputs to pandas DataFrames (default)
- backend='polars': Convert inputs to Polars DataFrames for better performance
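As a rough illustration of the coercion idea, the sketch below converts inputs to pandas DataFrames before calling the wrapped function. It is an assumption-laden stand-in, not the library's code: `funnel_sketch`, `to_frame`, and `shape_of` are hypothetical names, only the pandas backend is covered, and "wrangle-able" is narrowed here to anything `pd.DataFrame(...)` accepts.

```python
import functools

import numpy as np
import pandas as pd


def to_frame(x):
    """Coerce a single dataset to a pandas DataFrame (hypothetical helper)."""
    return x if isinstance(x, pd.DataFrame) else pd.DataFrame(x)


def funnel_sketch(f):
    """Hypothetical stand-in: coerce the first argument to DataFrame(s)."""
    @functools.wraps(f)
    def wrapped(data, *args, backend="pandas", **kwargs):
        if backend != "pandas":
            # The Polars path is deliberately left out of this sketch
            raise NotImplementedError("this sketch only covers the pandas backend")
        is_dataset_list = (
            isinstance(data, list)
            and len(data) > 0
            and all(isinstance(d, (pd.DataFrame, np.ndarray)) for d in data)
        )
        if is_dataset_list:
            frames = [to_frame(d) for d in data]  # list of datasets
        else:
            frames = to_frame(data)               # single dataset
        return f(frames, *args, **kwargs)
    return wrapped


@funnel_sketch
def shape_of(data):
    if isinstance(data, list):
        return [d.shape for d in data]
    return data.shape


single = shape_of(np.zeros((3, 2)))
from_dict = shape_of({"a": [1, 2], "b": [3, 4]})
many = shape_of([np.zeros((3, 2)), np.ones((4, 2))])
```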
- datawrangler.decorate.interpolate(f)[source]
A decorator that fills in missing data by imputing and/or interpolating missing values
Parameters
Returns
- return:
A decorated function that supports any wrangle-able datatype. Pass in the following keyword arguments to fill in missing data:
- backend: str, optional (‘pandas’ or ‘polars’)
Specify the DataFrame backend. If not provided, preserves input backend.
- interp_kwargs: a dictionary containing interpolation/imputation parameters:
- impute_kwargs: a dictionary specifying one or more scikit-learn imputation models (e.g., {'model': 'IterativeImputer'}). The 'model' can be specified as defined in the apply_sklearn_model function.
- Any other keywords are passed to the DataFrame's interpolate method; e.g., method='linear' will apply linear interpolation to fill in missing values. For pandas DataFrames, supported arguments are documented at: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html
Notes
Backend-specific behavior:
- pandas: Full interpolation support with all pandas.DataFrame.interpolate() methods
- Polars: Limited interpolation support; automatically converts to pandas for interpolation, then back to Polars
If no interpolation arguments are specified, no interpolation is performed.
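The interpolation step can be sketched with plain pandas. The decorator below is a hypothetical stand-in (`interpolate_sketch` and `count_missing` are illustrative names, not part of the library); it forwards the contents of `interp_kwargs` to `pandas.DataFrame.interpolate` and skips the scikit-learn imputation step entirely.

```python
import functools

import numpy as np
import pandas as pd


def interpolate_sketch(f):
    """Hypothetical stand-in: optionally interpolate before calling f."""
    @functools.wraps(f)
    def wrapped(data, *args, interp_kwargs=None, **kwargs):
        if interp_kwargs:
            params = dict(interp_kwargs)
            # Imputation via scikit-learn is not covered by this sketch
            params.pop("impute_kwargs", None)
            if params:
                data = data.interpolate(**params)
        # With no interpolation arguments, the data passes through untouched
        return f(data, *args, **kwargs)
    return wrapped


@interpolate_sketch
def count_missing(df):
    return int(df.isna().sum().sum())


df = pd.DataFrame({"x": [0.0, np.nan, 2.0, np.nan, 4.0]})

before = count_missing(df)                                     # no interpolation
after = count_missing(df, interp_kwargs={"method": "linear"})  # linear fill
```

Because the interior gaps sit between known values, linear interpolation fills both of them, while the call without `interp_kwargs` leaves the missing values in place.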
- datawrangler.decorate.apply_stacked(f)[source]
Decorate a function to adjust how it handles data as follows:
- Wrangle the data into DataFrames (the resulting DataFrames must all have the same number of columns). MultiIndex DataFrames are also supported (and can represent already-stacked datasets)
- Vertically concatenate the wrangled data
- Apply the function to the “stacked” dataset, treating the combined data as a “single” DataFrame
- If the original dataset was provided in “unstacked” format, unstack the result into a list of DataFrames
- Return the resulting (stacked or unstacked) DataFrame(s)
Parameters
Returns
- return:
A decorated function that supports any wrangle-able data type, applies the original function to all of the datasets at once (as a single stacked DataFrame), and then returns the result(s) as a new DataFrame or list of DataFrames.
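The stack, apply, unstack sequence described for this decorator can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `apply_stacked_sketch` and `center` are hypothetical names, inputs are assumed to already be pandas DataFrames, and the wrapped function is assumed to preserve the row index.

```python
import functools

import pandas as pd


def apply_stacked_sketch(f):
    """Hypothetical stand-in: apply f once to the vertically stacked data."""
    @functools.wraps(f)
    def wrapped(data, *args, **kwargs):
        if isinstance(data, pd.DataFrame):
            # Already a single (possibly stacked) DataFrame
            return f(data, *args, **kwargs)
        keys = list(range(len(data)))
        # The outer index level records which dataset each row came from
        stacked = pd.concat(data, keys=keys)
        result = f(stacked, *args, **kwargs)
        # Unstack: recover one DataFrame per original dataset
        return [result.xs(k, level=0) for k in keys]
    return wrapped


@apply_stacked_sketch
def center(df):
    # Column means are computed over ALL rows of the stacked data
    return df - df.mean()


frames = [
    pd.DataFrame({"a": [1.0, 3.0], "b": [2.0, 4.0]}),
    pd.DataFrame({"a": [5.0, 7.0], "b": [6.0, 8.0]}),
]
centered = center(frames)
```

Here the mean of column `a` is taken across both datasets (4.0), so the first dataset ends up entirely below zero and the second entirely above it.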
- datawrangler.decorate.apply_unstacked(f)[source]
Decorate a function to adjust how it handles data as follows:
- Wrangle the data into a list of DataFrames. MultiIndex DataFrames are also supported (and can represent stacked datasets)
- Apply the function (individually) to each DataFrame in the resulting list
- If the original dataset was provided in “stacked” format, stack the result into a MultiIndex DataFrame
- Return the resulting (stacked or unstacked) DataFrame(s)
Parameters
Returns
- return:
A decorated function that supports any wrangle-able data type, applies the original function to each dataset separately, and then returns the result(s) as a new DataFrame or list of DataFrames.
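The per-dataset behavior can be sketched in the same spirit. Again, this is a hedged illustration rather than the library's code: `apply_unstacked_sketch` and `center` are hypothetical names, inputs are assumed to be pandas DataFrames, and a "stacked" dataset is assumed to be a MultiIndex DataFrame whose outer level identifies the dataset.

```python
import functools

import pandas as pd


def apply_unstacked_sketch(f):
    """Hypothetical stand-in: apply f separately to each dataset."""
    @functools.wraps(f)
    def wrapped(data, *args, **kwargs):
        is_frame = isinstance(data, pd.DataFrame)
        stacked_input = is_frame and isinstance(data.index, pd.MultiIndex)
        if is_frame and not stacked_input:
            # A single ordinary DataFrame needs no splitting
            return f(data, *args, **kwargs)
        if stacked_input:
            # Split the stacked DataFrame on its outer index level
            keys = list(data.index.get_level_values(0).unique())
            frames = [data.xs(k, level=0) for k in keys]
        else:
            keys = list(range(len(data)))
            frames = list(data)
        results = [f(d, *args, **kwargs) for d in frames]
        # Restack only if the caller supplied stacked data
        return pd.concat(results, keys=keys) if stacked_input else results
    return wrapped


@apply_unstacked_sketch
def center(df):
    # Column means are computed within each dataset
    return df - df.mean()


frames = [
    pd.DataFrame({"a": [1.0, 3.0], "b": [2.0, 4.0]}),
    pd.DataFrame({"a": [5.0, 7.0], "b": [6.0, 8.0]}),
]
from_list = center(frames)                            # list in, list out
from_stacked = center(pd.concat(frames, keys=[0, 1])) # stacked in, stacked out
```

Because each dataset is centered on its own mean, both datasets produce the same centered values here, in contrast to applying the same function to the stacked data as a whole.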