datawrangler.zoo.dataframe

datawrangler.zoo.dataframe.is_dataframe(x)

Determine if an object (or file) is a DataFrame (pandas or Polars)

Parameters

param x:

the object (or a file path)

Returns

return:

True if the object is a DataFrame (pandas or Polars) or points to a file that can be loaded as a DataFrame, and False otherwise.
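The two cases described above (an in-memory DataFrame, or a path to a loadable file) can be sketched with pandas alone. This is a minimal illustration and not datawrangler's implementation: `looks_like_dataframe` is a hypothetical helper, it ignores Polars, and it assumes the file is a CSV that `pandas.read_csv` can parse.

```python
import os
import tempfile

import pandas as pd


def looks_like_dataframe(x):
    """Hypothetical stand-in for the documented behavior (pandas only)."""
    # Case 1: the object itself is a pandas DataFrame.
    if isinstance(x, pd.DataFrame):
        return True
    # Case 2: the object is a path to an existing file that pandas can parse.
    # Assumption: only CSV files are considered in this sketch.
    if isinstance(x, str) and os.path.isfile(x):
        try:
            pd.read_csv(x)
            return True
        except Exception:
            return False
    return False


# Write a small CSV to a temporary directory to exercise the file-path case.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "example.csv")
    pd.DataFrame({"A": [1, 2, 3]}).to_csv(csv_path, index=False)
    from_object = looks_like_dataframe(pd.DataFrame({"A": [1, 2, 3]}))
    from_file = looks_like_dataframe(csv_path)

# Neither a plain list nor a path to a missing file qualifies.
from_list = looks_like_dataframe([1, 2, 3])
from_missing = looks_like_dataframe(os.path.join("no", "such", "file.csv"))
```

The real function may accept more file formats and object types than this sketch does; treat it only as a reading aid for the description above.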

datawrangler.zoo.dataframe.is_multiindex_dataframe(x)

Determine if an object (or file) is a MultiIndex DataFrame, i.e., a DataFrame with a multi-level index

Parameters

param x:

the object (or file path)

Returns

return:

True if the object is a MultiIndex DataFrame (or points to a file that can be loaded as a MultiIndex DataFrame), and False otherwise.
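To make "multi-level index" concrete, the following pandas-only sketch builds one DataFrame with a two-level index and one with a default flat index, then tests each. `has_multilevel_index` is a hypothetical helper written for this illustration; it is assumed to approximate the in-memory part of the check and does not cover the file-path case or Polars.

```python
import pandas as pd

# A two-level index: each row is identified by a (group, trial) pair.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["group", "trial"]
)
multi_df = pd.DataFrame({"value": [0.1, 0.2, 0.3]}, index=index)

# The same values with pandas' default single-level RangeIndex.
flat_df = pd.DataFrame({"value": [0.1, 0.2, 0.3]})


def has_multilevel_index(df):
    """Hypothetical helper: True for a pandas DataFrame with a MultiIndex."""
    return isinstance(df, pd.DataFrame) and isinstance(df.index, pd.MultiIndex)


multi_result = has_multilevel_index(multi_df)  # two-level index
flat_result = has_multilevel_index(flat_df)    # single-level index
levels = multi_df.index.nlevels                # number of index levels
```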

datawrangler.zoo.dataframe.wrangle_dataframe(data, return_model=False, backend=None, **kwargs)

Turn a (potentially messy) DataFrame into a (potentially cleaner) DataFrame

Parameters

param data:

a DataFrame (pandas or Polars), a DataFrame-like object, or a path to a file that can be loaded as a DataFrame

param return_model:

if True, return a function for turning the (“messy”) DataFrame into a “clean” DataFrame, along with the cleaned DataFrame. Otherwise (if False), just return the cleaned DataFrame. Default: False

param backend:

str, optional. The DataFrame backend to use (‘pandas’ or ‘polars’). If None, the input type is preserved.

param kwargs:

keyword arguments passed to the DataFrame “wrangling” model (default: the constructor for pandas.DataFrame or polars.DataFrame)

Returns

return:

The “wrangled” DataFrame (if return_model is False), or the DataFrame plus a “model” for cleaning DataFrames (if return_model is True).

Examples

>>> import pandas as pd
>>> import datawrangler as dw
>>> # Wrangle pandas DataFrame, preserving type
>>> df_pandas = pd.DataFrame({'A': [1, 2, 3]})
>>> cleaned_pandas = dw.wrangle(df_pandas)
>>> # Convert pandas DataFrame to Polars
>>> df_polars = dw.wrangle(df_pandas, backend='polars')
>>> # Load and wrangle from file
>>> df_from_file = dw.wrangle('data.csv')