datawrangler.zoo.dataframe
- datawrangler.zoo.dataframe.is_dataframe(x)
Determine if an object (or file) is a DataFrame (pandas or Polars)
Parameters
- param x:
the object (or a file path)
Returns
- return:
True if the object is a DataFrame (pandas or Polars) or points to a file that can be loaded as a DataFrame,
and False otherwise.
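As a rough illustration of the check described above, the sketch below accepts either an in-memory DataFrame or a path to a file that loads as one. It is not the library's implementation: `looks_like_dataframe` is a hypothetical helper, and it makes the simplifying assumptions that only pandas is supported and that files are CSV.

```python
import os

import pandas as pd


def looks_like_dataframe(x):
    """Hypothetical stand-in for is_dataframe (pandas-only, CSV-only)."""
    # Case 1: the object is already a DataFrame.
    if isinstance(x, pd.DataFrame):
        return True
    # Case 2: the object is a path to an existing file that pandas can parse.
    if isinstance(x, str) and os.path.isfile(x):
        try:
            pd.read_csv(x)
            return True
        except (pd.errors.EmptyDataError, pd.errors.ParserError,
                UnicodeDecodeError, OSError):
            return False
    # Anything else (lists, dicts, missing files, ...) is not a DataFrame.
    return False
```

The documented function also recognizes Polars DataFrames; extending the sketch would mean adding a second `isinstance` check against `polars.DataFrame`.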
- datawrangler.zoo.dataframe.is_multiindex_dataframe(x)
Determine if an object (or file) is a MultiIndex DataFrame, i.e., a DataFrame with a multi-level index
Parameters
- param x:
the object (or file path)
Returns
- return:
True if the object is a MultiIndex DataFrame (or points to a file that can be loaded as a
MultiIndex DataFrame), and False otherwise.
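To make "multi-level index" concrete, the following sketch builds a small pandas DataFrame with a two-level index and tests for it directly with pandas. The helper name `has_multilevel_index` and the sample data are illustrative assumptions; the sketch covers in-memory pandas objects only, not file paths.

```python
import pandas as pd

# A two-level index: each row is identified by a (group, trial) pair.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["group", "trial"]
)
multi = pd.DataFrame({"value": [0.1, 0.2, 0.3]}, index=index)

# Moving the index levels into ordinary columns leaves a single-level index.
flat = multi.reset_index()


def has_multilevel_index(df):
    """Hypothetical check: True only for pandas DataFrames with a MultiIndex."""
    return isinstance(df, pd.DataFrame) and isinstance(df.index, pd.MultiIndex)
```

Here `has_multilevel_index(multi)` is True and `has_multilevel_index(flat)` is False, which mirrors the distinction the documented function draws.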
- datawrangler.zoo.dataframe.wrangle_dataframe(data, return_model=False, backend=None, **kwargs)
Turn a (potentially messy) DataFrame into a (potentially cleaner) DataFrame
Parameters
- param data:
a DataFrame (pandas or Polars), a DataFrame-like object, or a path to a file that can be loaded as a DataFrame
- param return_model:
if True, return a function for turning the (“messy”) DataFrame into a “clean” DataFrame, along with the cleaned DataFrame. Otherwise (if False), just return the cleaned DataFrame. Default: False
- param backend:
str, optional. The DataFrame backend to use (‘pandas’ or ‘polars’). If None, the input type is preserved.
- param kwargs:
passed to the DataFrame “wrangling” model (default: the constructor for pandas.DataFrame or polars.DataFrame)
Returns
- return:
The “wrangled” DataFrame (if return_model is False), or the DataFrame plus a “model” for cleaning DataFrames (if return_model is True).
Examples
>>> import pandas as pd
>>> import datawrangler as dw
>>> # Wrangle pandas DataFrame, preserving type
>>> df_pandas = pd.DataFrame({'A': [1, 2, 3]})
>>> cleaned_pandas = dw.wrangle(df_pandas)
>>> # Convert pandas DataFrame to Polars
>>> df_polars = dw.wrangle(df_pandas, backend='polars')
>>> # Load and wrangle from file
>>> df_from_file = dw.wrangle('data.csv')
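The return_model behaviour described under Parameters can be pictured with a minimal pandas-only sketch. It assumes the default "model" is simply the pandas.DataFrame constructor, as the kwargs description suggests, and that the cleaned DataFrame comes first in the returned pair. The exact form of the model object returned by the library is not specified on this page, so `wrangle_sketch` is a hypothetical stand-in rather than the real function.

```python
import pandas as pd


def wrangle_sketch(data, return_model=False, **kwargs):
    """Hypothetical illustration of the return_model switch (pandas only)."""
    # Assumption: the default "model" is the DataFrame constructor itself.
    model = pd.DataFrame
    wrangled = model(data, **kwargs)
    if return_model:
        # Return the cleaned DataFrame together with the function that made it.
        return wrangled, model
    return wrangled


# Default: only the wrangled DataFrame comes back.
df = wrangle_sketch({"A": [1, 2, 3]})

# With return_model=True, the caller can reuse the model on new data.
df2, model = wrangle_sketch({"A": [1, 2, 3]}, return_model=True)
df3 = model({"A": [4, 5, 6]})
```

Returning the model alongside the result lets the same cleaning step be reapplied to other inputs without repeating the configuration.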