datawrangler.io
- datawrangler.io.load(x, dtype=None, **kwargs)
Load local or remote files in a wide range of formats.
- Parameters
x – a string containing a URL or file path
dtype – optional argument specifying how the data should be loaded; can be one of:
- ‘pickle’: use the dill library to load pickled objects and functions
- ‘numpy’: treat the dataset as a .npy or .npz file
- None (default): attempt to determine the filetype automatically based on the URL or file extension. The following filetypes are supported:
  - txt files: treated as plain text
  - any filetype supported by the Pandas library: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html
  - any image filetype supported by PIL; for a full list see: https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html
kwargs – any additional keyword arguments are passed to whichever function is selected to load the dataset. For example, when loading a csv file (a Pandas-compatible format), passing index_col=0 tells Pandas to use the first column (column 0) as the index of the resulting DataFrame.
- Returns
The retrieved data. Remote files are cached (saved) locally to disk so that they load faster when the same address is requested again.
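The dtype rules described for load can be illustrated with a small standard-library sketch. It is not datawrangler's implementation: the function name resolve_loader and the labels it returns are hypothetical, chosen only to show how an explicit dtype takes priority and how the extension of a path or URL drives the default behavior.

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_loader(x, dtype=None):
    """Return a label for the loader that would handle `x` (illustrative only)."""
    if dtype is not None:
        # An explicit dtype ('pickle' or 'numpy') skips auto-detection.
        return dtype
    # Parse as a URL first so a query string does not hide the extension;
    # plain relative or POSIX file paths pass through urlparse unchanged.
    suffix = Path(urlparse(x).path).suffix.lower()
    if suffix == ".txt":
        return "text"
    # Per the documentation, other extensions are left to Pandas or PIL.
    # The label below is a hypothetical placeholder for that delegation.
    return "pandas-or-pil"


print(resolve_loader("notes.txt"))
print(resolve_loader("https://example.com/data/table.csv?download=1"))
print(resolve_loader("model.bin", dtype="pickle"))
```

In the documented API, any extra keyword arguments would then be forwarded to whichever loader is selected, as in the index_col=0 case described under kwargs.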
- datawrangler.io.save(x, obj, dtype=None, **kwargs)
Save data to disk.
- Parameters
x – the file’s original path or URL (used to create a hash to define a new filename)
obj – the data to store to disk
dtype – optional argument specifying how to store the data; can be one of:
- ‘pickle’: use the dill library to pickle the object
- ‘numpy’: save the objects as a compressed (.npz-formatted) numpy file
- None (default): determine the filetype automatically; if obj is passed in as bytes, write obj directly to disk. If obj is a string, treat obj as text.
kwargs – any additional keyword arguments are passed to dill.dump (if dtype == ‘pickle’) or numpy.savez (if dtype == ‘numpy’). For any other datatype, additional keyword arguments are ignored.
- Returns
None
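The default (dtype=None) behavior of save can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's code: the helper names hashed_path and save_sketch, the use of SHA-256, and the explicit cache_dir argument are all hypothetical, and the sketch returns the target path for inspection, whereas the documented save returns None.

```python
import hashlib
import tempfile
from pathlib import Path


def hashed_path(x, cache_dir):
    """Map a path or URL to a stable filename inside `cache_dir`."""
    # The hash function is an assumption; the docs only say a hash is used.
    digest = hashlib.sha256(x.encode("utf-8")).hexdigest()
    return Path(cache_dir) / digest


def save_sketch(x, obj, cache_dir):
    """Write `obj` under a name derived from `x`: bytes as-is, str as text."""
    target = hashed_path(x, cache_dir)
    if isinstance(obj, bytes):
        target.write_bytes(obj)
    elif isinstance(obj, str):
        target.write_text(obj, encoding="utf-8")
    else:
        raise TypeError("this sketch only handles bytes and str")
    return target


with tempfile.TemporaryDirectory() as tmp:
    url = "https://example.com/data/notes.txt"  # placeholder address
    first = save_sketch(url, "hello", tmp)
    second = save_sketch(url, b"\x00\x01", tmp)
    same_file = first == second   # the same address maps to the same file
    stored = second.read_bytes()  # the second write replaced the first

print(same_file, stored)
```

Because the filename depends only on x, saving twice under the same address overwrites the earlier file, and a later lookup of that address can find the stored copy without fetching it again.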