{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "" ] }, { "cell_type": "markdown", "source": "# Data Wrangler Utilities\n\nThis tutorial covers the utility functions in `datawrangler.util` that help with data type detection, validation, and manipulation. These utilities are the building blocks that power data-wrangler's automatic data type detection.\n\n## Overview\n\nThe `datawrangler.util` module provides essential helper functions:\n\n- **`dataframe_like()`**: Check if an object behaves like a DataFrame\n- **`array_like()`**: Detect array-like objects\n- **`depth()`**: Determine nesting depth of data structures\n- **`btwn()`**: Check if values fall within a range\n\nThese utilities are particularly useful when building custom data processing pipelines or extending data-wrangler's functionality.", "metadata": {} }, { "cell_type": "code", "source": "import datawrangler as dw\nfrom datawrangler.util import dataframe_like, array_like, depth, btwn\nimport pandas as pd\nimport numpy as np", "metadata": {}, "outputs": [], "execution_count": null }, { "cell_type": "markdown", "source": "## Data Type Detection\n\nUnderstanding how data-wrangler detects different data types is crucial for building robust data processing pipelines. 
Let's explore the detection utilities:", "metadata": {} }, { "cell_type": "code", "source": "# Run each detection utility against a range of object types\ntest_objects = [\n    pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}),  # True DataFrame\n    {'A': [1, 2, 3], 'B': [4, 5, 6]},  # Dict (not DataFrame-like)\n    np.array([[1, 2, 3], [4, 5, 6]]),  # NumPy array\n    [[1, 2, 3], [4, 5, 6]],  # Nested list\n    [1, 2, 3, 4, 5],  # Simple list\n    \"Hello World\",  # String\n    42  # Number\n]\n\n# Display labels, in the same order as test_objects\nobject_names = [\n    \"pandas DataFrame\",\n    \"Dictionary\",\n    \"NumPy Array\",\n    \"Nested List\",\n    \"Simple List\",\n    \"String\",\n    \"Number\"\n]\n\nprint(\"=== Data Type Detection Results ===\")\nprint(f\"{'Object Type':<20} {'DataFrame-like':<15} {'Array-like':<12} {'Depth':<8}\")\nprint(\"-\" * 60)\n\n# Print one table row per object\nfor obj, name in zip(test_objects, object_names):\n    is_df_like = dataframe_like(obj)\n    is_array_like = array_like(obj)\n    obj_depth = depth(obj)\n\n    print(f\"{name:<20} {str(is_df_like):<15} {str(is_array_like):<12} {obj_depth:<8}\")", "metadata": {}, "outputs": [], "execution_count": null } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }