{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    ""
   ]
  },
  {
   "cell_type": "markdown",
   "source": "# Data Wrangler Decorators Part 2: Advanced Decorators\n\nThis tutorial covers advanced decorator functionality in data-wrangler, including interpolation, stacking/unstacking operations, and building complex data processing pipelines.\n\n## Advanced Decorators Overview\n\nBeyond the basic `@funnel` decorator, data-wrangler provides specialized decorators for:\n\n- **`@interpolate`**: Automatic handling of missing data\n- **`@apply_stacked`**: Operations on stacked (melted) data\n- **`@apply_unstacked`**: Operations on unstacked (pivoted) data  \n- **Custom decorator combinations**: Chaining decorators for complex workflows\n\nThese decorators enable sophisticated data preprocessing pipelines with minimal code.",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "source": "import datawrangler as dw\nimport pandas as pd\nimport numpy as np\nfrom datawrangler.decorate import funnel, interpolate, apply_stacked, apply_unstacked\nimport matplotlib.pyplot as plt",
   "metadata": {},
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "markdown",
   "source": "## The @interpolate Decorator\n\nThe `@interpolate` decorator automatically handles missing data by applying interpolation methods before passing data to your function. This is particularly useful for time series analysis and data cleaning pipelines.",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "source": "# Create sample data with missing values\nnp.random.seed(42)\ndates = pd.date_range('2024-01-01', periods=20, freq='D')\nvalues = np.random.randn(20).cumsum()\n# Introduce some missing values\nvalues[5:8] = np.nan\nvalues[15] = np.nan\n\n# Create DataFrame with missing data\ntimeseries_data = pd.DataFrame({\n    'date': dates,\n    'value': values,\n    'category': ['A'] * 10 + ['B'] * 10\n})\n\nprint(\"Original data with missing values:\")\nprint(timeseries_data)\nprint(f\"\\nMissing values: {timeseries_data['value'].isna().sum()}\")",
   "metadata": {},
   "outputs": [],
   "execution_count": null
  },
  {
   "cell_type": "code",
   "source": "# Define a function that computes rolling statistics\n@funnel\n@interpolate(method='linear')\ndef compute_rolling_stats(data, window=5):\n    \\\"\\\"\\\"Compute rolling statistics on clean data\\\"\\\"\\\"\n    if 'value' not in data.columns:\n        return pd.DataFrame()\n    \n    result = pd.DataFrame({\n        'rolling_mean': data['value'].rolling(window=window).mean(),\n        'rolling_std': data['value'].rolling(window=window).std(),\n        'rolling_min': data['value'].rolling(window=window).min(),\n        'rolling_max': data['value'].rolling(window=window).max()\n    })\n    \n    return result\n\n# Apply to data with missing values - interpolation happens automatically\nrolling_stats = compute_rolling_stats(timeseries_data)\n\nprint(\"Rolling statistics computed on interpolated data:\")\nprint(rolling_stats.head(10))\n\n# Verify no missing values in the processed data\nprint(f\"\\nMissing values after interpolation: {rolling_stats.isna().sum().sum()}\")",
   "metadata": {},
   "outputs": [],
   "execution_count": null
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}