Pandas Work

Pandas Backtesting Framework
🔄
RESHAPING & PIVOTING
Your Biggest Gap - Priority #1
  • .pivot() - Long to wide format
  • .pivot_table() - Pivot with aggregation
  • .melt() - Wide to long transformation
  • .stack() / .unstack() - Index manipulation
  • pd.crosstab() - Cross-tabulation tables
  • .transpose() / .T - Matrix transposition
  • .explode() - Expand list-like columns
  • pd.concat() - Concatenate DataFrames
  • .merge() - SQL-style joins
  • .join() - Index-based joining
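
A minimal sketch of the long/wide round trip, assuming a hypothetical price table with one row per (date, ticker) pair. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format price data: one row per (date, ticker) pair.
long_df = pd.DataFrame({
    "date": ["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-03"],
    "ticker": ["AAA", "BBB", "AAA", "BBB"],
    "close": [10.0, 20.0, 11.0, 19.0],
})

# Long -> wide: one row per date, one column per ticker.
wide_df = long_df.pivot(index="date", columns="ticker", values="close")

# Wide -> long: melt() needs the index back as a regular column first.
round_trip = wide_df.reset_index().melt(
    id_vars="date", var_name="ticker", value_name="close"
)

# unstack() on a (date, ticker) MultiIndex produces the same wide layout.
via_unstack = long_df.set_index(["date", "ticker"])["close"].unstack("ticker")

# pivot_table() aggregates when an (index, column) pair repeats;
# here each ticker collapses to its mean close.
mean_close = long_df.pivot_table(index="ticker", values="close", aggfunc="mean")
```

.pivot() raises an error on duplicate (index, column) pairs, which is usually the cue to switch to .pivot_table() with an explicit aggfunc.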
ADVANCED GROUPBY
Build Beyond Basic .groupby()
  • .groupby().agg(custom_function) - Complex ops
  • .groupby().agg({'col': 'sum'}) - Multiple aggregations
  • .groupby().transform() - Broadcast back
  • .groupby().filter() - Group filtering
  • .groupby().rolling() - Rolling within groups
  • .groupby().nth() - Get nth row per group
  • .groupby().head() / .tail() - First/last n rows
  • .groupby().size() vs .count() - Differences
  • .sum() / .mean() / .std() - Basic aggregation
  • .value_counts() - Frequency analysis
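
A short sketch contrasting agg, transform, filter, and size vs count, assuming a hypothetical trades table with a ticker column and a pnl column that has one missing value.

```python
import pandas as pd

# Hypothetical per-trade P&L; the last BBB trade has no recorded value.
trades = pd.DataFrame({
    "ticker": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "pnl": [1.0, -2.0, 4.0, 3.0, None],
})

grouped = trades.groupby("ticker")

# Named aggregation: one output column per (source column, function) pair.
summary = grouped.agg(total_pnl=("pnl", "sum"), best_trade=("pnl", "max"))

# transform() returns a result aligned to the original rows, so it can be
# combined directly with the source column.
demeaned = trades["pnl"] - grouped["pnl"].transform("mean")

# filter() keeps or drops whole groups based on a group-level test.
big_groups = grouped.filter(lambda g: len(g) >= 3)

# size() counts rows per group; count() counts non-null values only.
sizes = grouped.size()
counts = grouped["pnl"].count()
```

The size vs count difference shows up only when a group contains missing values: here BBB has two rows but a single non-null pnl.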
🔤
STRING OPERATIONS
Almost Never Used - Big Gap
  • .str.extract() - Regex pattern extraction
  • .str.split() + .str.get() - String parsing
  • .str.replace(..., regex=True) - Regex replacements
  • .str.contains() - Complex patterns
  • .str.findall() - Find all matches
  • .fillna() / .dropna() - Handle missing values
  • .duplicated() / .drop_duplicates() - Clean duplicates
  • .isna() / .notna() - Missing data detection
  • .drop() - Remove columns/rows
  • .rename() - Rename columns/index
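
A sketch of cleaning followed by string parsing, assuming a hypothetical symbol column in the form TICKER_YYYYQn. The format and the sample values are illustrative only.

```python
import pandas as pd

# Hypothetical raw identifiers with one missing value and one duplicate.
raw = pd.DataFrame({
    "symbol": ["AAA_2024Q1", "BBB_2023Q4", None, "AAA_2024Q1"],
})

# Drop missing rows and exact duplicates before parsing.
clean = raw.dropna(subset=["symbol"]).drop_duplicates().reset_index(drop=True)

# extract() with named groups returns one column per group.
parts = clean["symbol"].str.extract(
    r"^(?P<ticker>[A-Z]+)_(?P<year>\d{4})Q(?P<quarter>\d)$"
)

# split() + get() picks one piece of a delimited string.
ticker = clean["symbol"].str.split("_").str.get(0)

# contains() builds a boolean mask from a regex pattern.
is_2024 = clean["symbol"].str.contains(r"_2024Q\d")

# replace() needs regex=True for pattern matching in pandas 2.0 and later,
# where the default treats the pattern as a literal string.
normalized = clean["symbol"].str.replace(r"Q(\d)$", r"-Q\1", regex=True)
```

Note that .str.extract() returns strings, so the year and quarter columns still need .astype(int) before any arithmetic.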
🎯
ADVANCED INDEXING
Beyond Basic .loc[] and .iloc[]
  • .xs() - Cross-section from MultiIndex
  • .query() - SQL-like filtering syntax
  • .eval() - Expression evaluation
  • .where() / .mask() - Conditional selection
  • MultiIndex operations
  • .loc[] / .iloc[] - Label/position indexing
  • .at[] / .iat[] - Single value access
  • .set_index() / .reset_index() - Index manipulation
  • .reindex() - Conform to new index
  • .sort_values() / .sort_index() - Sorting
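
A brief sketch of .xs(), .query(), .where(), and .eval() on a MultiIndex frame, assuming a hypothetical (date, ticker) index with close and volume columns.

```python
import pandas as pd

# Hypothetical price panel indexed by (date, ticker).
idx = pd.MultiIndex.from_product(
    [["2024-01-02", "2024-01-03"], ["AAA", "BBB"]], names=["date", "ticker"]
)
prices = pd.DataFrame(
    {"close": [10.0, 20.0, 11.0, 19.0], "volume": [100, 300, 150, 250]},
    index=idx,
)

# xs() takes a cross-section on one level and drops that level.
aaa = prices.xs("AAA", level="ticker")

# query() filters with an expression string; named index levels can be
# referenced as if they were columns.
liquid_bbb = prices.query("volume > 120 and ticker == 'BBB'")

# where() keeps the original shape and sets failing cells to NaN;
# mask() is the inverse.
above = prices["close"].where(prices["close"] > 10.5)

# eval() with an assignment returns a copy with the new column added.
with_notional = prices.eval("notional = close * volume")
```

Unlike boolean indexing with .loc[], .where() never changes the number of rows, which keeps the result aligned with the original index.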
🏷️
DATA TYPES & CATEGORICAL
Memory Optimization & Binning
  • .astype('category') - Categorical data
  • pd.cut() / pd.qcut() - Binning continuous data
  • pd.get_dummies() - One-hot encoding
  • .factorize() - Convert to numeric codes
  • Categorical operations (.cat accessor)
  • .select_dtypes() - Filter columns by type
  • .convert_dtypes() - Auto type inference
  • .info() / .describe() - Data overview
  • .dtypes / .shape - Basic properties
  • .memory_usage() - Memory profiling
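
A small sketch of categorical conversion, binning, and one-hot encoding, assuming a hypothetical frame with a sector label and a return column. The bin edges and labels are arbitrary choices for illustration.

```python
import pandas as pd

# Hypothetical sector labels and daily returns.
df = pd.DataFrame({
    "sector": ["tech", "energy", "tech", "tech", "energy", "tech"],
    "ret": [-0.03, 0.01, 0.04, -0.01, 0.02, 0.00],
})

# Repeated strings are stored once as categories plus integer codes.
as_cat = df["sector"].astype("category")
codes = as_cat.cat.codes

# cut() bins by fixed edges (right-inclusive by default), so 0.00 falls
# in the (-1.0, 0.0] "down" bucket.
buckets = pd.cut(df["ret"], bins=[-1.0, 0.0, 1.0], labels=["down", "up"])

# qcut() bins by quantiles, so the buckets hold roughly equal counts.
terciles = pd.qcut(df["ret"], q=3, labels=["low", "mid", "high"])

# get_dummies() expands a label column into one indicator column per value.
dummies = pd.get_dummies(df["sector"])
```

To see whether the categorical conversion actually saves memory on real data, compare df["sector"].memory_usage(deep=True) with as_cat.memory_usage(deep=True); the saving depends on how often values repeat.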
ADVANCED TIME SERIES
Beyond .shift() and .diff()
  • .resample() - Time-based grouping
  • .rolling() - Custom windows
  • .expanding() - Expanding windows
  • pd.date_range() - Complex frequencies
  • Advanced datetime indexing
  • .tz_localize() / .tz_convert() - Timezone handling
  • .dt accessor methods - Date/time operations
  • .interpolate() - Smart missing value filling
  • .head() / .tail() / .sample() - Data sampling
  • .corr() / .cov() - Statistical relationships
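
A compact sketch of interpolation, window functions, resampling, and timezone handling, assuming a hypothetical daily close series with one missing value.

```python
import pandas as pd

# Hypothetical daily closes with a gap on the third day.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
close = pd.Series([10.0, 11.0, None, 13.0, 12.0, 14.0], index=idx)

# Linear interpolation fills the gap from its neighbours.
filled = close.interpolate()

# rolling() uses a fixed-size trailing window; expanding() grows from
# the first observation.
sma3 = filled.rolling(window=3).mean()
running_max = filled.expanding().max()

# resample() groups by time bucket, here two-day bins keeping the last value.
two_day = filled.resample("2D").last()

# tz_localize() attaches a timezone to naive timestamps;
# tz_convert() then shifts them to another zone.
utc = filled.tz_localize("UTC")
new_york = utc.tz_convert("America/New_York")
```

In a backtest, rolling windows of size n leave the first n - 1 rows as NaN, so signals built on them should be dropped or masked before computing returns.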