Run a structured exploratory data analysis before modeling. Use when starting a new dataset, debugging model failures, or auditing data quality.
---
name: EDA Playbook
description: Guides an analyst through a structured exploratory data analysis before any modeling begins. Apply when receiving a new dataset, debugging unexpected model behavior, or auditing data quality upstream.
---
# EDA Playbook
Skipping EDA before modeling is a common source of silent failures in ML pipelines: problems such as wrong dtypes, missing values, or a skewed label go unnoticed until the model misbehaves. This skill establishes a repeatable checklist that surfaces those problems early.
## 1. Shape and Schema Audit
Before anything else, verify the dataset dimensions and types match expectations.
- Confirm row count is in the expected range; flag if orders of magnitude off
- Check every column dtype; coerce or drop mismatched columns before downstream steps
- List columns with any null values and their null rates
- Flag columns with null rate above 20% for explicit handling decisions
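The checks above can be sketched with pandas. This is a minimal illustration, not a prescribed implementation: the helper name `audit_schema`, its parameters, and the shape of the returned dict are assumptions made for this example, and the expected dtypes and row range must come from whatever you know about the dataset.

```python
import pandas as pd


def audit_schema(df, expected_dtypes, expected_rows, null_threshold=0.20):
    """Summarize shape, dtype mismatches, and null rates (hypothetical helper).

    expected_dtypes: dict of column name -> dtype string, e.g. {"age": "float64"}
    expected_rows:   (low, high) inclusive range for a plausible row count
    """
    low, high = expected_rows

    # Columns whose actual dtype differs from the expected one.
    mismatches = {
        col: {"actual": str(df[col].dtype), "expected": want}
        for col, want in expected_dtypes.items()
        if col in df.columns and str(df[col].dtype) != want
    }
    # Expected columns that are absent altogether.
    missing = [col for col in expected_dtypes if col not in df.columns]

    # Fraction of nulls per column, keeping only columns that have any.
    null_rates = df.isna().mean()
    null_rates = null_rates[null_rates > 0].sort_values(ascending=False)

    return {
        "shape": df.shape,
        "row_count_ok": low <= len(df) <= high,
        "dtype_mismatches": mismatches,
        "missing_columns": missing,
        "null_rates": null_rates.to_dict(),
        # Columns above the threshold need an explicit handling decision.
        "high_null_columns": [c for c, r in null_rates.items() if r > null_threshold],
    }
```

The function only reports; deciding whether to coerce, drop, or impute a flagged column is left to the analyst.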
## 2. Target Variable Analysis
Understand the label distribution before touching features.
- For classification: compute class frequencies and imbalance ratio
- For regression: plot the full distribution; check skew and kurtosis
- Identify any label values that are implausible or out of range…
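As a rough sketch of the numeric parts of these checks, assuming the target is available as a pandas `Series`: the function names and the `valid_range` parameter are hypothetical, and plotting is left out so the example stays dependency-light.

```python
import pandas as pd


def summarize_class_target(y):
    """Class frequencies and majority/minority imbalance ratio (hypothetical helper)."""
    counts = y.value_counts()  # missing labels are excluded by default
    return {
        "counts": counts.to_dict(),
        "imbalance_ratio": counts.max() / counts.min(),
    }


def summarize_regression_target(y, valid_range=None):
    """Skew, excess kurtosis, and out-of-range count (hypothetical helper)."""
    summary = {
        "skew": y.skew(),      # sample skewness; 0 for a symmetric distribution
        "kurtosis": y.kurt(),  # excess kurtosis; about 0 for a normal distribution
    }
    if valid_range is not None:
        low, high = valid_range
        # Count non-null values outside the plausible range (bounds inclusive).
        out_of_range = y.notna() & ~y.between(low, high)
        summary["out_of_range"] = int(out_of_range.sum())
    return summary
```

The plausible range is domain knowledge (for example, a non-negative price); the code cannot infer it from the data.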