Design reusable, leakage-safe features and a feature store schema. Use when building ML infrastructure, sharing features across models, or preventing training-serving skew.
---
name: Feature Store Design
description: Guides the design of a reusable, leakage-safe feature store including entity modeling, transformation versioning, and point-in-time correctness. Apply when sharing features across models, building ML platform infrastructure, or diagnosing training-serving skew.
---
# Feature Store Design
Feature stores exist to solve three problems at once: reuse across models, point-in-time correctness during training, and consistency between training and serving. A design that solves only one of these is incomplete.
## 1. Entity and Feature Naming Conventions
Consistent naming is load-bearing: it is the API contract for every downstream model.
- Entity format: `<entity-type>_id` (e.g., `user_id`, `item_id`, `session_id`)
- Feature format: `<entity-type>__<source>__<transformation>__<window>` (e.g., `user__payments__sum__30d`)
- Avoid abbreviations not defined in a project glossary
- Each feature must have exactly one owning team or owner string in metadata
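
The conventions above can be enforced mechanically at registration time. The following is a minimal sketch rather than a prescribed implementation: the allowed character set, the window syntax (an integer followed by one of `s`, `m`, `h`, `d`, `w`), the `owner` metadata key, and all function names are assumptions made for illustration.

```python
import re

# Assumption: each name segment is lowercase alphanumeric, with single
# underscores allowed inside a segment and double underscores between segments.
_PART = r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*"

ENTITY_RE = re.compile(rf"^{_PART}_id$")
FEATURE_RE = re.compile(
    rf"^(?P<entity>{_PART})__(?P<source>{_PART})"
    rf"__(?P<transformation>{_PART})__(?P<window>\d+[smhdw])$"
)


def parse_feature_name(name: str) -> dict[str, str]:
    """Split a feature name into its four segments, or raise ValueError."""
    match = FEATURE_RE.match(name)
    if match is None:
        raise ValueError(f"feature name does not follow the convention: {name!r}")
    return match.groupdict()


def entity_key(feature_name: str) -> str:
    """Return the entity ID column a feature is keyed on, e.g. 'user_id'."""
    return parse_feature_name(feature_name)["entity"] + "_id"


def validate_owner(metadata: dict) -> str:
    """Require a single non-empty owner string (hypothetical 'owner' key)."""
    owner = metadata.get("owner")
    if not isinstance(owner, str) or not owner.strip():
        raise ValueError("feature metadata needs one non-empty 'owner' string")
    return owner


parts = parse_feature_name("user__payments__sum__30d")
key = entity_key("user__payments__sum__30d")
```

Running a check like this in CI or in the feature registration path rejects malformed names before any model depends on them.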
## 2. Point-in-Time Correctness
The most common feature store bug is silent label leakage from future data.
- Every feature retrieval must accept an as-of timestamp parameter
- Offline training joins must use point-in-time correct lookups, never a naive left join on entity ID
- Audit any feature derived from a slowly-changing dimension for the correct SCD type…
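
To illustrate the as-of lookup, here is a small pure-Python sketch. It assumes a hypothetical in-memory feature log holding, per entity, `(event_time, value)` rows sorted by event time. The names `FeatureLog`, `get_feature_as_of`, and `build_training_rows` are invented for this example, and treating a value recorded exactly at the as-of timestamp as visible is also an assumption; some pipelines require a strictly earlier timestamp.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

# Hypothetical offline store: entity ID -> rows sorted by event_time.
FeatureLog = dict[str, list[tuple[datetime, float]]]


def get_feature_as_of(
    log: FeatureLog, entity_id: str, as_of: datetime
) -> Optional[float]:
    """Return the latest value with event_time <= as_of, or None if none exists."""
    rows = log.get(entity_id, [])
    event_times = [event_time for event_time, _ in rows]
    # bisect_right gives the count of rows at or before as_of.
    idx = bisect_right(event_times, as_of)
    return rows[idx - 1][1] if idx > 0 else None


def build_training_rows(labels, log: FeatureLog):
    """Attach to each label the feature value as of that label's timestamp."""
    return [
        (entity_id, label_time, get_feature_as_of(log, entity_id, label_time), label)
        for entity_id, label_time, label in labels
    ]


log: FeatureLog = {
    "u1": [
        (datetime(2024, 1, 1), 10.0),
        (datetime(2024, 2, 1), 25.0),  # recorded after the label below
    ]
}
labels = [("u1", datetime(2024, 1, 15), 1)]
training_rows = build_training_rows(labels, log)
```

A naive join on `entity_id` alone would pair the January 15 label with the February value of 25.0, which was not yet known at label time. The as-of lookup returns 10.0 instead.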