Data · Official · Verified

Feature Store Design

by Skill Me

Design reusable, leakage-safe features and a feature store schema. Use when building ML infrastructure, sharing features across models, or preventing training-serving skew.

feature-store, feature-engineering, mlops, leakage, data-pipeline

SKILL.md preview

---
name: Feature Store Design
description: Guides the design of a reusable, leakage-safe feature store including entity modeling, transformation versioning, and point-in-time correctness. Apply when sharing features across models, building ML platform infrastructure, or diagnosing training-serving skew.
---

# Feature Store Design

Feature stores exist to solve three problems at once: reuse across models, point-in-time correctness during training, and consistency between training and serving. A design that solves only one of these is incomplete.

## 1. Entity and Feature Naming Conventions

Consistent naming is load-bearing: it is the API contract for every downstream model.

- Entity format: <entity-type>_id (e.g., user_id, item_id, session_id)
- Feature format: <entity-type>__<source>__<transformation>__<window> (e.g., user__payments__sum__30d)
- Avoid abbreviations not defined in a project glossary
- Each feature must have exactly one owning team or owner string in metadata
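The conventions above can be enforced mechanically at registration time. The following is a minimal sketch, not a definitive implementation: the function names (`parse_feature_name`, `entity_key`, `check_owner`), the `owner` metadata key, the lowercase snake_case rule for segments, and the set of window units are all assumptions made for illustration.

```python
import re
from typing import NamedTuple

# Assumption: segments are lowercase snake_case with single underscores only,
# so the double underscore stays unambiguous as the separator.
SEGMENT = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")
# Assumption: a window is a positive integer plus a unit (m, h, d, w), e.g. 30d.
WINDOW = re.compile(r"[1-9][0-9]*[mhdw]")


class FeatureName(NamedTuple):
    entity_type: str
    source: str
    transformation: str
    window: str


def parse_feature_name(name: str) -> FeatureName:
    """Split <entity-type>__<source>__<transformation>__<window> and validate it."""
    parts = name.split("__")
    if len(parts) != 4:
        raise ValueError(f"{name!r}: expected 4 segments separated by '__', got {len(parts)}")
    entity_type, source, transformation, window = parts
    for label, segment in (
        ("entity-type", entity_type),
        ("source", source),
        ("transformation", transformation),
    ):
        if not SEGMENT.fullmatch(segment):
            raise ValueError(f"{name!r}: invalid {label} segment {segment!r}")
    if not WINDOW.fullmatch(window):
        raise ValueError(f"{name!r}: invalid window {window!r}")
    return FeatureName(entity_type, source, transformation, window)


def entity_key(feature: FeatureName) -> str:
    """Derive the entity column name, following the <entity-type>_id convention."""
    return f"{feature.entity_type}_id"


def check_owner(metadata: dict) -> str:
    """Require a single non-empty owner string (hypothetical 'owner' key)."""
    owner = metadata.get("owner")
    if not isinstance(owner, str) or not owner.strip():
        raise ValueError("feature metadata must carry exactly one non-empty owner string")
    return owner


feature = parse_feature_name("user__payments__sum__30d")
print(feature, entity_key(feature))
```

Running a check like this in CI or at feature registration keeps malformed names from ever reaching downstream models. The glossary rule for abbreviations is not covered here and would need a project-specific word list.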

## 2. Point-in-Time Correctness

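A common and hard-to-detect feature store bug is silent label leakage from future data: a training row is built with feature values that were not yet available at prediction time.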

- Every feature retrieval must accept an as-of timestamp parameter
- Offline training joins must use point-in-time correct lookups, never a naive left join on entity ID, which can attach feature values recorded after the label event
- Audit any feature derived from a slowly-changing dimension for the correct SCD type…
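To illustrate the as-of lookup described in this section, here is a minimal in-memory sketch using only the Python standard library. The `FeatureHistory` layout, the function names, and the sample values are hypothetical; a real store would back this with an indexed table rather than a dictionary.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: entity id -> list of (event_time, value), sorted by event_time.
FeatureHistory = Dict[int, List[Tuple[datetime, float]]]


def get_feature_as_of(history: FeatureHistory, entity_id: int, as_of: datetime) -> Optional[float]:
    """Return the latest value with event_time <= as_of, or None if none exists yet."""
    rows = history.get(entity_id, [])
    times = [ts for ts, _ in rows]
    idx = bisect_right(times, as_of)  # number of rows at or before as_of
    return rows[idx - 1][1] if idx > 0 else None


def build_training_rows(
    labels: List[Tuple[int, datetime, int]], history: FeatureHistory
) -> List[Tuple[int, datetime, Optional[float], int]]:
    """Attach to each (entity_id, as_of, label) only the value known at as_of."""
    return [
        (entity_id, as_of, get_feature_as_of(history, entity_id, as_of), label)
        for entity_id, as_of, label in labels
    ]


# Sample history for the feature user__payments__sum__30d (made-up values).
history: FeatureHistory = {
    1: [(datetime(2024, 1, 1), 100.0), (datetime(2024, 2, 1), 250.0)],
    2: [(datetime(2024, 1, 5), 40.0), (datetime(2024, 1, 20), 999.0)],
}
labels = [
    (1, datetime(2024, 1, 10), 0),
    (2, datetime(2024, 1, 15), 0),  # must not see the 2024-01-20 value
    (1, datetime(2024, 2, 10), 1),
]
rows = build_training_rows(labels, history)
for row in rows:
    print(row)
```

A naive join on `user_id` alone would pair the second label with the 2024-01-20 value, which did not exist at prediction time. For tabular data, `pandas.merge_asof` with `by` set to the entity column and `direction="backward"` performs the same kind of join. Whether a value stamped exactly at the as-of time counts as available is a design choice; this sketch includes it.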
