Data · Official · Verified

Experiment Tracking

by Skill Me

Set up disciplined ML experiment tracking for reproducibility and comparison. Use when starting a new model project, onboarding a team, or when results cannot be reproduced.

mlops, reproducibility, experiment-tracking, versioning, ml

SKILL.md preview

---
name: Experiment Tracking
description: Establishes a disciplined experiment tracking setup covering logging, artifact versioning, and reproducibility standards. Apply when bootstrapping a new ML project, onboarding a team to a tracking tool, or investigating why a result cannot be reproduced.
---

# Experiment Tracking

An experiment no one can reproduce is a result no one can trust. Disciplined tracking is not overhead — it is the minimum viable scientific practice for ML.

## 1. What to Log on Every Run

Inconsistent logging is as bad as no logging.

- Hyperparameters: every value, including defaults — do not rely on code to reconstruct them
- Dataset: name, version or hash, split sizes, and any sampling applied
- Code: git commit SHA; fail the run if the working tree is dirty unless explicitly allowed
- Environment: Python version and key library versions (at minimum the framework, numpy, and pandas)
- Metrics: train, validation, and test values; log per-epoch curves for iterative models
- Artifacts: model checkpoint path, preprocessor path, and evaluation report path
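The checklist above can be sketched as a single run record assembled before training starts. This is a minimal illustration using only the Python standard library, not the API of any particular tracking tool; the function names (`git_state`, `file_sha256`, `library_versions`, `build_run_record`, `save_run_record`), the `allow_dirty` flag, and the record layout are all assumptions made for this example.

```python
import hashlib
import json
import platform
import subprocess
from importlib import metadata


def git_state(repo_dir="."):
    """Return (commit_sha, is_dirty) for the git repository at repo_dir."""
    sha = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout.strip()
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    # Any porcelain output means uncommitted or untracked changes.
    return sha, bool(status.strip())


def file_sha256(path, chunk_size=1 << 20):
    """Hash a dataset file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def library_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_run_record(hyperparams, dataset, metrics, artifacts,
                     commit_sha, is_dirty, allow_dirty=False,
                     packages=("numpy", "pandas")):
    """Assemble one run record; refuse a dirty working tree unless allowed."""
    if is_dirty and not allow_dirty:
        raise RuntimeError("working tree is dirty; commit or pass allow_dirty=True")
    return {
        "hyperparameters": dict(hyperparams),  # every value, defaults included
        "dataset": dict(dataset),              # name, version or hash, split sizes
        "code": {"commit": commit_sha, "dirty": is_dirty},
        "environment": {
            "python": platform.python_version(),
            "libraries": library_versions(packages),
        },
        "metrics": dict(metrics),
        "artifacts": dict(artifacts),
    }


def save_run_record(record, path):
    """Write the record as sorted, indented JSON so runs diff cleanly."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2, sort_keys=True)
```

A typical call would pass the result of `git_state()` into `build_run_record(...)` along with the fully resolved hyperparameters, then hand the record to `save_run_record` or to whichever tracker the project uses. Add the training framework's package name to `packages` so its version is captured too.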

## 2. Experiment Naming and Organization

Naming is the index — garbage names make the tracker useless.

- Format: <project>/<hypothesis>/<variant> (e.g., churn/feature-selection/drop-low-variance)…
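
One way to keep names in the `<project>/<hypothesis>/<variant>` format is to build them through a small helper instead of typing them by hand. The sketch below is hypothetical: the `experiment_name` function and the lowercase kebab-case rule for each segment are assumptions inferred from the single example above, not rules stated by the skill.

```python
import re

# Assumed convention: lowercase letters and digits, words joined by single hyphens.
_SEGMENT = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def experiment_name(project, hypothesis, variant):
    """Join three validated segments into <project>/<hypothesis>/<variant>."""
    parts = (project, hypothesis, variant)
    for part in parts:
        if not _SEGMENT.fullmatch(part):
            raise ValueError(f"invalid experiment name segment: {part!r}")
    return "/".join(parts)


name = experiment_name("churn", "feature-selection", "drop-low-variance")
```

Rejecting a malformed segment at creation time is cheaper than cleaning up inconsistent names in the tracker later.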
