OSS maintenance triage, research metrics, and live repository overview
ML Evaluation Workspace
Keep the model story clear: one page for the signal, one for the repos, one for the artifacts.
Overview stays focused on AUROC, Brier, inactivity rate, and calibration. Dataset, Repositories, and Runs hold the deeper inspection views so the workflow stays readable.
Overview
Live quality snapshot
Dataset
Coverage and feature inventory
Repositories
Stars, notes, and activity
Runs
Cached artifacts and splits
Latest training result without the dashboard sprawl.
Trigger training here, keep the key metrics above the fold, and push data inspection and artifact history into their own pages.
Training base
0 snapshots
0 repositories in the current base
Dataset hash
Pending
No cached artifact yet
Time-aware split
Pending
Split appears after the first completed run
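As an illustration of what a time-aware split usually means, here is a minimal sketch that holds out the most recent snapshots for evaluation. The function name `time_aware_split`, the `snapshot_date` key, and the `holdout_fraction` parameter are assumptions made for this example; the workspace's actual split logic is not shown on this page.

```python
from datetime import date


def time_aware_split(rows, holdout_fraction=0.2):
    """Split rows so every held-out row is at least as recent as every training row.

    `rows` is a list of dicts with a `snapshot_date` key (hypothetical schema).
    """
    if not rows:
        return [], []
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")

    ordered = sorted(rows, key=lambda r: r["snapshot_date"])
    n_holdout = max(1, round(len(ordered) * holdout_fraction))
    # The cutoff is the date of the first row that falls in the held-out tail.
    cutoff = ordered[len(ordered) - n_holdout]["snapshot_date"]

    # Splitting on the date (not the index) keeps rows that share a
    # snapshot date on the same side of the boundary.
    train = [r for r in ordered if r["snapshot_date"] < cutoff]
    held_out = [r for r in ordered if r["snapshot_date"] >= cutoff]
    return train, held_out


snapshots = [
    {"repo": "a", "snapshot_date": date(2023, 1, 1)},
    {"repo": "b", "snapshot_date": date(2023, 2, 1)},
    {"repo": "c", "snapshot_date": date(2023, 2, 1)},
    {"repo": "d", "snapshot_date": date(2023, 3, 1)},
]
train, held_out = time_aware_split(snapshots, holdout_fraction=0.5)
```

Splitting on a cutoff date rather than a row index means the held-out slice can be slightly larger than the requested fraction, but no snapshot date is ever shared between training and evaluation.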
Latest Artifact
What changed in the current training picture
Model
Waiting for first run
Observed window
Waiting for first completed artifact
Labeled rows
0 labeled / 0 total
Feature count
0 features in the latest artifact
Trigger the first run to cache a live evaluation artifact and populate the dataset and run-history pages.
Quality
Pending
Combined held-out score from AUROC skill and Brier skill.
AUROC
Pending
Ranking quality on the held-out evaluation slice.
Brier
Pending
Calibration-sensitive probability error. Lower is better.
Inactive 12m rate
Pending
Share of rows in the current held-out slice labeled inactive for 12 months (the positive-class base rate).
F1
Pending
Thresholded balance of precision and recall.
Precision
Pending
How often predicted inactivity is correct.
Recall
Pending
Share of truly inactive repositories the model catches.
Log loss
Pending
Penalty for overconfident wrong probabilities.
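The metric cards above can be sketched with their standard textbook definitions in plain Python. The function names, the 0.5 threshold, and in particular the way `quality_score` combines the two skill terms (an equal-weight mean of `2 * AUROC - 1` and a Brier skill score against a base-rate forecast) are assumptions for illustration; this page does not state how the workspace actually computes Quality.

```python
import math


def auroc(y_true, y_prob):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier(y_true, y_prob):
    """Mean squared error between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Thresholded metrics; the 0.5 default is an assumption."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = p >= threshold
        tp += predicted and y == 1
        fp += predicted and y == 0
        fn += (not predicted) and y == 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def quality_score(y_true, y_prob):
    """Hypothetical combination: equal-weight mean of AUROC skill and Brier skill."""
    base_rate = sum(y_true) / len(y_true)  # the "Inactive 12m rate"
    reference = base_rate * (1 - base_rate)  # Brier of always predicting the base rate
    auroc_skill = 2 * auroc(y_true, y_prob) - 1
    brier_skill = 1 - brier(y_true, y_prob) / reference
    return (auroc_skill + brier_skill) / 2


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
print(auroc(labels, probs), brier(labels, probs), precision_recall_f1(labels, probs))
```

Both skill terms are scaled so that 0 means "no better than a trivial baseline" and 1 means perfect, which is what makes averaging them meaningful in this sketch.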
Calibration
No calibration artifact yet
Once a completed run produces evaluation bins, the reliability curve will render here from the cached artifact.
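For context, a reliability curve is typically built by grouping held-out predictions into probability bins and comparing the mean predicted probability with the observed positive rate in each bin. The sketch below assumes equal-width bins and hypothetical field names (`mean_predicted`, `observed_rate`); the format of the workspace's cached evaluation bins is not described on this page.

```python
def reliability_bins(y_true, y_prob, n_bins=10):
    """Group predictions into equal-width probability bins (empty bins are skipped)."""
    acc = [{"count": 0, "prob_sum": 0.0, "positives": 0} for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        acc[idx]["count"] += 1
        acc[idx]["prob_sum"] += p
        acc[idx]["positives"] += y

    bins = []
    for idx, b in enumerate(acc):
        if b["count"] == 0:
            continue
        bins.append({
            "bin": idx,
            "count": b["count"],
            "mean_predicted": b["prob_sum"] / b["count"],
            "observed_rate": b["positives"] / b["count"],
        })
    return bins


def expected_calibration_error(bins):
    """Count-weighted average gap between predicted and observed rates."""
    total = sum(b["count"] for b in bins)
    gap = sum(b["count"] * abs(b["mean_predicted"] - b["observed_rate"]) for b in bins)
    return gap / total


curve = reliability_bins([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9], n_bins=2)
ece = expected_calibration_error(curve)
```

Plotting `mean_predicted` against `observed_rate` gives the reliability curve; a perfectly calibrated model lies on the diagonal.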
Metric Guide
Read the top-line metrics without leaving the page
Quality
AUROC
Brier score
Inactive 12m rate
F1
Precision
Recall
Calibration
Metric history
No cached run history yet
Once you have more than one cached training run with metrics, the run-history trend chart will appear here.