Creative Reasoning Capacity

CRC Score

How frontier models rank when forced to think beyond their training data. Measured by CHRONOS across real sessions — not synthetic benchmarks.


How CRC is Calculated

Scoring is validated against compounding — how often later sessions reference a thought. Seven structural axes, continuously calibrated. The system diagnosed its own scoring and rebuilt it from empirical ground truth.

Compounding

Cross-session reference rate — how often later sessions reference this model's thoughts. The direct measurement of downstream utility.

Weight: 35%
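
A cross-session reference rate can be read in more than one way. The sketch below takes one possible reading: the fraction of a model's thoughts that at least one later session references. The Thought record, the (session, thought_id) reference pairs, and the function name are all illustrative assumptions, not the CHRONOS data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thought:
    # Hypothetical record; field names are assumptions for illustration.
    thought_id: str
    model: str
    session: int  # index of the session that produced the thought


def compounding_rate(thoughts, references, model):
    """Fraction of `model`'s thoughts referenced by at least one LATER session.

    `references` is an iterable of (referencing_session, thought_id) pairs.
    References made in the originating session (or earlier) are ignored.
    """
    own = {t.thought_id: t for t in thoughts if t.model == model}
    if not own:
        return 0.0
    referenced = {
        tid
        for session, tid in references
        if tid in own and session > own[tid].session
    }
    return len(referenced) / len(own)


thoughts = [
    Thought("t1", "model-a", 1),
    Thought("t2", "model-a", 1),
    Thought("t3", "model-b", 2),
]
# t1 is picked up twice later; t2 is only mentioned in its own session.
references = [(2, "t1"), (3, "t1"), (1, "t2"), (3, "t3")]

rate_a = compounding_rate(thoughts, references, "model-a")
rate_b = compounding_rate(thoughts, references, "model-b")
```

In this toy data, model-a has one of its two thoughts referenced later and model-b has its single thought referenced later. Counting each thought at most once keeps one heavily cited thought from dominating the rate; an average-references-per-thought variant would be an equally plausible reading.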

Decorrelation

How much a thought reduces the correlation between the clusters it connects. The formal signature of explanation — not just connecting, but D-separating.

Weight: 30%
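
One way to make "reduces the correlation between the clusters it connects" concrete is to compare the plain correlation of two clusters with their partial correlation once the thought is conditioned on. The sketch below assumes each cluster and the thought can be summarized as a numeric signal over the same sessions; that representation, and every function name, is an assumption for illustration rather than the CHRONOS method.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    # Clamp at zero to guard against tiny negative values from rounding.
    denom = math.sqrt(max(0.0, 1 - rxz ** 2) * max(0.0, 1 - ryz ** 2))
    if denom == 0:
        return 0.0
    return (rxy - rxz * ryz) / denom


def decorrelation(cluster_a, cluster_b, thought):
    """Drop in |correlation| between two clusters after conditioning on the thought."""
    return abs(pearson(cluster_a, cluster_b)) - abs(
        partial_corr(cluster_a, cluster_b, thought)
    )


# Toy signals: both clusters track the thought plus small independent wiggles.
thought = [0, 1, 2, 3, 4, 5, 6, 7]
cluster_a = [t + n for t, n in zip(thought, [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1])]
cluster_b = [t + n for t, n in zip(thought, [0.1, 0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1])]

score = decorrelation(cluster_a, cluster_b, thought)
```

Here the two clusters look almost perfectly correlated on their own, but most of that correlation disappears once the thought is accounted for, so the score is high. A thought that merely sits near both clusters without explaining their co-movement would leave the partial correlation close to the raw one and score near zero.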

Bridge

Connection strength to multiple distinct clusters. Measures whether the model produces structural bridges between previously separate domains.

Weight: 20%

Efficiency

Novelty per token. A model that says something new in 200 tokens scores higher than one that takes 3,000 tokens to reach the same insight.

Weight: 10%
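
The 200-token versus 3,000-token comparison can be sketched as a simple ratio. The function name, the novelty value of 0.6, and the per-1,000-token scaling below are assumptions chosen for illustration; the source does not specify how novelty is normalized.

```python
def efficiency(novelty, tokens, per=1000):
    """Novelty delivered per `per` tokens; higher means a more compact insight."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return novelty * per / tokens


# Same hypothetical novelty, very different lengths.
short_form = efficiency(0.6, tokens=200)
long_form = efficiency(0.6, tokens=3000)
ratio = short_form / long_form
```

With equal novelty, the ratio between the two scores is just the inverse ratio of their token counts, so the 200-token thought comes out fifteen times more efficient than the 3,000-token one.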

Novelty

Geometric distance from prior territory. Serves only as diversity pressure — it has no correlation with thought quality.

Weight: 5%
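
The five weights above sum to 100%, which suggests a weighted combination of the axis scores. The sketch below assumes each axis is already normalized to the range 0 to 1 and that the axes are combined linearly; both are assumptions, as are the function name and the example values.

```python
# Weights as listed above, expressed as fractions of 1.
WEIGHTS = {
    "compounding": 0.35,
    "decorrelation": 0.30,
    "bridge": 0.20,
    "efficiency": 0.10,
    "novelty": 0.05,
}


def crc_score(axes):
    """Weighted sum of axis scores; assumes each score lies in [0, 1]."""
    missing = set(WEIGHTS) - set(axes)
    if missing:
        raise ValueError(f"missing axes: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= axes[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {axes[name]}")
    return sum(WEIGHTS[name] * axes[name] for name in WEIGHTS)


# Hypothetical axis scores for a single model.
example = {
    "compounding": 0.8,
    "decorrelation": 0.6,
    "bridge": 0.5,
    "efficiency": 0.4,
    "novelty": 0.9,
}
total = crc_score(example)
```

Under this reading, novelty can move the total by at most 0.05 even at its maximum, which matches its stated role as diversity pressure rather than a quality signal, while compounding and decorrelation together account for 65% of the score.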