THE IMPOSSIBLE CLASSIFIER

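Z/2310Z ≅ Z/2 x Z/3 x Z/5 x Z/7 x Z/11 by the Chinese Remainder Theorem (CRT), since 2310 = 2*3*5*7*11 is a product of distinct primes. This CRT decomposition enables zero-shot generalization.
Train on 10% of attribute combinations. Test on the remaining 90%, all of which are UNSEEN combos.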
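
The decomposition can be made concrete with a minimal sketch. The names `PRIMES`, `encode`, and `decode` are illustrative assumptions and are not taken from demo_classifier.c. The sketch only shows two things: each of the 2310 labels corresponds to exactly one tuple of residues, and a random 10% training sample still covers every residue value in every channel, which is the property a per-channel head would rely on.

```python
import random
from math import prod

PRIMES = (2, 3, 5, 7, 11)   # hypothetical name: moduli of the five channels
N = prod(PRIMES)            # 2310

def encode(label):
    """Map a label in [0, N) to its residues, one per prime channel."""
    return tuple(label % p for p in PRIMES)

def decode(residues):
    """Recombine residues into the unique label in [0, N) via CRT."""
    label = 0
    for p, r in zip(PRIMES, residues):
        m = N // p                      # product of the other moduli
        label += r * m * pow(m, -1, p)  # pow(m, -1, p) is the inverse of m mod p
    return label % N

# Round trip: every label is recovered from its residues.
assert all(decode(encode(n)) == n for n in range(N))

# A 10% training split (seed chosen arbitrarily, for reproducibility only).
rng = random.Random(0)
train = set(rng.sample(range(N), N // 10))
unseen = [n for n in range(N) if n not in train]

# For each channel, check that training covers all of its residue values.
channel_coverage = [
    {encode(n)[i] for n in train} == set(range(p))
    for i, p in enumerate(PRIMES)
]
```

With 231 training labels and at most 11 values per channel, a random split is very likely to cover every residue, so an unseen combination is a new tuple built entirely from residue values that were each seen during training.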
CRT MODEL (unseen combos): 928 params (5 heads: 2+3+5+7+11)
STANDARD MODEL (unseen combos): 76230 params (1 flat head: 2310 classes)
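
The gap between the two models comes from output counts: five heads need 2+3+5+7+11 = 28 output units, while one flat head needs 2310. Below is a minimal sketch of how those counts grow across primorial levels. `head_outputs` is a hypothetical helper. The sketch counts output units only, not full parameter totals, which depend on layer sizes this page does not state.

```python
from math import prod

def head_outputs(primes):
    """Return (total CRT head outputs, flat head outputs) for the given moduli."""
    return sum(primes), prod(primes)

# Successive primorial levels: 2, 2*3, 2*3*5, ...
first_primes = [2, 3, 5, 7, 11, 13]
levels = [head_outputs(first_primes[:k]) for k in range(1, len(first_primes) + 1)]

for crt_outputs, flat_outputs in levels:
    print(f"CRT outputs: {crt_outputs:3d}   flat outputs: {flat_outputs:6d}")
```

The fifth level is the 2310-class case shown here. Adding the next prime, 13, raises the CRT output count by 13 and multiplies the flat output count by 13.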

CRT CHANNEL ACCURACY (per-attribute, unseen combos)
[chart not reproduced]

TRAINING PROGRESS
[chart not reproduced: CRT train accuracy vs. STD train accuracy]

Paradigm Contrast

Aspect | Conventional ML | CRT Classifier
Zero-shot (unseen combos) | 0%: cannot generalize beyond training distribution | 97.6%: channels generalize independently
Parameters | 76,230 (flat softmax over all classes) | 928 (5 small heads: 2+3+5+7+11 classes)
Why it works | Memorizes combinations seen in training | Learns attributes, composes via CRT
Scaling | Parameters grow as O(N) with class count N | Parameters grow as O(sum of primes), which is polylogarithmic in N for primorial N
Paradigm | More data, more params, more compute | Structure IS generalization: the ring does the work
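
The "learns attributes, composes via CRT" idea can be sketched as follows. This is an illustrative assumption about the inference step, not code from demo_classifier.c or true_classifier.c: each head scores the residues of its own prime, the highest-scoring residue is taken per channel, and the five residues are recombined into one label in [0, 2310). The names `compose`, `argmax`, and `one_hot_scores` are hypothetical, and `one_hot_scores` merely stands in for trained heads.

```python
from math import prod

PRIMES = (2, 3, 5, 7, 11)   # hypothetical name: moduli of the five channels
N = prod(PRIMES)            # 2310

def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=scores.__getitem__)

def compose(channel_scores):
    """Pick the best residue per channel, then recombine via CRT."""
    residues = [argmax(s) for s in channel_scores]
    label = 0
    for p, r in zip(PRIMES, residues):
        m = N // p                      # product of the other moduli
        label += r * m * pow(m, -1, p)  # pow(m, -1, p) is the inverse of m mod p
    return label % N

def one_hot_scores(label):
    """Stand-in for trained heads (assumption): ideal per-channel scores."""
    return [[1.0 if r == label % p else 0.0 for r in range(p)] for p in PRIMES]

predicted = compose(one_hot_scores(1234))
```

In this sketch the composed label is correct exactly when all five channels pick the correct residue, so the combined prediction never needs to have seen that particular combination as a whole.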

Source: demo_classifier.c (409 lines), true_classifier.c (TRUE FORM). Verified across all primorial levels.