If the reversible kernel cannot mix at convergence, the natural next question is whether it ever could — so we walked the same trained trajectory backward in time and measured at four checkpoints.
The question
Exp 4 established that on a converged 60_12 DTM the reversible negative-phase kernel does not equilibrate: out to large lattice depth, so the integrated autocorrelation time required by assumption is never finite. The open follow-up: is non-equilibration a property of the converged model only — a deep-checkpoint pathology that earlier, less-trained models might escape — or is it fundamental on this substrate, present from the start? That distinguishes the two horns of the spine's "fundamentality open" question.
The setup
One cumulative training trajectory (single-input 60_12 DTM-MNIST, seed 0, `N_CHAINS=32`, vanilla ACP-off, reversible patch live throughout), probed at four checkpoints: epochs t = 25, 50, 100, 200. Each probe runs exp4's non-circular doubling-stability rule (half-Sokal, `TAU_TOL=0.15`, `SOKAL_C=5`) under a per-checkpoint ceiling of `P0_CKPT_CEILING_H=0.75` GPU-h. Measurement-only: the harness writes `sweep_calibrate.json` and never a budget or verdict. Substrate: Lightning H200; run COMPLETE in 5.825 h. See `experiments/exp6-checkpoint-tau-sweep/`.
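As a rough illustration of what a doubling-stability probe of this kind does, here is a minimal pure-Python sketch. It is not exp4's implementation: the function names are hypothetical, the window is the standard Sokal self-consistent window (the entry's exact "half-Sokal" condition is not reproduced), and the relative-difference formula is an assumption. Only the constants `TAU_TOL=0.15` and `SOKAL_C=5` come from the text.

```python
TAU_TOL = 0.15  # doubling tolerance, value quoted in the text
SOKAL_C = 5     # Sokal window constant, value quoted in the text


def autocorr(xs, lag):
    """Normalized sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) * (x - mean) for x in xs) / n
    if var == 0.0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def tau_int_sokal(xs, c=SOKAL_C):
    """Return (tau, window_ok) using Sokal's self-consistent window.

    tau(M) = 1/2 + sum_{k=1..M} rho(k); accept the first M with M >= c * tau(M).
    window_ok is False when no such M exists within the first half of xs.
    """
    tau = 0.5
    for m in range(1, len(xs) // 2):
        tau += autocorr(xs, m)
        if m >= c * tau:
            return tau, True
    return tau, False


def doubling_verdict(history, tol=TAU_TOL):
    """history: [(length, tau, window_ok), ...] at successive doublings.

    Assumed rule: stabilized at the first doubling whose tau is within tol
    (relative) of the previous one AND whose Sokal window closed.
    """
    for (_, prev, _), (n, tau, ok) in zip(history, history[1:]):
        scale = max(abs(tau), abs(prev))
        if ok and scale > 0 and abs(tau - prev) / scale <= tol:
            return ("stabilized", n, tau)
    return ("UNRESOLVED", None, None)


def doubling_probe(make_chain, start_len, max_len):
    """Estimate tau at start_len, 2*start_len, ... up to max_len."""
    history = []
    n = start_len
    while n <= max_len:
        tau, ok = tau_int_sokal(make_chain(n))
        history.append((n, tau, ok))
        n *= 2
    return doubling_verdict(history), history
```

The point of the non-circular design is that the verdict comes from comparing estimates at independent lengths rather than trusting a single window: a series whose correlations never decay inside the window (a linear ramp is the extreme case) returns `UNRESOLVED` at every length instead of a spuriously small tau.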
Two provenance invariants are PROVEN, not asserted:
- Cumulative trajectory (`cumulative_training_proven=True`): `opt_count` increments ( batches), monotone, weights-hash distinct per checkpoint, LR advances from 0.030 (t=25) to the 0.010 floor (t≥50), never re-ramped.
- Probe-RNG isolation (`probe_rng_isolated_proven=True`): the `key_timeline` shows `dtm.key` identical before/after every probe, advancing only across train chunks. The interleaved probe consumed a probe-local `jr.PRNGKey(SEED)` chain and never touched the training stream, which is what makes the ladder bit-equivalent to a single `train(200)` and the t=200 anchor interpretable.
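The two invariants above amount to simple checks over logged metadata. The sketch below shows one way they could be expressed. The record layouts (`opt_count`, `weights_hash`, `lr` fields; `(event, key_before, key_after)` tuples) are hypothetical stand-ins, not the harness's actual schema, and "never re-ramped" is interpreted here as "LR is non-increasing across checkpoints".

```python
def cumulative_training_ok(checkpoints):
    """checkpoints: list of dicts in training order (hypothetical fields).

    Checks: optimizer step count strictly increases, every checkpoint has a
    distinct weights hash, and the learning rate never goes back up.
    """
    counts = [c["opt_count"] for c in checkpoints]
    hashes = [c["weights_hash"] for c in checkpoints]
    lrs = [c["lr"] for c in checkpoints]
    count_monotone = all(a < b for a, b in zip(counts, counts[1:]))
    hashes_distinct = len(set(hashes)) == len(hashes)
    lr_never_reramped = all(a >= b for a, b in zip(lrs, lrs[1:]))
    return count_monotone and hashes_distinct and lr_never_reramped


def probe_rng_isolated(key_timeline):
    """key_timeline: list of (event, key_before, key_after) in run order.

    event is 'probe' or 'train'; keys are any comparable snapshot of the
    training PRNG state (for example a tuple of ints). A probe must leave the
    training key untouched; a train chunk must advance it.
    """
    for event, before, after in key_timeline:
        if event == "probe" and before != after:
            return False
        if event == "train" and before == after:
            return False
    return True
```

Comparing key snapshots rather than trusting code structure is what turns "the probe uses its own key" from an assertion into something checked on the actual run.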
The result
The reversible kernel does not equilibrate at accessible scale at any checkpoint. At three of the four, the τ estimate keeps growing out to 64k with no resolution:
| t (epoch) | LR | τ (frozen rule) | curve |
|---|---|---|---|
| 25 | 0.030 | 2094 (stabilized) | to , softening (0.127 at 16k), single flat step at ; probe stopped at |
| 50 | 0.010 | UNRESOLVED | to 64k, , |
| 100 | 0.010 | UNRESOLVED | to 64k, , |
| 200 | 0.010 | UNRESOLVED | to 64k, , |
For t=50/100/200 the proximate stop was the 0.75 h ceiling firing (`tau_unresolved_reason="ckpt_ceiling"`); the doubling rule had not resolved anyway (e.g. the t=50 reldiff was still far above `TAU_TOL`). t=200 reproduces exp4's doubling probe, with the exp4 and exp6 values agreeing within noise: the reproduction anchor is satisfied.
The conservative scientific reading is registered outcome (i), "slow from the start" — at all four checkpoints. The operational antagonism is present from very early training (t=25, pre-LR-floor), not only at convergence.
Scope and caveats
This is where the entry earns its keep. The frozen rule's literal output is not outcome (i); it is outcome (ii) (crossover): by the letter of the rule, t=25 resolved to τ = 2094 (the reldiff and half-Sokal conditions both hold at the final doubling). So outcome (i)'s own literal precondition, "UNRESOLVED at all core checkpoints incl. t=25", is false per the rule. We reach (i) only after a claim-precision audit reclassifies t=25's resolution as a windowing/finite-sample artifact:
- (a) t=25 grew like the others through , then "stabilized" on a single sub-tolerance step at ( collapsing to 0.065) — the very range where t=50/100/200 are still climbing;
- (b) t=25's probe stopped at , one doubling shorter than the others, so we never saw whether it would resume growth at as the others did.
Those two are the decisive caveats. Conservative verdict: t=25 is a crossover candidate, not a confirmed finite-τ regime. No threshold was relaxed; the rule's output is reported verbatim alongside the audit.
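One way to keep the rule's literal output and the audited reading side by side, without letting the audit overwrite the rule, is sketched below. The outcome labels follow the entry, but the precise registered definitions are not given here: this sketch assumes outcome (i) means every checkpoint is UNRESOLVED and outcome (ii) means some, but not all, checkpoints resolved. The function names and the report layout are hypothetical.

```python
def literal_outcome(statuses):
    """statuses: dict epoch -> 'stabilized' or 'UNRESOLVED' from the frozen rule."""
    resolved = [t for t, s in statuses.items() if s == "stabilized"]
    if not resolved:
        return "(i) slow from the start"
    if len(resolved) < len(statuses):
        return "(ii) crossover"
    return "all resolved"


def audited_report(statuses, audit_reclassified):
    """Report the rule's verdict verbatim plus the audited reading.

    audit_reclassified: dict epoch -> reason string for each checkpoint whose
    'stabilized' status the audit treats as an artifact. The original statuses
    are never mutated, so the literal outcome stays recoverable.
    """
    audited = {
        t: ("UNRESOLVED" if t in audit_reclassified else s)
        for t, s in statuses.items()
    }
    return {
        "literal": literal_outcome(statuses),
        "audited": literal_outcome(audited),
        "candidates": dict(audit_reclassified),
    }
```

With the statuses from this run (t=25 stabilized, the other three UNRESOLVED) and t=25 reclassified, the report carries both "(ii) crossover" and "(i) slow from the start", which mirrors how the entry presents them.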
Even taking t=25 at face value, it fails the A6-vs-utility test: the only rule-resolved checkpoint is the least-trained model, and its Phase-2 window would need steps. A finite-τ reversible regime, if it exists here at all, coincides with a barely-useful model: the antagonism is not escaped at any useful checkpoint.
Phase 2 did not run. The gate needs three conjunctive conditions. Condition 2 (MEM_SAFE steps ) would actually pass — the reach was a wall-clock-ceiling artifact, not a memory limit. The dispositive failure is condition 1 (no robust finite , per the audit); condition 3 (no declared DECISION: PROCEED) is a procedural backstop. checkpoint_decision.md stays PENDING.
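The conjunctive gate can be written so that it reports every failed condition rather than stopping at the first, which makes the "dispositive" versus "procedural backstop" distinction visible in the output. This is a minimal sketch with hypothetical names; the real gate's inputs and thresholds are not specified in this entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateInputs:
    robust_finite_tau: bool   # condition 1: a finite tau that survives the audit
    mem_safe: bool            # condition 2: required steps fit the memory-safe bound
    decision_proceed: bool    # condition 3: an explicit DECISION: PROCEED was declared


def phase2_gate(g):
    """Return (may_run, failed_conditions); all three conditions must hold."""
    failed = []
    if not g.robust_finite_tau:
        failed.append("condition 1: no robust finite tau")
    if not g.mem_safe:
        failed.append("condition 2: not memory-safe")
    if not g.decision_proceed:
        failed.append("condition 3: no declared DECISION: PROCEED")
    return (not failed, failed)
```

For the state described above, condition 2 passes while conditions 1 and 3 fail, so the gate stays closed and the failure list names both.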
No tag flip. Conditional factorization stays [solid]; the operational claim stays [conjectured], now with at-scale real-chain evidence. This sharpens Risk 5 (the gate) and Risk 1 (at-scale tracking) — both stay [open] — and strengthens, does not close, the named structural-obstruction reading toward the fundamental-on-this-substrate side. Single graph, single trajectory, one unconfirmed t=25 candidate: the spine's "fundamentality open" stands.
A budget honesty note: the pre-commitment self-bounded at GPU-h; actual was 5.825 h. Cause is per-doubling JIT/re-trace overhead dominating the (tiny, s total) sampling sweeps, letting one extra doubling complete past the ceiling check. No constant was relaxed, and the extra length only gave the chains longer to equilibrate — which they still didn't, strengthening the UNRESOLVED verdicts.
What this feeds: a confirm/refute of the t=25 crossover candidate needs a longer t=25 probe () and/or more chains — registered as a follow-up, not claimed here; this entry leaves the obstruction pushed toward, but not settled as, fundamental-on-this-substrate.