A hotter top cures the temperature obstruction exp18 left open, but the cheapest decorrelating ladder costs more rungs than we are allowed — so the wall moves from overlap to thermodynamic length.
This is the complete technical record for experiments/exp19-hotter-top-first/. Here we keep the grid, the gates, and the claim-status discipline. Outcome HOT-TOP-DECORRELATES-COSTWALL. MEASURE-ONLY — no tag moves.
The question
exp18 left two suspects for why a reversible parallel-tempering ladder would not mix the trained DTM: a hot end that did not decorrelate, and mis-placement of the rungs. exp19 asks both at once on the corrected (trained-weight-refreshed) 60_12, MNIST DTM at INPUT_IDX=0, single-input conditional : does a hotter top (, i.e. ) make the single-replica reversible 4-block-Gibbs kernel decorrelate (), and if so, can an equal-acceptance ladder span the decorrelating regime within the conferred budget rungs?
The setup
Two analytic-plus-empirical frontiers over a 10-point grid floored at . (a) A decorrelation frontier: single-replica integrated autocorrelation at flat (Sokal-resolved throughout). (b) An analytic cost frontier: with — the rung count an equal-acceptance ladder needs to span . Gated before any scored compute, all PASS: patch_live=True with the A2 self-adjoint cert PASS (-reversible to ); provenance proven (opt_counts=[12200,12200], weights_hashinit_hash); and the trained-weight refresh guard that invalidated exp15/16/17 now PROVEN clean (constructor_was_stale=True, refreshed_vs_trained=0.0, refresh_ok=True) — so this is a valid trained read, not the stale-INIT bug. MEASURED at runtime . Ran on a rented H200, SSH-driven, of the GPU-h cap.
The result
A hotter top DOES cure the decorrelation — a temperature effect, not local-kernel non-ergodicity. is non-monotone, rising to near before collapsing:
| | 0.5 | 0.18 | 0.08 | 0.03 | 0.02 | 0.01 | |---|---|---|---|---|---|---| | | 305.5 | 329.5 | 250.3 | 79.3 | 15.8 | 2.2 | | ? | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
So the decorrelating set is and the coldest decorrelating top (the cheapest span) is . The anchor reproduces exp18's — regression intact. But the analytic cost frontier does not overlap it:
| | 0.5 | 0.25 | 0.18 | 0.05 | 0.02 | 0.01 | |---|---|---|---|---|---|---| | | 105.5 | 156.5 | 172.7 | 217.7 | 227.5 | 231.0 | | | 64 | 94 | 104 | 130 | 136 | 138 |
Decorrelation requires ; tractability requires (). There is no that BOTH decorrelates AND has : . The thermodynamic length over the span is the wall. Cost-aware decisive STOP at Stage A — no Stage-B ladder built, the -faithful round-trip cert never approached (budget_hit=False, clean completion).
This matched the pre-commitment: nonempty routed correctly to a Stage-A STOP. The prediction-precision check confirmed direction (sublinear , , costwall) but found the magnitude under-estimated — the crossing landed deeper (, not the forecast band edge) and on-DTM ran higher than the exp18 extrapolation, so the cost wall is worse than forecast ( above the predicted ).
Scope and caveats
Config-scoped: 60_12, SEED=0, , INPUT_IDX=0, single-input conditional, the reversible 4-block-Gibbs kernel, the equal-acceptance ladder family, the 10-point grid floored at . HOT-TOP-DECORRELATES-COSTWALL means "this kernel on this landscape needs for a single-span ladder," never "reversible PT cannot mix DTMs," and never a fundamentality verdict about . is a Gaussian-overlap LOWER BOUND for a multimodal landscape, so the true rung count could be higher — it corroborates the wall, it never rules mixing in. The grid floor at gives "deepest tested," not "provably deepest." Burn-in init (no exact ) means a STOP carries no robustness asymmetry. This sharpens spine Risk 5 (A2A6 antagonism), quantitatively: the reversible/self-adjoint kernel the factorization theorem requires (A2) needs to decorrelate, but a theorem-compatible tractable ladder exists only down to — and vs is the measured size of that gap on the real trained DTM. The conditional factorization stays [solid]; the operational claim stays [conjectured] (, A7 open). No tag moves.
What this feeds: this is the keystone behind the A2A6 cost-wall framing — the obstruction is relocated, not removed, so the indicated pivot is to switch families (hierarchical PT / simulated tempering / population annealing to amortize the 136-rung length), or to fix the measurement with a -robust estimator, rather than build a still-larger single-span ladder or raise the cap again.