Thermodynamic Machine Learning · MMXXVI
Experiment18.VI.MMXXVIRead 4 min

Exp 18 — PT Replica Feasibility: PT-MARGINAL

Entry 21

Shrinking the per-edge temperature step alone moves swap acceptance ×33, refuting the saturating-ceiling prediction — but a uniform schedule over a non-uniform specific heat over-resolves the cold end and starves the hot, so no in-band ladder exists at R24R \le 24.

This is the complete technical record for experiments/exp18-pt-replica-feasibility/. Here we keep the numbers, the gates, and the claim-status discipline.

The question

After exp17 was held and the post-exp15 baseline corrected, the question is feasibility-first: on the corrected trained 60_12 DTM at t=200t=200, does shrinking the per-edge step alone — a uniform-Δα reversible-PT ladder over the fixed span [0.5,1.0][0.5, 1.0] — carry swap acceptance into the frozen band [0.15,0.60][0.15, 0.60] on every edge, and deliver NRT=5\ge N_{RT}=5 round-trips, at a tractable RR? The probe stops at the ladder, upstream of any TOT_O, speedup, P4P4/F4F4, or QopQ_{op}-vs-QstructQ_{struct}^{\perp} read. It is MEASURE-ONLY: no cell moves a tag.

The setup

Stage A is a flat pilot uniform-Δα acceptance sweep: R{4,6,8,12,16,24}R \in \{4,6,8,12,16,24\}, L_LADDER=2000, N_LADDER_CHAINS=32, pooled over LADDER_SEEDS=(300,301,302), span fixed [0.5,1.0][0.5,1.0], band [0.15,0.60][0.15,0.60]. Round-trips are a diagnostic at flat pilot LL, not a gate (all rows showed round_trips_min_diag = 0). Six provenance/cert gates passed first — most load-bearing is the trained-weight refresh: stale_vs_trained = 60.64, refreshed_vs_trained = 0.0, so the per-replica LOCAL kernel targets the same trained parameterization as the swap energy. exp15/16 silently used INIT weights for the local kernels; this run's acceptances are therefore a valid read of the trained DTM, unlike them. Ran on an H200, jax=0.10.1, wall 2.143 GPU-h, DECISION: PROCEED.

The result

Two competing edges, never simultaneously in band:

| RR | per-edge Δα | cold edge | hot edge (worst) | band_ok | hot_ok | |---:|---:|---:|---:|:--:|:--:| | 4 | 0.1667 | 0.0182 | 0.0024 | ✗ | ✓ | | 6 | 0.1000 | 0.0843 | 0.0056 | ✗ | ✓ | | 8 | 0.0714 | 0.1836 | 0.0168 | ✗ | ✓ | | 12 | 0.0455 | 0.3159 | 0.0836 | ✗ | ✓ | | 16 | 0.0333 | 0.4359 | 0.1447 | ✗ | ✓ | | 24 | 0.0217 | 0.6008 | 0.2709 | ✗ | ✓ |

Cold-edge acceptance rises monotonically ×33 (0.0182 → 0.6008), crossing the floor (0.15) at R7R \approx 788 and overshooting the ceiling by R=24R=2411 of 23 edges exceed 0.60 (up to 0.664). The hot edge rises too (0.0024 → 0.2709) but lags, crossing the floor only at R18R \approx 18 (extrapolated; R{18,20,22}R\in\{18,20,22\} were off-grid). So no uniform ladder at R24R \le 24 has every edge in band at once: at small RR the hot edges starve below 0.15; by the RR where the hot edge clears, the cold edges have already overshot 0.60. R_star=null, no Stage B entered.

Why uniform is the wrong schedule. The single-replica scan gives a non-uniform specific heat C(α)C(\alpha) ranging 13,143 → 69,078 (min near α0.56\alpha\approx 0.56, max near α0.88\alpha\approx 0.88),  ⁣Cdα=101.5\int\!\sqrt{C}\,d\alpha = 101.5, ⇒ Gaussian-overlap Rpredicted=61R^*_{predicted}=61 for the optimal per-edge A=0.234A=0.234. Per-edge acceptance tracks ΔαC\Delta\alpha\cdot\sqrt{C} locally, so a constant Δα\Delta\alpha yields non-constant acceptance — exactly the cold-overshoot / hot-starve split. The hottest replica at αtop=0.5\alpha_{top}=0.5 also does not itself decorrelate: τhot=317.8\tau_{hot}=317.8 (hot_ok held only because L_LADDER=2000 5317.81589\ge 5\cdot 317.8 \approx 1589, ~26% Sokal headroom — a marginally-resolved estimate).

This is PT-MARGINAL: replica-/placement-bound but not hopeless. The primary §2 prediction — cold-edge intercept saturating <0.15<0.15 as Δα0\Delta\alpha\to 0, i.e. PT-CEILING-INFEASIBLE — is refuted in direction; acceptance moves, so the adjacent-overlap-wall reading is excluded. This was the documented §2 "informative surprise," not a miss: the confounded geometric αR\alpha_R-sweep fit does not transfer to the uniform-Δα family. The hot-end prediction (τhot\tau_{hot} stays large, fork pointed at βtop0\beta_{top}\to 0) was confirmed in direction. No threshold was relaxed (SWAP_BAND/N_RT/SOKAL_C frozen verbatim).

Scope and caveats

It does not show that the DTM cannot be mixed. It is config-scoped to one trained DTM (60_12, SEED=0, t=200t=200, INPUT_IDX=0), one span, one uniform ladder family. It is not a pass and feeds no claim — a ladder still does not mix this DTM. The grid skips R{18,20,22}R\in\{18,20,22\}, so a narrow uniform window [18,22]\approx[18,22] might graze the band on a knife-edge, but τhot318\tau_{hot}\approx 318 makes the Stage-B round-trip gate doubtful and the schedule fragile. Rpredicted=61R^*_{predicted}=61 is a Gaussian-unimodal lower bound for a multimodal landscape — it can corroborate infeasibility, never rule mixing in. Burn-in init (no exact πr\pi_r): a STOP reads "PT does not mix under this schedule at tractable RR," not "the DTM cannot mix." Conditional factorization stays [solid], operational claim stays [conjectured] — both unchanged, no tag flip.

What it feeds

This corroborates the spine's Risk-5 (A2↔A6 antagonism) and sharpens the post-exp17 Risk-4 picture, but writes no annotation. It refines the corrected baseline (../exp15-recheck-trained-weights/, LADDER-INADEQUATE-TRAINED) from "R4 fails" to "uniform-Δα fails by placement, hot end τ318\tau\approx 318." The diagnostics point two complementary pivots: equal-acceptance placement (rungs so ΔαC\Delta\alpha\cdot\sqrt{C}\approx const) to fix acceptance shape, and a hotter top (βtop0\beta_{top}\to 0) to fix hot-end ergodicity. The pivot is the researcher's; this run only narrows the menu.


What this feeds: exp19 — equal-acceptance placement plus a hotter top, reusing the same A2-reversible machinery and a fresh gate-1.

— fin. —