With exp13's calibration frozen and fresh held-out seeds, the primary reversible PT kernel finally cuts the calibrated operational time by the compute-normalized criterion — a positive small-family precursor, not the at-scale validation.
This is the complete technical record for experiments/exp14-operational-reread/. It is the corrected operational read on experiments/exp13/'s calibration, and the third beat of an arc that began with the reversible-kernel timescale work in the -probe entry. Ran 2026-06-15, CPU, float64, 37.8 s wall; gates frozen at pre-commitment ff865ab (gate-1) and runner 6c53ac6 (gate-2), reusing the frozen exp13 calibration (44a05a9) and the exp12 pt_kernel (a976d80). Reproduce with P0_MODE=full HOST_RAM_GB=8 python3 op_reread.py → op_reread.json (MEASURE-ONLY).
The question
exp12 showed parallel tempering (PT) cut by – but the operational time never stabilized in the registered windows — Outcome F, measurement-limited. exp13 then established that is calibratable (Cal-STABLE under exact- init), so the frozen S-ADQ failure was a gate-specification artifact, not a real one. exp14 asks the operational question that exp12 could not answer: with the corrected gates and the calibrated swap schedule , does the primary reversible PT kernel actually cut the calibrated by the frozen compute-normalized criterion on the registered small RBM family?
The setup
The verdict metric is a compute-normalized speedup,
where is the exp13-calibrated, frozen value — it is not re-estimated here. exp14's fresh held-out seeds (OP_SEEDS=200..204) independently confirm two things against that frozen : window adequacy (F1 in band at both and ) and P4. So the precursor is a conjunction: (frozen- compute-normalized speedup ) and (fresh-seed confirmation that the operational windows are asymptotic against that , with P4 in band). The family is ; cells span the primary kernel at , the convex kernel , a unimodal control C-uni, and C-deep2.
The result
Outcome S-A on C-deep R4 (primary) and R6 (corroborating). The full cell table:
| cell | window-adeq (F1@) | speedup vs 2.0 | P2 | P4 | (proj.) | outcome |
|---|---|---|---|---|---|---|
| C-deep R4 primary | pass | 2.42 (+21%) | pass | pass | 0.22 () | S-A |
| C-deep R6 primary | pass | 2.12 (+6%) | pass | pass | 0.42 () | S-A |
| C-deep R8 primary | pass | 1.86 | fail | pass | 0.48 | S-D |
| C-deep R4 / R6 convex | pass | 1.60 / 0.96 | fail | pass | 0.07 / 0.12 | S-D |
| C-deep R8 convex | fail (F1@20 out) | 0.59 | fail | — | 0.08 | W-INADQ |
| C-uni R4 (diagnostic) | pass | 0.73 | fail | pass | 0.21 | S-D |
| C-deep2 R4 | fail (F1@20 out) | 0.48 | — | pass | 0.27 | W-INADQ |
R4 is the robust pass; R6 corroborates but is not co-equal. R4 clears the 2.0 bar by 21% (2.42), beyond the calibration's own -stabilization tolerance, so it survives the calibration uncertainty. R6 clears it by only 6% (2.12), within that tolerance, so R6's pass could flip under calibration noise. Treat R6 as corroborating, not co-equal evidence.
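The margins for the two S-A cells follow from the table's speedups against the 2.0 bar. The sketch below redoes that arithmetic. The helper names `margin_pct` and `is_robust` are illustrative, not taken from op_reread.py, and `tolerance_pct` is a caller-supplied placeholder: this entry does not restate the calibration tolerance as a number, so no value is assumed here.

```python
BAR = 2.0  # compute-normalized speedup bar, from the table header "speedup vs 2.0"


def margin_pct(speedup: float, bar: float = BAR) -> float:
    """Percentage by which a speedup exceeds (positive) or misses (negative) the bar."""
    return 100.0 * (speedup / bar - 1.0)


def is_robust(speedup: float, tolerance_pct: float, bar: float = BAR) -> bool:
    """True if the margin over the bar exceeds the caller-supplied tolerance.

    tolerance_pct is hypothetical input: the calibration tolerance is not
    given numerically in this entry.
    """
    return margin_pct(speedup, bar) > tolerance_pct


# Speedups copied from the cell table for the two S-A cells.
for cell, speedup in [("C-deep R4 primary", 2.42), ("C-deep R6 primary", 2.12)]:
    print(f"{cell}: {margin_pct(speedup):+.0f}%")
# prints:
# C-deep R4 primary: +21%
# C-deep R6 primary: +6%
```

With the actual tolerance passed as `tolerance_pct`, `is_robust` reproduces the distinction drawn above: a pass is robust only if its margin lies outside the tolerance.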
The projected is the lifted-observable VAC corrected by the multimodal calibration ratio ; it is for both S-A cells ( floor) — a diagnostic, not a full spectral certificate. S-A is read regardless of the §8 tuning-adequacy gate: a positive P2 is not blocked by §8 (high swap-accept is ladder redundancy already charged inside ); §8 only splits P2-failures into S-C versus S-D.
The non-S-A cells confirm the design choices. R8 primary erodes the net benefit ( at drops speedup to 1.86) → S-D. The convex kernel mixes far slower (raw –, –) → S-D / W-INADQ, confirming the primary . C-uni shows the expected PT overhead on a unimodal target (speedup 0.73). C-deep2 and R8-convex are W-INADQ because F1@ is in band but F1@20 is not; is still pre-asymptotic there.
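The way outcomes are read off the gates in this section can be replayed as a small decision function. This is a sketch of the logic as this entry describes it, not the registered gate code: `classify_cell` and the row tuples are illustrative names, the §8 split between S-C and S-D is deliberately left unmodelled, and any gate combination this entry does not cover is reported as `UNRESOLVED` rather than guessed.

```python
from typing import List, Optional, Tuple


def classify_cell(window_ok: bool, p2: Optional[bool], p4: Optional[bool]) -> str:
    """Map gate results to an outcome label; None means the gate was not read."""
    if not window_ok:
        return "W-INADQ"  # F1 out of band at a registered window: no verdict is read
    if p2 is True and p4 is True:
        return "S-A"      # read regardless of the §8 tuning-adequacy gate
    if p2 is False:
        return "S-C/S-D"  # §8 decides which of the two; not modelled here
    return "UNRESOLVED"   # combination not described in this entry


# (cell, window_ok, P2, P4, recorded outcome), transcribed from the cell table.
ROWS: List[Tuple[str, bool, Optional[bool], Optional[bool], str]] = [
    ("C-deep R4 primary", True, True, True, "S-A"),
    ("C-deep R6 primary", True, True, True, "S-A"),
    ("C-deep R8 primary", True, False, True, "S-D"),
    ("C-deep R4 / R6 convex", True, False, True, "S-D"),
    ("C-deep R8 convex", False, False, None, "W-INADQ"),
    ("C-uni R4 (diagnostic)", True, False, True, "S-D"),
    ("C-deep2 R4", False, None, True, "W-INADQ"),
]


def consistent(row: Tuple[str, bool, Optional[bool], Optional[bool], str]) -> bool:
    """True if the recorded outcome is among the labels the sketch allows."""
    _, window_ok, p2, p4, recorded = row
    return recorded in classify_cell(window_ok, p2, p4).split("/")


for row in ROWS:
    print(f"{row[0]}: recorded {row[4]}, consistent={consistent(row)}")
```

Window adequacy is checked first because both W-INADQ rows in the table carry that outcome whatever their P2 or P4 entries show.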
Scope and caveats
Scoped to the registered small RBM family (); at-scale G2 untouched. This is a small-family precursor, not the at-scale G2 validation. The speedup rests on a frozen — the fresh seeds re-confirm window adequacy and P4, but they do not re-measure or the speedup. R6 sits inside the calibration tolerance. is a projected diagnostic, not a spectral certificate. No fundamentality claim. No tag moves: the operational tier stays [conjectured] and the conditional factorization stays [solid]. The central-spine A2A6 sharpening — reversible-kernel acceleration now operationally demonstrated on the controlled family — is researcher-conferred prose, not a status flip.
The arc lands here: exp12 cut but could not stabilize (Outcome F); exp13 showed is calibratable and the failure was a gate artifact; exp14 demonstrates that, with corrected gates and the calibrated , the primary PT kernel at R4 (robust) and R6 (corroborating) cuts the calibrated by at least the 2.0 compute-normalized bar. Acceleration is operationally demonstrated, not merely -suggested.
What this feeds
Per the outcome map, exp14's S-A confers authorization (researcher-conferred, 2026-06-15; see p0_decision.md) for a GPU DTM-MNIST PT P0 doubling probe — the exp4-style non-circular probe with the reversible PT kernel on the real DTM, where 4-block Gibbs gave . This is authorization, not PROCEED: the separate p0_decision.md must still declare PROCEED/HALT plus GPU-hour limits. Primary arm is R4 primary (strongest margin, lowest replica cost); R6 is a budget-dependent robustness arm. No GPU is committed here; no tag moves.
What this feeds: the GPU DTM PT P0 authorization gate and the A2A6 spine annotation — recording the first operational demonstration of reversible-kernel acceleration on the controlled family, while leaving the operational factorization tier exactly where it was.