Re-run exp1's mechanism on a different model (a bipartite RBM) and a different kernel (2-block-Gibbs) to ask which discoveries are properties of the physics and which were artifacts of the sampler.
The question
experiments/exp1-exact-diag/ established three things on a fully-connected Ising chain by exact diagonalization: the slowest mode is symmetry-odd and observable-orthogonal to the gradient (P3); a weak-coupling spectral degeneracy exists but might be a single-site-kernel artifact (P2); and an observable-relevant predictor tracks the operational while a naive single-scalar diagnostic blows up (P4). The open caveat was kernel-dependence: a fully-connected chain under single-site Metropolis is not a thermodynamic-machine sampler. This entry tests whether the findings survive the move to a bipartite RBM sampled by 2-block-Gibbs.
The setup
Self-contained numpy block-Gibbs is the primary instrument; thrml is an optional gated fidelity check. The run (experiment.py) covers 240 cells in 333 s, pre-committed and frozen at ce17b56. Units are fixed for cross-kernel fairness: 1 sweep single-spin updates (a single-site sweep is random-scan steps; a block sweep is one alternating visible/hidden update), so , , and are all in sweeps. The gradient observables are the even RBM couplings ; the generic/odd scalar is the magnetization-like . Sokal convention: , with .
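A minimal sketch of one alternating visible/hidden block sweep, for orientation only. It assumes ±1 spins, a bias-free energy E(v, h) = -v^T W h, and an inverse temperature beta that multiplies the energy. The function name block_gibbs_chain, the weight values, and the choice to record the state after every sweep are illustrative assumptions, not taken from experiment.py.

```python
import numpy as np

def block_gibbs_chain(W, beta, n_sweeps, rng):
    """Run 2-block Gibbs on a bias-free +/-1 RBM (assumed model: E = -v^T W h).

    One sweep = one hidden-layer update followed by one visible-layer update.
    Returns the visible and hidden states recorded after each sweep.
    """
    nv, nh = W.shape
    v = rng.choice([-1, 1], size=nv)
    vs = np.empty((n_sweeps, nv), dtype=int)
    hs = np.empty((n_sweeps, nh), dtype=int)
    for t in range(n_sweeps):
        # For +/-1 spins, P(h_j = +1 | v) = sigmoid(2 * beta * (v @ W)_j).
        p_h = 1.0 / (1.0 + np.exp(-2.0 * beta * (v @ W)))
        h = np.where(rng.random(nh) < p_h, 1, -1)
        # Same conditional for the visible layer given the fresh hidden layer.
        p_v = 1.0 / (1.0 + np.exp(-2.0 * beta * (W @ h)))
        v = np.where(rng.random(nv) < p_v, 1, -1)
        vs[t], hs[t] = v, h
    return vs, hs

# Illustrative run: hypothetical 3x2 weight matrix, weak coupling.
rng = np.random.default_rng(0)
W_demo = 0.1 * np.array([[1.0, -0.5], [0.5, 1.0], [-1.0, 0.5]])
vs, hs = block_gibbs_chain(W_demo, beta=1.0, n_sweeps=1000, rng=rng)
# Even observable: the coupling correlations <v_i h_j>, one per RBM edge.
coupling_corr = np.einsum("ti,tj->ij", vs, hs) / len(vs)
print(coupling_corr)
```

Because each layer is conditionally independent given the other, every spin is refreshed exactly once per sweep, which is what makes a block sweep comparable to a single-site sweep in the shared sweep unit.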
P1 validity gate first. On single-site overlap cells (), the MC-estimated aggregate agrees with the exact joint-eigh to median 2.2% rel-err (, well under the 25% bar). The at-scale estimators ( via multi-seed MSE, via Sokal) therefore have a validated ground truth. PASS.
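A Sokal-style integrated autocorrelation estimate can be sketched as below. This assumes the common convention tau_int = 1/2 + sum over t from 1 to M of rho(t), with the automatic window chosen as the smallest M satisfying M >= c * tau_int(M) and c = 5. The convention and window constant used in experiment.py may differ, and the function name tau_int_sokal is hypothetical. The AR(1) series is a stand-in with a known answer, not data from the run.

```python
import numpy as np

def tau_int_sokal(x, c=5.0):
    """Integrated autocorrelation time with Sokal's automatic window.

    Assumed convention: tau_int(M) = 1/2 + sum_{t=1..M} rho(t),
    stopping at the first M with M >= c * tau_int(M).
    Returns (tau_int, M).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    # Linear (non-circular) autocovariance via zero-padded FFT.
    f = np.fft.rfft(x, n=2 * n)
    acov = np.fft.irfft(f * np.conj(f), n=2 * n)[:n]
    acov = acov / np.arange(n, 0, -1)  # unbiased normalisation per lag
    rho = acov / acov[0]
    tau = 0.5
    for M in range(1, n):
        tau += rho[M]
        if M >= c * tau:
            return tau, M
    return tau, n - 1

# Check on an AR(1) series x_t = phi * x_{t-1} + noise, for which
# rho(t) = phi**t and the exact value is 1/2 + phi / (1 - phi).
rng = np.random.default_rng(0)
phi = 0.8
noise = rng.standard_normal(100_000)
x = np.empty_like(noise)
x[0] = noise[0]
for t in range(1, len(x)):
    x[t] = phi * x[t - 1] + noise[t]

tau, window = tau_int_sokal(x)
exact = 0.5 + phi / (1.0 - phi)
print(f"estimated tau_int = {tau:.3f} (window {window}), exact = {exact:.3f}")
```

If the series is recorded once per sweep, the returned value is already in sweeps, so it can be compared across kernels without rescaling.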
The result
P3 — symmetry orthogonality is model-driven, survives. CONFIRMED. In the overlap-regime single-site eigh, every slowest mode has parity (odd) and observable overlap with the even — numerically exact orthogonality, now in the RBM. The timescale ordering holds for both kernels (median ratio ; – for random across ). So exp1's Discovery (a) is not a single-site or fully-connected artifact.
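The parity and overlap check can be illustrated on a toy instance. The sketch below builds the exact random-scan single-site Metropolis matrix for a small bias-free ±1 RBM, symmetrizes it with the stationary distribution so that eigh applies, and inspects the mode with the second-largest eigenvalue. The weight matrix, the function names, and the reading of "slowest" as second-largest algebraic eigenvalue are assumptions made for this example; it is not the code in experiment.py.

```python
import itertools
import numpy as np

def single_site_metropolis(W, beta):
    """Exact transition matrix for random-scan single-site Metropolis.

    Assumed model: +/-1 spins, E(v, h) = -v^T W h, no biases, so the
    stationary distribution is invariant under the global flip s -> -s.
    """
    nv, nh = W.shape
    n = nv + nh
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    energy = -np.einsum("si,ij,sj->s", states[:, :nv], W, states[:, nv:])
    pi = np.exp(-beta * energy)
    pi /= pi.sum()
    index = {tuple(s): k for k, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for k, s in enumerate(states):
        for i in range(n):
            t = s.copy()
            t[i] = -t[i]
            l = index[tuple(t)]
            # Pick site i with probability 1/n, accept with min(1, pi_l / pi_k).
            P[k, l] = min(1.0, pi[l] / pi[k]) / n
        P[k, k] = 1.0 - P[k].sum()  # rejection mass stays on the diagonal
    flip = np.array([index[tuple(-s)] for s in states])
    return states, pi, P, flip

def slowest_mode(pi, P):
    """Second-largest eigenpair of a reversible P via the symmetrized matrix."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]
    S = 0.5 * (S + S.T)  # remove rounding asymmetry before eigh
    evals, evecs = np.linalg.eigh(S)
    # f is normalised so that sum_s pi(s) f(s)^2 = 1.
    return evals[-2], evecs[:, -2] / d

# Hypothetical weak-coupling instance: 3 visible, 2 hidden units.
W = 0.1 * np.array([[1.0, -0.5], [0.5, 1.0], [-1.0, 0.5]])
nv = W.shape[0]
states, pi, P, flip = single_site_metropolis(W, beta=1.0)
lam, f = slowest_mode(pi, P)

# Parity: <f, f o flip>_pi is +1 for an even mode and -1 for an odd mode.
parity = float(np.sum(pi * f * f[flip]))
# Overlap of the mode with the even observables v_i h_j.
even_overlap = np.einsum("s,s,si,sj->ij", pi, f, states[:, :nv], states[:, nv:])
# Overlap with the odd single-spin observables, for contrast.
odd_overlap = np.einsum("s,s,si->i", pi, f, states)

print("lambda_2 =", lam, "parity =", parity)
print("max |even overlap| =", np.abs(even_overlap).max())
print("odd overlaps =", odd_overlap)
```

The even overlaps vanish up to rounding because the mode is odd and the stationary distribution is flip-symmetric. Adding biases breaks that symmetry, and then neither the parity nor the overlap is constrained.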
P2 — weak-coupling degeneracy is kernel-driven. CONFIRMED (odd channel). Block-Gibbs lifts the degeneracy: is the single-site value in 77% of weak- cells (median ratio 0.48, , ). The block advantage is strongest at weak () and fades toward strong (), as expected — strong- slow modes are metastable basins block also struggles with. The automated probe used the even and read FAIL (median , never ) — because is symmetry-decoupled from the odd weak-coupling modes, which is exactly P3. The wrong-observable failure re-confirms P3 rather than refuting P2. exp1's " is the single-site kernel's artifact" caveat is thus resolved: kernel-driven, in the sector P3 identifies.
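Why an even observable cannot see odd modes can be written out in a few lines. The derivation below is a sketch for a reversible kernel only (the single-site case), and the notation is assumed for this note rather than taken from the experiment: \pi is the stationary distribution, F is the global flip s \mapsto -s with \pi(Fs) = \pi(s), (\lambda_k, f_k) are eigenpairs orthonormal under \langle\cdot,\cdot\rangle_\pi with f_1 constant, A is an even observable, and f_k is an odd mode.

```latex
\[
\langle f, g \rangle_\pi = \sum_s \pi(s)\, f(s)\, g(s), \qquad
C_A(t) = \sum_{k \ge 2} \lambda_k^{\,t}\, \langle A, f_k \rangle_\pi^{2} .
\]
\[
\langle A, f_k \rangle_\pi
  = \sum_s \pi(Fs)\, A(Fs)\, f_k(Fs)
  = -\sum_s \pi(s)\, A(s)\, f_k(s)
  = -\langle A, f_k \rangle_\pi
  \;\Longrightarrow\;
  \langle A, f_k \rangle_\pi = 0 .
\]
```

Every odd mode therefore drops out of the autocorrelation of an even observable, however slow that mode is, so a probe built on the even observables cannot register a degeneracy that lives in the odd sector.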
P4 — observable-relevant predictor tracks at scale; naive fails where symmetry matters. PASS (sharp conditional). (observable-relevant) tracks in 92–99% of cells across , both kernels, including beyond the spectrum-overlap regime — it extrapolates. The naive odd- predictor (the un-projected DTM- diagnostic) tracks only 14% of cells when (median ratio , max separation ), versus 90% when . It fails precisely in the symmetry-separated regime. Calibration against exp1: exp1's fully-connected naive blew up by – (an -formula divide-by-zero); exp2's naive is finite and fails by the ratio. Same mechanism (wrong, symmetry-odd mode), different magnitude.
THRML fidelity — CONFIRMED. thrml==0.1.3 resolved on PyPI and installed cleanly in an isolated venv (jax 0.10.1). On one shared RBM (random, m=4, β=0.8, identical ), thrml's bipartite block-Gibbs (IsingEBM + IsingSamplingProgram) reproduced the self-contained sampler's : 1.631 vs 1.653, a 1.4% difference (AGREE). The kernel structure is faithful to Extropic's library; thrml was used vanilla, no patch (thrml_xcheck.py).
Scope and caveats
What this does not show: anything about asymptotic prevalence in trained DTMs. These are controlled small–moderate RBMs (, CPU smoke test). Positive statements read "on controlled small–moderate RBMs," never "validated for DTM training." The consistency is near-tautological (same chains; median ratio confirms MCMC bookkeeping); the genuinely new content is P2 (kernel-dependence), P3 (kernel-independence of the symmetry fact), and the naive diagnostic's sharp conditional failure. Block-kernel parity is MC-supported, not eigh-confirmed: the 2-block alternating sweep is non-reversible, so it is not diagonalized here; decoupling rests on the consistent pattern across all block cells. No conjectured → validated flip: the status is empirical / construction-confirmed only, and the factorization stays [conjectured].
What this feeds: resolves the exp1 kernel-caveat (Risk 1) and annotates the validation hook — the DTM diagnostic must be measured on the even gradient observables, since an odd-scalar mistimes the gradient SNR by the ratio; exp3 tests prevalence in trained DTMs.