Exp 2 — Block-Gibbs RBM: The Findings Survive · Thermodynamic Machine Learning

Re-run exp1's mechanism on a different model (a bipartite RBM) and a different kernel (2-block-Gibbs) to ask which discoveries are properties of the physics and which were artifacts of the sampler.

The question

experiments/exp1-exact-diag/ established three things on a fully-connected Ising chain by exact diagonalization: the slowest mode is symmetry-odd and observable-orthogonal to the gradient (P3); a weak-coupling spectral degeneracy $\sigma = 1 - 1/N$ exists but might be a single-site-kernel artifact (P2); and an observable-relevant predictor tracks the operational $Q_{op}$ while a naive single-scalar diagnostic blows up (P4). The open caveat was kernel-dependence: a fully-connected chain under single-site Metropolis is not a thermodynamic-machine sampler. This entry tests whether the findings survive the move to a bipartite RBM with $m \le 16$, sampled by 2-block-Gibbs.

The setup

A self-contained numpy block-Gibbs sampler is the primary instrument; thrml is an optional, gated fidelity check. The run (experiment.py) covers 240 cells in 333 s, pre-committed and frozen at ce17b56. Units are fixed for cross-kernel fairness: 1 sweep $= 2m$ single-spin updates (a single-site sweep is $2m$ random-scan steps; a block sweep is one alternating visible/hidden update), so $K$, $B$, and $\tau_{int}$ are all in sweeps. The gradient observables are the even RBM couplings $f_{ij} = -v_i h_j$; the generic/odd scalar is the magnetization-like $m$ (the same letter as the size parameter in $m \le 16$; inside $\tau_{int}[m]$ and $\tau_m$ it denotes the scalar). Sokal convention: $\tau_{int} = 1 + 2\sum_{t \ge 1} \rho(t)$, with $\mathrm{Var}[\bar f] = \tau_{int}\cdot \mathrm{Var}[f]/K$.
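As a minimal sketch of the two instruments named above (not the contents of experiment.py), the block below implements one alternating hidden/visible block-Gibbs sweep for a $\pm 1$ RBM and a Sokal-style windowed estimate of $\tau_{int}$. The energy form $E(v,h) = -v^\top W h$ with no bias terms, the function names, the Gaussian $W$, and the window constant `c = 5` are assumptions made for illustration.

```python
import numpy as np

def block_gibbs_sweep(v, h, W, beta, rng):
    """One block sweep: resample all hidden spins given v, then all visible given h.

    Assumes spins in {-1, +1} and energy E(v, h) = -v @ W @ h (no biases).
    """
    p_h = 1.0 / (1.0 + np.exp(-2.0 * beta * (v @ W)))   # P(h_j = +1 | v)
    h = np.where(rng.random(h.shape) < p_h, 1, -1)
    p_v = 1.0 / (1.0 + np.exp(-2.0 * beta * (W @ h)))   # P(v_i = +1 | h)
    v = np.where(rng.random(v.shape) < p_v, 1, -1)
    return v, h

def tau_int_sokal(x, c=5.0):
    """tau_int = 1 + 2 * sum_t rho(t), truncated at the first lag t >= c * tau_int."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    var = x @ x / n
    if var == 0.0:
        return 1.0
    tau = 1.0
    for t in range(1, n // 2):
        rho = (x[:-t] @ x[t:]) / ((n - t) * var)
        tau += 2.0 * rho
        if t >= c * tau:          # self-consistent window
            break
    return tau

rng = np.random.default_rng(0)
m, beta, n_sweeps = 8, 0.3, 20_000          # illustrative values
W = rng.normal(size=(m, m))
v = rng.choice([-1, 1], size=m)
h = rng.choice([-1, 1], size=m)

mag = np.empty(n_sweeps)                     # odd scalar: magnetization
f00 = np.empty(n_sweeps)                     # even observable f_00 = -v_0 h_0
for k in range(n_sweeps):
    v, h = block_gibbs_sweep(v, h, W, beta, rng)
    mag[k] = (v.sum() + h.sum()) / (2 * m)
    f00[k] = -v[0] * h[0]

print("tau_int[m] =", tau_int_sokal(mag))
print("tau_int[f] =", tau_int_sokal(f00))
```

Both time series are recorded once per block sweep, so the estimates come out directly in sweeps, matching the unit convention above.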

P1 validity gate first. On single-site overlap cells ($m \in \{4,6\}$), the MC-estimated aggregate $\tau$ agrees with the exact joint-eigh $\tau$ to a median relative error of 2.2% ($n=48$, well under the 25% bar). The at-scale estimators ($Q_{op}$ via multi-seed MSE, $\tau_{int}$ via Sokal) are therefore validated against ground truth in the regime where exact values exist. PASS.
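The exact side of such a gate can be sketched as follows, under stated assumptions: a random-scan single-site heat-bath kernel (this entry does not say whether its single-site kernel is heat-bath or Metropolis), energy $E = -v^\top W h$, and brute-force enumeration of all $2^{2m}$ states. Symmetrizing the transition matrix with $\sqrt{\pi}$ lets `numpy.linalg.eigh` return the spectrum, and $\tau_{int}$ for an observable follows from its spectral weights. The helper names are hypothetical, and the source's "joint-eigh" procedure may differ in detail.

```python
import itertools
import numpy as np

def single_site_kernel(W, beta):
    """Random-scan single-site heat-bath transition matrix on all 2^n states.

    Assumes spins in {-1, +1} and energy E(v, h) = -v @ W @ h (no biases).
    """
    m_v, m_h = W.shape
    n = m_v + m_h
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    energy = -np.sum((states[:, :m_v] @ W) * states[:, m_v:], axis=1)
    pi = np.exp(-beta * energy)
    pi /= pi.sum()
    index = {tuple(s): k for k, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for a, s in enumerate(states):
        for i in range(n):
            t = s.copy()
            t[i] = -t[i]
            b = index[tuple(t)]
            p_flip = pi[b] / (pi[a] + pi[b])   # heat-bath probability of ending flipped
            P[a, b] += p_flip / n
            P[a, a] += (1.0 - p_flip) / n
    return states, pi, P

def exact_tau_int(P, pi, f, stride=1):
    """Exact tau_int of observable f when the chain is recorded every `stride` steps."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]            # symmetric because P is reversible w.r.t. pi
    lam, U = np.linalg.eigh((S + S.T) / 2.0)
    coef = U.T @ (d * (f - pi @ f))            # spectral amplitudes of the centered observable
    keep = lam < 1.0 - 1e-12                   # drop the stationary mode
    w = coef[keep] ** 2 / np.sum(coef[keep] ** 2)
    lam_s = lam[keep] ** stride
    return float(np.sum(w * (1.0 + lam_s) / (1.0 - lam_s)))

rng = np.random.default_rng(0)
m, beta = 2, 0.3                               # tiny illustrative model: 16 states
W = rng.normal(size=(m, m))
states, pi, P = single_site_kernel(W, beta)
f_00 = -states[:, 0] * states[:, m]            # even observable f_00 = -v_0 h_0
tau_f = exact_tau_int(P, pi, f_00, stride=2 * m)   # one sweep = 2m single-spin steps
print("exact tau_int[f_00] in sweeps:", tau_f)
```

Comparing this number with a Sokal estimate from a long chain run under the same kernel is the kind of agreement check that P1 describes. The enumeration scales as $4^{2m}$ in memory, which is why a check of this kind is only practical for small $m$.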

The result

P3: symmetry orthogonality is model-driven and survives. CONFIRMED. In the overlap-regime single-site eigh, every slowest mode has parity $-1$ (odd) and observable overlap $\le 3.5\times10^{-17}$ with the even $f_{ij}$, i.e. orthogonality to machine precision, now in the RBM. The timescale ordering $\tau_{int}[m] > \tau_{int}[f]$ holds for both kernels (median ratio $1.65$; $\sim 1.85$–$2.28$ for random $m=8$ across $\beta$). So exp1's Discovery (a) is not a single-site or fully-connected artifact.
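The vanishing overlap is a symmetry identity rather than a numerical coincidence. A short derivation, assuming the energy is invariant under the global flip $F:(v,h)\mapsto(-v,-h)$ (true for $E = -v^\top W h$ with no bias terms, which is an assumption about the model here), so that $\pi(Fs) = \pi(s)$: take a mode $g$ with $g(Fs) = -g(s)$ and an even observable, $f_{ij}(Fs) = f_{ij}(s)$. Relabeling the sum over states by the bijection $s \mapsto Fs$ gives

$$\langle g, f_{ij}\rangle_\pi = \sum_s \pi(s)\, g(s)\, f_{ij}(s) = \sum_s \pi(Fs)\, g(Fs)\, f_{ij}(Fs) = -\sum_s \pi(s)\, g(s)\, f_{ij}(s) \;\Longrightarrow\; \langle g, f_{ij}\rangle_\pi = 0.$$

Centering $f_{ij}$ does not change this, because a constant is itself even. An odd mode therefore carries zero spectral weight in the autocorrelation of any even observable, for any kernel that commutes with $F$ (so that its modes have definite parity). The reported $3.5\times10^{-17}$ is the floating-point residue of this exact cancellation.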

P2: weak-coupling degeneracy is kernel-driven. CONFIRMED (odd channel). Block-Gibbs lifts the degeneracy: $\tau_{int}[m]$ is $\le\tfrac12$ the single-site value in 77% of weak-$\beta$ cells (median ratio 0.48, $n=30$, $\beta=0.3$). The block advantage is strongest at weak $\beta$ ($0.54$) and fades toward strong $\beta$ ($0.68$), as expected: at strong $\beta$ the slow modes are metastable basins, which block updates also struggle to escape. The automated probe used the even $\tau_{int}[f]$ and read FAIL (median $0.78$, never $\le\tfrac12$) because $\tau_f$ is symmetry-decoupled from the odd weak-coupling modes, which is exactly P3. The wrong-observable failure therefore re-confirms P3 rather than refuting P2. exp1's caveat that "$\sigma = 1 - 1/N$ is the single-site kernel's artifact" is thus resolved: the degeneracy is kernel-driven, in the sector P3 identifies.
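A worked limit illustrates why an odd-channel ratio near one half is plausible at weak coupling. It assumes a heat-bath single-site update and reads $N$ as the number of spins ($N = 2m$); neither is stated in this entry. At $\beta = 0$ a random-scan step resamples one of the $N$ spins uniformly, so every single-spin function $s_k$ (and hence the magnetization) satisfies $P s_k = (1 - 1/N)\, s_k$. Recording once per sweep of $N$ steps, the Sokal convention gives

$$\tau_{int}^{\text{single}} = \frac{1 + (1 - 1/N)^N}{1 - (1 - 1/N)^N} \;\xrightarrow{N \to \infty}\; \frac{1 + e^{-1}}{1 - e^{-1}} \approx 2.16, \qquad \tau_{int}^{\text{block}} = 1,$$

where the block value follows because at $\beta = 0$ one alternating sweep resamples every spin independently of the past. For $N = 8$, $(7/8)^8 \approx 0.344$, so $\tau_{int}^{\text{single}} \approx 2.05$ and the block/single ratio is about $0.49$. This is the same order as the weak-$\beta$ ratios reported above, though those cells sit at $\beta = 0.3$ rather than $\beta = 0$, so the limit is a consistency check and not a derivation of the measured values.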

P4: observable-relevant predictor tracks at scale; naive fails where symmetry matters. PASS (sharp conditional). $Q_{struct}$ (observable-relevant) tracks $Q_{op}$ in 92–99% of cells across $m = 4 \to 16$, for both kernels, including $m \ge 8$ beyond the spectrum-overlap regime, so it extrapolates. The naive odd-$\tau_m$ predictor (the un-projected DTM-$r_{yy}$ diagnostic) tracks only 14% of cells when $\tau_m/\tau_f > 2.5$ (median ratio $0.14$, max separation $71.8$), versus 90% when $\tau_m/\tau_f < 1.5$. It fails precisely in the symmetry-separated regime. Calibration against exp1: exp1's naive diagnostic on the fully-connected model blew up by $10^{26}$–$10^{30}$ (an $R$-formula divide-by-zero); exp2's naive diagnostic stays finite and fails by the $\tau_m/\tau_f$ ratio. Same mechanism (the wrong, symmetry-odd mode), different magnitude.
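The mistiming can be illustrated on a surrogate, without any RBM. In the sketch below two AR(1) series stand in for a fast even observable and a slow odd scalar; the AR(1) model, the coefficients 0.5 and 0.9, the run sizes, and all names are assumptions for illustration, not the source's data or its definitions of $Q_{op}$ and $Q_{struct}$. The variance of the sample mean across seeds is compared with $\tau_{int}\cdot\mathrm{Var}/K$ using the observable's own $\tau$ and then the slower scalar's $\tau$.

```python
import numpy as np

def ar1_tau_int(phi):
    """Exact tau_int = 1 + 2 * sum_t phi**t for a stationary AR(1) series."""
    return (1.0 + phi) / (1.0 - phi)

def multi_seed_var_of_mean(phi, K, n_seeds, rng):
    """Empirical variance of the length-K sample mean over independent chains."""
    x = rng.normal(size=n_seeds)            # stationary start, unit variance
    total = np.zeros(n_seeds)
    for _ in range(K):
        total += x
        x = phi * x + np.sqrt(1.0 - phi**2) * rng.normal(size=n_seeds)
    return np.var(total / K)

rng = np.random.default_rng(0)
K, n_seeds = 2000, 2000
phi_fast, phi_slow = 0.5, 0.9               # hypothetical "f-like" and "m-like" series

empirical = multi_seed_var_of_mean(phi_fast, K, n_seeds, rng)
pred_right = ar1_tau_int(phi_fast) * 1.0 / K    # tau_f * Var / K, with Var = 1
pred_wrong = ar1_tau_int(phi_slow) * 1.0 / K    # tau_m substituted for tau_f

print("empirical / right prediction:", empirical / pred_right)
print("wrong / right prediction    :", pred_wrong / pred_right)
```

With the observable's own $\tau$ the prediction should match the multi-seed variance up to sampling noise, while substituting the slower scalar's $\tau$ inflates it by exactly $\tau_{slow}/\tau_{fast}$, which is $19/3$ for these coefficients.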

THRML fidelity: CONFIRMED. thrml==0.1.3 resolved on PyPI and installed cleanly in an isolated venv (jax 0.10.1). On one shared RBM (random, $m=4$, $\beta=0.8$, identical $W$), thrml's bipartite block-Gibbs (IsingEBM + IsingSamplingProgram) reproduced the self-contained sampler's $\tau_{int}[E]$: 1.631 vs 1.653, a 1.4% difference (AGREE). The kernel structure is faithful to Extropic's library; thrml was used vanilla, with no patch (thrml_xcheck.py).

Scope and caveats

What this does not show: anything about asymptotic prevalence in trained DTMs. These are controlled small–moderate RBMs ($m \le 16$, CPU smoke test), so positive statements read "on controlled small–moderate RBMs," never "validated for DTM training." The $Q_{struct} \leftrightarrow Q_{op}$ consistency is near-tautological (same chains; a median ratio of $\sim 1.0$ confirms the MCMC bookkeeping); the genuinely new content is P2 (kernel-dependence), P3 (kernel-independence of the symmetry fact), and the naive diagnostic's sharp conditional failure. Block-kernel parity is MC-supported, not eigh-confirmed: the 2-block alternating sweep is non-reversible, so it is not diagonalized here, and the decoupling claim rests on the consistent $\tau_m > \tau_f$ pattern across all block cells. No conjectured → validated flip: the evidence is empirical / construction-confirmed only, and the factorization stays [conjectured].

What this feeds: resolves the exp1 kernel-caveat (Risk 1) and annotates the validation hook. The DTM $r_{yy}$ diagnostic must be measured on the even gradient observables, since an odd-scalar $r_{yy}$ mistimes the gradient SNR by the $\tau_m/\tau_f$ ratio. exp3 tests prevalence in trained DTMs.