Thermodynamic Machine Learning · MMXXVI
Experiment · 27.V.MMXXVI · Read 4 min

Exp 2 — Block-Gibbs RBM: The Findings Survive

Entry 1

Re-run exp1's mechanism on a different model (a bipartite RBM) and a different kernel (2-block-Gibbs) to ask which discoveries are properties of the physics and which were artifacts of the sampler.

The question

experiments/exp1-exact-diag/ established three things on a fully-connected Ising chain by exact diagonalization: the slowest mode is symmetry-odd and observable-orthogonal to the gradient (P3); a weak-coupling spectral degeneracy $\sigma = 1 - 1/N$ exists but might be a single-site-kernel artifact (P2); and an observable-relevant predictor tracks the operational $Q_{op}$ while a naive single-scalar diagnostic blows up (P4). The open caveat was kernel-dependence: a fully-connected chain under single-site Metropolis is not a thermodynamic-machine sampler. This entry tests whether the findings survive the move to a bipartite RBM sampled by 2-block-Gibbs, $m \le 16$.

The setup

Self-contained numpy block-Gibbs is the primary instrument; thrml is an optional gated fidelity check. The run (experiment.py) covers 240 cells in 333 s, pre-committed and frozen at ce17b56. Units are fixed for cross-kernel fairness: 1 sweep == $2m$ single-spin updates (a single-site sweep is $2m$ random-scan steps; a block sweep is one alternating visible/hidden update), so $K$, $B$, and $\tau_{int}$ are all in sweeps. The gradient observables are the even RBM couplings $f_{ij} = -v_i h_j$; the generic/odd scalar is the magnetization-like $m$. Sokal convention: $\tau_{int} = 1 + 2\sum \rho$, with $Var[\bar f] = \tau_{int}\cdot Var/K$.
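The block sweep described above can be sketched in a few lines of numpy. This is a minimal illustration, not the code in experiment.py: it assumes ±1 spins, zero biases, and a Boltzmann weight $\exp(\beta\, v^\top W h)$, and the names block_gibbs_sweep and run_chain are hypothetical.

```python
import numpy as np

def block_gibbs_sweep(v, h, W, beta, rng):
    """One alternating sweep: resample all hidden spins, then all visible spins."""
    # Assumed model: spins are +/-1, no biases, p(v, h) ~ exp(beta * v @ W @ h).
    # Then P(h_j = +1 | v) = sigmoid(2 * beta * (v @ W)_j), and likewise for v.
    p_h = 1.0 / (1.0 + np.exp(-2.0 * beta * (v @ W)))
    h = np.where(rng.random(h.shape) < p_h, 1, -1)
    p_v = 1.0 / (1.0 + np.exp(-2.0 * beta * (W @ h)))
    v = np.where(rng.random(v.shape) < p_v, 1, -1)
    return v, h

def run_chain(W, beta, n_sweeps, seed=0):
    """Record the even observables f_ij = -v_i h_j and the odd scalar m per sweep."""
    rng = np.random.default_rng(seed)
    n_v, n_h = W.shape
    v = rng.choice([-1, 1], size=n_v)
    h = rng.choice([-1, 1], size=n_h)
    f = np.empty((n_sweeps, n_v, n_h))
    mag = np.empty(n_sweeps)
    for t in range(n_sweeps):
        v, h = block_gibbs_sweep(v, h, W, beta, rng)
        f[t] = -np.outer(v, h)                       # even under a global spin flip
        mag[t] = (v.sum() + h.sum()) / (n_v + n_h)   # odd under a global spin flip
    return f, mag
```

Each call to block_gibbs_sweep counts as one sweep in the units above, so the recorded traces can be fed directly to a per-sweep autocorrelation estimator.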

P1 validity gate first. On single-site overlap cells ($m \in \{4,6\}$), the MC-estimated aggregate $\tau$ agrees with the exact joint-eigh $\tau$ to median 2.2% rel-err ($n=48$, well under the 25% bar). The at-scale estimators ($Q_{op}$ via multi-seed MSE, $\tau_{int}$ via Sokal) therefore have a validated ground truth. PASS.
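For reference, a sketch of a Sokal-style $\tau_{int}$ estimator in the $1 + 2\sum \rho$ convention used here. The automatic window rule (stop at the first lag $M$ with $M \ge c\,\tau_{int}(M)$) and the constant $c = 5$ are assumptions, since the entry does not state how the sum is truncated; the function names are illustrative.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation rho(t) of a 1-D series, computed via FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    spectrum = np.fft.rfft(x, 2 * n)          # zero-pad to avoid wrap-around
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    return acf / acf[0]

def tau_int_sokal(x, c=5.0):
    """tau_int = 1 + 2 * sum_{t=1}^{M} rho(t), with M chosen self-consistently."""
    rho = autocorrelation(x)
    tau = 1.0 + 2.0 * np.cumsum(rho[1:])      # tau[M-1] uses lags 1..M
    lags = np.arange(1, tau.size + 1)
    ok = lags >= c * tau                       # assumed window rule, c is a choice
    cut = int(np.argmax(ok)) if ok.any() else tau.size - 1
    return float(tau[cut])

def variance_of_mean(x, c=5.0):
    """Var[mean] = tau_int * Var / K, with K the number of sweeps in the series."""
    x = np.asarray(x, dtype=float)
    return tau_int_sokal(x, c) * x.var() / x.size
```

A quick sanity check is an AR(1) series $x_t = a\,x_{t-1} + \epsilon_t$, whose exact value in this convention is $(1+a)/(1-a)$.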

The result

P3: symmetry orthogonality is model-driven, survives. CONFIRMED. In the overlap-regime single-site eigh, every slowest mode has parity $-1$ (odd) and observable overlap $\le 3.5\times10^{-17}$ with the even $f_{ij}$: numerically exact orthogonality, now in the RBM. The timescale ordering $\tau_{int}[m] > \tau_{int}[f]$ holds for both kernels (median ratio $1.65$; $\sim 1.85$–$2.28$ for random $m=8$ across $\beta$). So exp1's Discovery (a) is not a single-site or fully-connected artifact.
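The parity check can be reproduced on a toy RBM by brute force. The sketch below assumes a random-scan heat-bath (Gibbs) single-site kernel and a bias-free RBM, so that the global flip $(v, h) \to (-v, -h)$ is an exact symmetry; the entry does not say which single-site kernel the overlap cells used, and the function name is hypothetical. It builds the transition matrix over all $2^N$ states, symmetrizes it with the stationary weights, and reads off the parity of the slowest mode and its overlap with one even observable $f_{00} = -v_0 h_0$.

```python
import numpy as np

def slowest_mode_parity(W, beta):
    """Return (eigenvalue, parity, overlap with f_00) of the slowest mode."""
    n_v, n_h = W.shape
    n = n_v + n_h
    size = 2 ** n
    idx = np.arange(size)
    spins = (2 * ((idx[:, None] >> np.arange(n)) & 1) - 1).astype(float)
    v, h = spins[:, :n_v], spins[:, n_v:]
    energy = -np.einsum("si,ij,sj->s", v, W, h)    # E = -v^T W h, no biases
    pi = np.exp(-beta * energy)
    pi /= pi.sum()

    # Random-scan heat bath: pick one of n spins, resample it from its conditional.
    T = np.zeros((size, size))
    for x in range(size):
        for k in range(n):
            y = x ^ (1 << k)                        # state with spin k flipped
            T[x, y] = pi[y] / (pi[x] + pi[y]) / n
        T[x, x] = 1.0 - T[x].sum()

    # Detailed balance makes D^(1/2) T D^(-1/2) symmetric, so eigh applies.
    root = np.sqrt(pi)
    S = root[:, None] * T / root[None, :]
    evals, evecs = np.linalg.eigh(0.5 * (S + S.T))
    u = evecs[:, -2]                                # evals[-1] == 1 is stationary

    flipped = (size - 1) - idx                      # index of the globally flipped state
    parity = float(u @ u[flipped])                  # +1 even, -1 odd

    f = -(v[:, 0] * h[:, 0])
    f_centered = f - pi @ f
    overlap = float((root * f_centered) @ u / np.sqrt(pi @ f_centered ** 2))
    return float(evals[-2]), parity, overlap
```

For a small random $W$ at weak $\beta$ one would expect a parity near $-1$ and an overlap at machine precision, but the outcome depends on the draw, so the function reports both rather than assuming them.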

P2: weak-coupling degeneracy is kernel-driven. CONFIRMED (odd channel). Block-Gibbs lifts the degeneracy: $\tau_{int}[m]$ is $\le\tfrac12$ the single-site value in 77% of weak-$\beta$ cells (median ratio 0.48, $n=30$, $\beta=0.3$). The block advantage is strongest at weak $\beta$ ($0.54$) and fades toward strong $\beta$ ($0.68$), as expected: strong-$\beta$ slow modes are metastable basins that the block kernel also struggles with. The automated probe used the even $\tau_{int}[f]$ and read FAIL (median $0.78$, never $\le\tfrac12$) because $\tau_f$ is symmetry-decoupled from the odd weak-coupling modes, which is exactly P3. The wrong-observable failure re-confirms P3 rather than refuting P2. exp1's "$\sigma = 1 - 1/N$ is the single-site kernel's artifact" caveat is thus resolved: kernel-driven, in the sector P3 identifies.
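Why a degeneracy at $1 - 1/N$ is tied to single-site updates can be seen in the $\beta \to 0$ limit. The worked limit below assumes a random-scan heat-bath single-site update on $N = 2m$ spins $s_i = \pm 1$ and one sample per sweep; it is an idealized illustration, not a derivation of the measured ratios.

$$
\mathbb{E}[s_i' \mid s] = \Bigl(1 - \tfrac{1}{N}\Bigr) s_i + \tfrac{1}{N}\cdot 0
\;\;\Rightarrow\;\;
\sigma = 1 - \tfrac{1}{N} \quad \text{for each of the } N \text{ odd single-spin modes,}
$$

$$
\lambda_{single} = \Bigl(1 - \tfrac{1}{N}\Bigr)^{N} \approx e^{-1} \ \text{per sweep},
\qquad
\lambda_{block} = 0 \ \text{per sweep},
$$

$$
\tau_{int} = 1 + 2\sum_{t \ge 1} \lambda^{t} = \frac{1+\lambda}{1-\lambda}
\;\;\Rightarrow\;\;
\frac{\tau_{int}^{block}[m]}{\tau_{int}^{single}[m]} = \frac{1-\lambda_{single}}{1+\lambda_{single}} \approx 0.46\text{–}0.49 \quad (8 \le N \le 32).
$$

At $\beta = 0$ a block sweep redraws every spin independently, so the $N$-fold degenerate odd sector collapses to eigenvalue zero, while the even observables are unaffected by this argument. The limiting ratio is of the same order as the odd-channel ratios reported above, but those were measured at $\beta = 0.3$, so the agreement is only suggestive.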

P4: observable-relevant predictor tracks at scale; naive fails where symmetry matters. PASS (sharp conditional). $Q_{struct}$ (observable-relevant) tracks $Q_{op}$ in 92–99% of cells across $m = 4 \to 16$, both kernels, including $m \ge 8$ beyond the spectrum-overlap regime, so it extrapolates. The naive odd-$\tau_m$ predictor (the un-projected DTM-$r_{yy}$ diagnostic) tracks only 14% of cells when $\tau_m/\tau_f > 2.5$ (median ratio $0.14$, max separation $71.8$), versus 90% when $\tau_m/\tau_f < 1.5$. It fails precisely in the symmetry-separated regime. Calibration against exp1: exp1's fully-connected naive blew up by $10^{26}$–$10^{30}$ (an $R$-formula divide-by-zero); exp2's naive is finite and fails by the $\tau_m/\tau_f$ ratio. Same mechanism (wrong, symmetry-odd mode), different magnitude.
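The size of the naive error follows from the Sokal relation $Var[\bar f] = \tau_{int}\cdot Var/K$ in the setup. A small sketch, using hypothetical numbers that are not taken from the run, of how many sweeps $K$ a target standard error appears to need when $\tau_m$ is substituted for $\tau_f$:

```python
import math

def sweeps_for_target_se(var, tau_int, target_se):
    """Smallest K with tau_int * var / K <= target_se**2 (Sokal convention)."""
    return math.ceil(tau_int * var / target_se ** 2)

# Hypothetical values for illustration only; not measurements from exp2.
var_f, tau_f, tau_m = 1.0, 2.0, 6.0

k_even = sweeps_for_target_se(var_f, tau_f, target_se=0.01)   # correct timescale
k_naive = sweeps_for_target_se(var_f, tau_m, target_se=0.01)  # odd-scalar timescale

# The sweep budget is misjudged by the factor tau_m / tau_f.
print(k_even, k_naive, k_naive / k_even)
```

Because $K$ scales linearly with $\tau_{int}$, the misjudgment equals the $\tau_m/\tau_f$ ratio whatever the variance or target error.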

THRML fidelity: CONFIRMED. thrml==0.1.3 resolved on PyPI and installed cleanly in an isolated venv (jax 0.10.1). On one shared RBM (random, m=4, β=0.8, identical $W$), thrml's bipartite block-Gibbs (IsingEBM + IsingSamplingProgram) reproduced the self-contained sampler's $\tau_{int}[E]$: 1.631 vs 1.653, a 1.4% difference (AGREE). The kernel structure is faithful to Extropic's library; thrml was used vanilla, no patch (thrml_xcheck.py).

Scope and caveats

What this does not show: anything about asymptotic prevalence in trained DTMs. These are controlled small–moderate RBMs ($m \le 16$, CPU smoke test). Positive statements read "on controlled small–moderate RBMs," never "validated for DTM training." The $Q_{struct} \leftrightarrow Q_{op}$ consistency is near-tautological (same chains; median ratio $\sim 1.0$ confirms MCMC bookkeeping); the genuinely new content is P2 (kernel-dependence), P3 (kernel-independence of the symmetry fact), and the naive diagnostic's sharp conditional failure. Block-kernel parity is MC-supported, not eigh-confirmed: the 2-block alternating sweep is non-reversible, so it is not diagonalized here; decoupling rests on the consistent $\tau_m > \tau_f$ pattern across all block cells. No conjectured → validated flip: empirical / construction-confirmed only; the factorization stays [conjectured].


What this feeds: resolves the exp1 kernel-caveat (Risk 1) and annotates the validation hook. The DTM $r_{yy}$ diagnostic must be measured on the even gradient observables, since an odd-scalar $r_{yy}$ mistimes the gradient SNR by the $\tau_m/\tau_f$ ratio; exp3 tests prevalence in trained DTMs.

— fin. —