Thermodynamic Machine Learning · MMXXVI
Proof1.VI.MMXXVIRead 4 min

The Proof Program: O1–O6 and the Conditional Factorization

Entry 4

This is the diagnostic ledger that enumerates exactly what must be true for the corrected predictor to factorize — and is honest that all but one piece is textbook reversible-MCMC math.

The question

The spine conjectures that the operational gradient-SNR equals the structural predictor: Qop=g2/Eg^g2Qstruct(θ,K)Q_{op} = \lVert g\rVert^2 / \mathbb{E}\lVert \hat{g}-g\rVert^2 \asymp Q_{struct}^{\perp}(\theta,K), with Qstruct=(K/2)g2/TOQ_{struct}^{\perp} = (K/2)\lVert g\rVert^2 / T_O. The question this entry answers is bookkeeping, not physics: which sub-claims would carry that factorization from [conjectured] toward [proven-here], and which of them are actually discharged? The source synthesis/proof-sketch-q-struct-perp.md is a diagnostic page — it moves no tag itself.

The setup: the assumption package A1–A9

The factorization is asserted only in a fixed-θ\theta, finite-dimensional, reversible regime. The package is nine tagged assumptions:

  • A1 fixed θ\theta; A2 πθ\pi_\theta-reversibility (self-adjoint PθP_\theta, real spectrum — single-site random-scan Gibbs and random-scan 2-block; not alternating-scan); A3 irreducibility + aperiodicity (σ1=1>σ2σd0\sigma_1=1>\sigma_2\geq\dots\geq\sigma_d\geq 0); A4 observable regularity (faL2(πθ)f_a\in L^2(\pi_\theta)).
  • A5 burn-in adequacy Bc/γeffB\gtrsim c/\gamma_{eff}; A6 window adequacy, sharpened to Kτint[fa]K \gg \tau_{int}[f_a] for the slowest aa (finite-window relative error 1/(γK)=1/(2ESS)\approx 1/(\gamma K)=1/(2\cdot\mathrm{ESS})).
  • A7 overlapping-bulk relaxation — a rate condition γbulk:=1maxjC(O),μσ>0σjΩ(1)\gamma_{bulk}:=1-\max_{j\notin C^*(O),\,\mu_\sigma>0}\sigma_j\geq\Omega(1), not an eigenvalue-ordering gap (the cluster and bulk σ\sigma-ranges may interleave; that gap is <0<0 on exp1). The μσ>0\mu_\sigma>0 restriction is essential or A7 is vacuous on its own motivating instances.
  • A8 threshold robustness (C(O)C^*(O) insensitive to ε\varepsilon); A9 symmetry-compatibility (a finite group GG with invariant πθ\pi_\theta, equivariant PθP_\theta, invariant faf_a — canonically the free global Z2\mathbb{Z}_2 flip; independent of A2).

experiments/exp1-exact-diag/ satisfies A9 only for the even pairwise observables under random-scan single-site Gibbs — not the odd bias gradients nor the b0b\neq 0 follow-up.

The result: O1 proven-here, O2–O6 solid

The six obligations carry explicit dependencies. O1 is foundational; the rest reference its projection-vs-conditioning resolution.

  • O1 — projection in L2(πθ)L^2(\pi_\theta), not state-space conditioning. The observable-projection ΠCu=jC(O)u,φjπφj\Pi_C u = \sum_{j\in C^*(O)}\langle u,\varphi_j\rangle_\pi\,\varphi_j is an orthogonal function-subspace projection under the single unmodified measure πθ\pi_\theta — categorically different from conditioning on a sector of configurations. For the global Z2\mathbb{Z}_2 flip there is no fixed-point set in Ω={1,1}N\Omega=\{-1,1\}^N, so "πeven\pi|_{even}" is a category error; even gradients live in H+H^+, the odd slow mode φ2\varphi_2 in HH^-, with fa,φπ=0\langle f_a,\varphi_-\rangle_\pi=0. The dedicated O1 entry holds the rigorous argument; the load-bearing lemma O1.c (fundamental-domain invariance of the stationary SNR objects, via an intertwining isometry making TOT_O, g2\lVert g\rVert^2, QstructQ_{struct}^{\perp} identical term-by-term) is the proven-here lemma — the wiki's first terminal tag. O1.a/O1.b are [solid]. Empirically exp1+exp2 measured fa,φ2π3.5×1017\langle f_a,\varphi_2\rangle_\pi \leq 3.5\times10^{-17} — confirmation the HH-projection form works, not proof that conditioning is unnecessary.

O2–O6 are a compact summary — each is textbook reversible-MCMC math, graded [solid], NOT proven-here, each derived on its own page:

  • O2 — the fixed-θ\theta MCMC-CLT: Sa(θ)=2τint[fa]Varπ[fa]S_a(\theta)=2\tau_{int}[f_a]\,\mathrm{Var}_\pi[f_a], aSa=2TO\sum_a S_a = 2T_O, so TOT_O is half the aggregate asymptotic variance. The finite-window correction 1/(γK)\approx 1/(\gamma K) sharpens A6.
  • O3 — bias subdominance: Biasa2/(Sa/K)m^j2σ2(B+1)/(4ESS)=o(1)\mathrm{Bias}_a^2/(S_a/K)\approx \hat{m}_{j^*}^2\sigma_*^{2(B+1)}/(4\cdot\mathrm{ESS})=o(1); the window-averaging (A6) carries subdominance, so A5 is sufficient but not the binding gate.
  • O4[ΠC,Pθ]=0[\Pi_C, P_\theta]=0 on the reversible eigenbasis (membership criterion is reversibility, not sampler name — MALA/full-refresh HMC are in, ULA/alternating-scan out); degeneracy is repaired by the basis-invariant spectral-subspace definition.
  • O5 — the harmonic-mean identity TO=(μC/γeff)(1γeff/2)(1+o(1))T_O = (\mu_C/\gamma_{eff})(1-\gamma_{eff}/2)(1+o(1)), off-cluster tail subdominant as O(γeff/ε)O(\gamma_{eff}/\varepsilon) under A7.
  • O6 — the aggregation closure: for the Euclidean QopQ_{op}, Eg^g2=aMSEa\mathbb{E}\lVert\hat{g}-g\rVert^2=\sum_a \mathrm{MSE}_a is exact and regime-free (the trace drops the real-but-invisible off-diagonal Cov[g^a,g^b]\mathrm{Cov}[\hat{g}_a,\hat{g}_b]), assembling Qop=Qstruct(1+o(1))Q_{op}=Q_{struct}^{\perp}(1+o(1)) in regime A1–A8 + plateau + F4.

Scope and caveats

After all six closed, the spine tag was split by researcher conferral, not by self-flip: the conditional factorization (regime QopQstruct\Rightarrow Q_{op}\approx Q_{struct}^{\perp}) is [solid]; the operational/unconditional claim stays [conjectured]. The gates are open at scale — exp3 fired F1 (median ratio 2.13\approx 2.13, under-equilibrated), C(O)7k12k\lvert C^*(O)\rvert \sim 7\text{k}-12\text{k}, and its alternating-block kernel is F3 (so A2 also fails). The factorization is a conditional assembly over the single proven-here lemma; this page is diagnostic and no tag flip happens here.

The six falsifiers are the operational handles for exp3: F1 finite-KK bias dominates TO/KT_O/K (ratio >1.5>1.5); F2 HH-projection vs fundamental-domain SNR diverge (1\gg 1, O1 fails); F3 non-reversibility breaks the spectral-sector argument (tracking degrades >O(20%)> O(20\%)); F4 positive-phase MSE comparable to 2TO/K2T_O/K; F5 moving-θ\theta scope-falsifier (Δθτint/K>0.1\lVert\Delta\theta\rVert\tau_{int}/K>0.1); F6 threshold-ε\varepsilon instability for C(O)C^*(O).


What this feeds: the operational claim now rests entirely on closing A7 and KτintK\gg\tau_{int} at scale — the exp3 redesign targets F1/F3 directly, since those are the gates exp3 already tripped.

Sources

  • Younes (1999), §7 Thm 3 — the fixed-θ\theta MCMC-CLT variance object S(θ)S(\theta) grounding TOT_O (object provenance only, no SNR factorization).
  • Yuille (2004) — CD-from-data endpoint bias eigen-expansion (Risk 6 row; outside A5).
  • Levin & Peres (2017), Markov Chains and Mixing Times — reversible-chain eigenbasis, geometric bias, time-average variance (the [solid] backbone for O2–O5).
— fin. —