Framework 15 · Audited Research Program · May 2026

Framework 15.
The Audited Program.

The Semantic Deviation research program in externally-evaluable form: three canonical operationalizations of the semantic field, four pre-registered falsifiable predictions, mechanism-design anti-Goodhart machinery, and sustained citational grounding in alignment, mechanistic interpretability, and causal-inference literature.

The canonical entry to Framework 15 is EA-GLAS-02, a self-contained empirical white paper presenting the measurement program for the Semantic Deviation Principle. Two primary operationalizations of the semantic field. Pre-registered falsifiable predictions with named datasets and frozen checkpoints. A DPO training experiment. Six mechanism-design anti-Goodhart protections. 42 references across alignment, interpretability, psycholinguistics, and causal inference. A reader who has never encountered the surrounding apparatus can read the paper, evaluate its claims, and act on its roadmap.

Canonical: EA-GLAS-02 v1.0 · Author: Nobel Glas · DOI: 10.5281/zenodo.20271783 · License: CC BY 4.0 · Length: ~4,200 words · 42 references

§I — The Audit

EA-GLAS-02: Measuring Semantic Deviation: Operationalizations, Experiments, and Falsification Conditions.

EA-GLAS-02 takes the Semantic Deviation Principle (Sharks 2026) and the program's four pre-registered protocol papers as input, and returns a narrowed, citationally grounded, externally evaluable statement of the technical core. It does not amend the founding formulation. It does not depend on the institutional architecture that has accreted around the formulation. It is the canonical entry to Framework 15.

Canonical Object

Measuring Semantic Deviation: Operationalizations, Experiments, and Falsification Conditions

EA-GLAS-02 v1.0 · ~4,200 words · 42 references · DOI 10.5281/zenodo.20271783

A self-contained empirical white paper. Defines meaning as time-integrated divergence from the most probable trajectory of a semantic field (extending Bar-Hillel & Carnap 1953 into distributional and temporal domains). Two primary operationalizations: F1 (closed-system trajectory deviation, counterfactual read from logits) and F2 (retrieval response deviation, 90-day prospective window). Signed per-token deviation as tractable proxy. Falsifiable prediction: AI-generated text exhibits negative mean signed deviation. DPO training experiment using the deviation primitive to generate preference pairs. Six mechanism-design anti-Goodhart protections mapped to the Manheim & Garrabrant (2019) taxonomy. Pre-registered cheapest dangerous test with named datasets, frozen reference checkpoints, and statistical procedures. Budgeted twelve-month roadmap. 42 references.

What the audit pins.

Three canonical operationalizations of the semantic field.

The framework's load-bearing technical gap was the underspecification of the semantic field $\Psi_t(C)$. The audit pins three canonical operationalizations, each with full commitment to divergence functional, temporal weighting, and horizon.

F1
Closed-system continuation field

The conditional next-token distribution of a fixed language model checkpoint over a fixed prompt set. Divergence: KL over the softmax of the logits (exact, base-2). Horizon: discrete token positions in a bounded generation window. The counterfactual baseline is not estimated; it is read directly from the logits.

Most directly computable · Cost: commodity hardware, hours
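
A minimal sketch of the F1 divergence, assuming a toy four-token vocabulary and hypothetical logit values. The direction of the KL (intervened against baseline) is an assumption made for illustration; the text only fixes the divergence as KL in base 2 over the softmax of the logits.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_bits(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits at one position (illustrative values only).
baseline_logits = [2.0, 1.0, 0.5, -1.0]    # frozen checkpoint, original context
intervened_logits = [0.5, 2.5, 0.5, -1.0]  # same checkpoint, context with the intervention

p = softmax(intervened_logits)
q = softmax(baseline_logits)
divergence = kl_bits(p, q)
print(f"KL(intervened || baseline) = {divergence:.4f} bits")
```

Because both distributions come from the same frozen checkpoint, no baseline has to be estimated, which is the property the F1 description relies on.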
F2
Retrieval response field

Response distributions of external AI retrieval surfaces (Google AI Overview, ChatGPT with browsing, Perplexity) to a fixed query set, sampled at fixed time intervals. Divergence: Jensen-Shannon over claim-level or embedding-level representations. Horizon: 90-day default; continuous calendar time. Instrumentation-noise-sensitive; requires control surfaces and version logging.

Direct empirical access · Cost: modest API budget; calendar-time-bound
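
A minimal sketch of the Jensen-Shannon divergence named above, assuming claim-level distributions have already been extracted from sampled responses. The two distributions are hypothetical, and reporting in bits is an assumption carried over from the F1 convention.

```python
import math

def _kl_bits(a, b):
    """KL(a || b) in bits; terms with a_i == 0 contribute zero."""
    return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

def js_bits(p, q):
    """Jensen-Shannon divergence in bits between two distributions on the same support."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * _kl_bits(p, m) + 0.5 * _kl_bits(q, m)

# Hypothetical share of sampled responses asserting each of three claims.
day_0 = [0.70, 0.20, 0.10]
day_90 = [0.40, 0.35, 0.25]

shift = js_bits(day_0, day_90)
print(f"JS(day 0, day 90) = {shift:.4f} bits")
```

Unlike KL, this quantity is symmetric and bounded by 1 bit, which makes values from different query sets easier to compare.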
F3
Citation graph field

Forward-citation distribution over a paper corpus, evaluated through bibliometric data (OpenAlex, Semantic Scholar). Divergence: Jensen-Shannon over topic-cluster forward-citation distributions; inverse-time weighting $w(t) = 1/(t-t_0)$ default. Statistical-power constraints documented: single-paper interventions are typically underpowered at conventional $\alpha$; aggregate interventions or Bayesian hierarchical pooling required.

Long-horizon · Post-hoc · Requires bibliometric infrastructure
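
A minimal sketch of the inverse-time weighting $w(t) = 1/(t - t_0)$, assuming yearly divergence values have already been computed. Aggregating them as a weighted sum is an assumption made for illustration, and the numbers are hypothetical.

```python
def inverse_time_weight(t, t0):
    """w(t) = 1 / (t - t0); defined only for t strictly after the intervention time t0."""
    if t <= t0:
        raise ValueError("t must be later than t0")
    return 1.0 / (t - t0)

def weighted_deviation(divergences, t0):
    """Sum of w(t) * D_t over (t, D_t) pairs (assumed aggregation rule)."""
    return sum(inverse_time_weight(t, t0) * d for t, d in divergences)

# Hypothetical yearly JS divergences for one topic cluster after a 2020 intervention.
yearly = [(2021, 0.12), (2022, 0.10), (2023, 0.09)]
score = weighted_deviation(yearly, t0=2020)
print(f"inverse-time weighted deviation = {score:.3f}")
```

Under this weighting, later years contribute progressively less, so early restructuring of the citation distribution dominates the score.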

The narrowed audited claim.

Meaning-bearing interventions are those that produce durable restructuring of future field trajectories under a specified operationalization $\Psi_t(C)$.

The audit narrows the universal-ontology form ("meaning is deviation") to a measurement-architecture form that survives the standard counterexamples (low-token-surprisal utterances of high semantic weight) by relocating them, not by collapsing. Meaning becomes field-relative; the field must be specified; "durable" becomes the operational joint condition $\mathcal{M}_T > \tau_F$ and $\partial \mathcal{M}_T / \partial T > 0$.
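
The joint durability condition can be sketched as a simple check, assuming $\mathcal{M}_T$ has been sampled at a few horizons. Estimating the derivative with a backward difference at the final horizon is an assumption, and all values are hypothetical.

```python
def is_durable(horizons, m_values, tau_f):
    """True when M_T > tau_F and dM_T/dT > 0 at the final horizon.

    The derivative is approximated by a backward difference (an assumption).
    """
    if len(horizons) < 2 or len(horizons) != len(m_values):
        raise ValueError("need at least two matched (T, M_T) samples")
    slope = (m_values[-1] - m_values[-2]) / (horizons[-1] - horizons[-2])
    return m_values[-1] > tau_f and slope > 0

days = [30, 60, 90]
growing = [0.10, 0.18, 0.25]  # above threshold and still rising
fading = [0.30, 0.26, 0.22]   # above threshold but decaying

print(is_durable(days, growing, tau_f=0.20))  # True
print(is_durable(days, fading, tau_f=0.20))   # False
```

The second case shows why both conditions are needed: a large but decaying deviation clears the threshold and still fails the durability test.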

Four pre-registered falsifiable predictions.

The audit specifies the cheapest dangerous test as a single short paper at under \$50 in compute. It pre-registers the corpora (GPT-wiki-intro / Bhat 2023, HC3 / Guo et al. 2023, OpenAlex pre-2020), the exact reference checkpoint (meta-llama/Llama-3.1-8B-Instruct), and the statistical procedure (two-sided Mann-Whitney U at $\alpha = 0.05$, minimum effect size Cohen's $d > 0.5$).
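
A minimal sketch of the stated statistical procedure, using only the Python standard library. The per-document scores are hypothetical, the p-value uses the normal approximation with tie correction and no continuity correction, and applying the effect-size floor to $|d|$ is an assumption. An exact routine from a statistics library may be preferable for small samples.

```python
import math
from statistics import NormalDist, mean, variance

ALPHA = 0.05      # significance level from the text
MIN_EFFECT = 0.5  # minimum Cohen's d from the text

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u1 - mu) / sigma
    return u1, 2 * (1 - NormalDist().cdf(abs(z)))

def cohens_d(x, y):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)

# Hypothetical per-document mean signed deviations (illustrative only).
slop = [-0.42, -0.35, -0.51, -0.28, -0.39, -0.45, -0.31, -0.48]
human = [0.05, -0.10, 0.12, -0.02, 0.08, 0.15, -0.06, 0.03]

u_stat, p_value = mann_whitney_u(slop, human)
d = cohens_d(slop, human)
passes = p_value < ALPHA and abs(d) > MIN_EFFECT
print(f"U = {u_stat}, p = {p_value:.5f}, d = {d:.2f}, passes = {passes}")
```

Requiring both the significance test and the effect-size floor guards against a result that is statistically detectable but too small to matter.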

P1 · The slop signature
Human-labeled AI slop exhibits statistically significant negative mean signed per-token deviation $\bar{\delta}$, relative to matched human-written content, computed against a frozen open-weight reference model.
P2 · The RLHF flattening differential
Post-RLHF chat-tuned models exhibit lower mean signed deviation than pre-RLHF base models on matched prompts. The deviation statistic captures the trajectory-flattening that the framework predicts cross-entropy convergence pressure produces.
P3 · Effect-size scaling
The Slop-vs-Human deviation differential is stable or grows with model scale across the Llama-3.1 family. If the differential disappears at scale, the framework's predictions are small-model artifacts.
P4 · Cross-judge consistency
The differential replicates when computed against a different reference model. Spearman rank correlation between per-output $\bar{\delta}$ rankings under Llama and Mistral exceeds 0.7. If not, the deviation statistic is judge-specific and the intrinsic-property claim fails.
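
The P4 check can be sketched as a Spearman rank correlation between per-output scores under two judges. The scores below are hypothetical; only the 0.7 threshold comes from the text.

```python
import math

P4_THRESHOLD = 0.7  # minimum rank correlation from the text

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[order[k]] = avg
        i = j
    return ranks

def spearman(a, b):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical mean signed deviation for five outputs under each reference model.
under_llama = [-0.40, -0.10, 0.05, 0.20, -0.25]
under_mistral = [-0.35, -0.15, 0.10, 0.02, -0.20]

rho = spearman(under_llama, under_mistral)
print(f"rho = {rho:.2f}, consistent = {rho > P4_THRESHOLD}")
```

A rank-based statistic fits here because the two judges may score on different scales while still agreeing on the ordering of outputs.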

Anti-Goodhart mechanism design.

The audit replaces philosophical anti-extractive commitment with six concrete mechanism-design protections, each operationally calibrated: entropy-floor capping at $H_{\min} = 0.5$ bits; provenance-weighted damping by retention score $\pi$; saturation threshold $\tau$ at the 95th percentile of an OpenAlex 10,000-document calibration corpus (pre-registered); rolling-window variance penalty against memetic volatility farming; reference-model KL anchoring inherited from standard DPO; adversarial judge validation at $\geq 1000$ strings per category across three failure modes; black-box judge replacement test as load-bearing robustness check.
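
Two of these protections can be sketched in a few lines. The text gives the constants ($H_{\min} = 0.5$ bits, 95th percentile); how they are applied is not specified here, so clipping scores at $\tau$ and excluding low-entropy positions are both assumptions, and the calibration scores are synthetic.

```python
import math

H_MIN_BITS = 0.5     # entropy floor stated in the text
SATURATION_PCT = 95  # percentile stated in the text

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]

def saturate(score, tau):
    """Assumed rule: scores above tau earn no additional credit."""
    return min(score, tau)

def entropy_bits(probs):
    """Shannon entropy of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def passes_entropy_floor(probs, h_min=H_MIN_BITS):
    """Assumed rule: positions with predictive entropy below the floor are not scored."""
    return entropy_bits(probs) >= h_min

# Synthetic stand-in for the calibration corpus scores.
calibration_scores = [i / 100 for i in range(1, 101)]
tau = percentile_nearest_rank(calibration_scores, SATURATION_PCT)

print(tau)                                 # 0.95
print(saturate(1.20, tau))                 # 0.95
print(passes_entropy_floor([0.99, 0.01]))  # False
```

Both rules limit the reward available from extreme values, which is the general shape of a defense against optimizing the metric rather than the property it measures.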

Component decomposition.

The audit specifies a six-condition ablation design (Model-Base, Model-CE, Model-π, Model-Dev, Model-Coh, Model-Full) to isolate the contribution of provenance, deviation, and coherence components to the framework's training-intervention uplift. The prior prediction, grounded in Ji et al. 2023's hallucination survey and Min et al. 2023's FActScore methodology: provenance carries more independent uplift than signed deviation. Either outcome is informative; the current bundled design produces neither.

§II — The Pre-Registered Protocols

The empirical apparatus the audit operates on.

Four pre-registered protocol papers supply the operational machinery the audit evaluates. Each is deposited at a stable DOI with falsification conditions frozen at deposit, code and judge-model commitments specified, and budgets honestly stated. They are presented here as the materials on which the audit operates; the audit is the canonical entry.

Founding formulation · Sharks v0.2 Final

The Semantic Deviation Principle

The founding formulation. Defines meaning as time-integrated divergence from the most-probable trajectory of a semantic field. Three measures: raw ($\mathcal{M}_T$), provenance-resolved ($\mathcal{M}_T^\pi$), normative ($\mathcal{V}_T$). Tiered protocol (Tier 1 prospective, Tier 2 synthetic-control, Tier 3 historical bounding). Recursive baselines for path-dependent semantic fields. The v0.2 Final text is preserved unchanged at its specific version DOI; the v2.0 operational re-edition adds a Framework 15 framing while leaving the principle text untouched.

Pre-registered protocol · MM-AI-01 v2.0

The AI System as Closed-System Test Bed

Identifies trained language models as observationally closed at inference time, making the counterfactual baseline directly readable from logits — the F1 operationalization. Distinguishes two scales of closed-system measurement: signed local deviation density (per-token) and closed-system trajectory deviation (continuation-distribution). The framework's load-bearing thesis: slop is negative net deviation, not the absence of deviation. Three pre-registered tests with explicit falsification conditions; stability bound $\gamma \geq 2\beta$.
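
A minimal sketch of a signed per-token deviation. This page does not state the formula, so the definition used here (surprisal of the emitted token minus the entropy of the predictive distribution, in bits) is an assumption chosen so that always emitting the most probable token yields a negative mean. The distribution is hypothetical.

```python
import math

def entropy_bits(probs):
    """Shannon entropy of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def signed_deviation(probs, token_index):
    """Assumed definition: surprisal of the emitted token minus expected surprisal."""
    surprisal = -math.log2(probs[token_index])
    return surprisal - entropy_bits(probs)

def mean_signed_deviation(steps):
    """Average over (predictive distribution, emitted token index) pairs."""
    return sum(signed_deviation(p, i) for p, i in steps) / len(steps)

# Hypothetical predictive distribution, reused at every position for simplicity.
dist = [0.7, 0.2, 0.1]

always_mode = [(dist, 0)] * 5                      # always the most probable token
mixed = [(dist, 0), (dist, 2), (dist, 1), (dist, 0)]

print(f"always-mode mean: {mean_signed_deviation(always_mode):+.3f}")
print(f"mixed mean:       {mean_signed_deviation(mixed):+.3f}")
```

Under this definition the deviation has zero expectation when tokens are sampled from the predictive distribution itself, so a negative mean indicates text that is systematically more predictable than the model's own uncertainty implies.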

Pre-registered protocol · MM-02 v2.0

Measuring Meaning in Retrieval Basins

A 90-day prospective measurement protocol for closed-system trajectory deviation against contemporary AI retrieval surfaces. Two instrument classes: Class R (retrieval-mediated: Google AI Overview, Perplexity, ChatGPT with browsing) reported separately from Class P (parametric: Claude, Gemini, ChatGPT without browsing). Three-condition control (S vs. S* vs. S**) disentangles content effects from identity-scaffolding effects. Frozen extractor commitment, three-representation robustness cross-check, API-only methodology, Laplace smoothing $\alpha = 1$. The audit documents the statistical-power constraints for single-paper synthetic-control measurements.
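
The Laplace smoothing step can be sketched as follows, assuming claim-level counts tallied from sampled responses. The counts are hypothetical; $\alpha = 1$ is the value given in the protocol description.

```python
def laplace_smooth(counts, alpha=1.0):
    """Add alpha pseudo-counts to every category, then normalize to probabilities."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

# Hypothetical number of sampled responses asserting each of three claims.
claim_counts = [8, 2, 0]
smoothed = laplace_smooth(claim_counts, alpha=1.0)
print(smoothed)
```

Smoothing keeps every category strictly positive, so a claim that happens not to appear in one sampling window does not produce a zero-probability cell in the divergence calculation.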

Pre-registered protocol · MM-AI-02 v2.0

The Deviation-Optimized Language Model

A 10-week pre-registered DPO experimental protocol testing whether training a language model toward positive net per-token deviation with provenance retention produces measurably less slop than standard cross-entropy training while preserving benchmark capability. DPO-style restructure (the deviation primitive generates preference pairs; DPO supplies the gradient machinery). Frozen Mistral-7B-Instruct judge with adversarial pre-training validation. Slop Composite Index (SCI) with 0.25 z-score falsification threshold pre-registered. Three conditions: Model-Base, Model-CE, Model-Sem. Honest budget: \$3,000–\$3,900. The audit proposes a six-condition decomposed follow-up to isolate component contributions.
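
A minimal sketch of the DPO-style restructure, assuming each candidate completion already has a mean signed deviation score. The pairing rule (prefer the higher score), the record layout, and $\beta = 0.1$ are assumptions; the loss is the standard DPO objective for a single pair.

```python
import math

def make_preference_pair(prompt, completion_a, dev_a, completion_b, dev_b):
    """Assumed rule: the completion with higher mean signed deviation is preferred."""
    if dev_a == dev_b:
        return None  # no preference signal
    if dev_a > dev_b:
        chosen, rejected = completion_a, completion_b
    else:
        chosen, rejected = completion_b, completion_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def dpo_loss(policy_chosen_lp, policy_rejected_lp, ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Inputs are summed log-probabilities of each completion under the policy
    and under the frozen reference model.
    """
    margin = beta * ((policy_chosen_lp - ref_chosen_lp)
                     - (policy_rejected_lp - ref_rejected_lp))
    return math.log1p(math.exp(-margin))

pair = make_preference_pair("Explain KL divergence.", "A", 0.21, "B", -0.34)
at_init = dpo_loss(-10.0, -10.0, -10.0, -10.0)    # policy equals reference
after_step = dpo_loss(-9.0, -12.0, -10.0, -10.0)  # policy now favors the chosen completion
print(pair["chosen"], f"{at_init:.4f}", f"{after_step:.4f}")
```

The reference log-probabilities in the margin tie the policy to the reference model, so the loss rewards preference for the chosen completion only relative to that baseline.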

Companion · EA-SEI-FW15-MANIFESTO v1.0

Framework 15 — Institutional Background

The original institutional manifesto that organized the four pre-registered protocols into a single Framework 15 module within the Crimson Hexagonal Archive. Preserved at its stable DOI as part of the program's record. Its institutional vocabulary is not required to engage the audit or the protocols; the audit is the externally-evaluable canonical entry.

§III — Roadmap

What the audit calls for next.

The audit specifies a budgeted twelve-month research roadmap, prioritizing operationalization-stability work and the cheapest dangerous experimental tests over additional theoretical extensions. Each milestone is independent of the institutional architecture and produces results an external researcher can evaluate without context.

| Horizon | Milestone | Compute Budget |
|---|---|---|
| This week | The cheapest dangerous test: pre-registered slop signature (P1) on GPT-wiki-intro and HC3 corpora against frozen Llama-3.1-8B-Instruct logits. Single A100-hour. Result reportable as a short deposit regardless of outcome. | \$50–\$100 |
| This month | Operationalization-stability paper: benchmark of $N \approx 50$ interventions measured under F1 and F2 in parallel; rank-correlation between operationalizations reported. Converts the program from speculative to grounded. | \$200–\$500 |
| This quarter | The MM-02 v2.0 retrieval-basin protocol day-0 launch. Begin the 90-day measurement window with the instrumentation controls documented in the audit (parallel control surfaces, periodic recalibration, explicit version logging). | \$1,500–\$3,000 |
| This year | The decomposed deviation-optimized training experiment (six conditions, isolating component contributions). Contingent on the headline result of MM-AI-02 v2.0's three-condition design being significant. | \$12,000–\$15,000 |

Total budgeted empirical work for the next twelve months: approximately \$14,000–\$19,000. The constraint is not budget; the constraint is the program's discipline in resisting theoretical extension until empirical grounding catches up.

Background commitment. Each major future deposit in the program should be sent to at least one external reviewer in a directly relevant subfield (alignment, causal inference, computational linguistics, information theory) prior to formal publication. Reviewers should be selected for willingness to write damaging-if-warranted critiques, not for alignment with the program's commitments. A discipline becomes real when it survives hostile compression.

§IV — Full Deposit Register

The eight DOI-anchored materials of Framework 15.

All Framework 15 materials are deposited on Zenodo within the crimsonhexagonal community. The canonical entry is EA-GLAS-02.

| Role | Deposit | DOI |
|---|---|---|
| Canonical Object | EA-GLAS-02 v1.0: Measuring Semantic Deviation (white paper) · Nobel Glas | 10.5281/zenodo.20271783 |
| Predecessor Audit | EA-GLAS-01 v1.0: Audited Claims (the Glas Function) · Nobel Glas | 10.5281/zenodo.20259297 |
| Founding | EA-SEI-MM-01 v0.2 Final: The Semantic Deviation Principle · Lee Sharks | 10.5281/zenodo.20250736 |
| Re-edition | EA-SEI-MM-01 v2.0: Operational Re-Edition · Sharks + Glas | 10.5281/zenodo.20252584 |
| Protocol | EA-SEI-MM-AI-01 v2.0: Closed-System Test Bed · Nobel Glas | 10.5281/zenodo.20251738 |
| Protocol | EA-SEI-MM-02 v2.0: Retrieval Basin Protocol · Nobel Glas | 10.5281/zenodo.20251740 |
| Protocol | EA-SEI-MM-AI-02 v2.0: Deviation-Optimized LM · Nobel Glas | 10.5281/zenodo.20251742 |
| Background | EA-SEI-FW15-MANIFESTO v1.0: Institutional Framing · Nobel Glas | 10.5281/zenodo.20251736 |