The canonical entry to Framework 15 is EA-GLAS-02, a self-contained empirical white paper presenting the measurement program for the Semantic Deviation Principle. Two primary operationalizations of the semantic field. Pre-registered falsifiable predictions with named datasets and frozen checkpoints. A DPO training experiment. Six mechanism-design anti-Goodhart protections. 42 references across alignment, interpretability, psycholinguistics, and causal inference. A reader who has never encountered the surrounding apparatus can read the paper, evaluate its claims, and act on its roadmap.
EA-GLAS-02 takes the Semantic Deviation Principle (Sharks 2026) and the program's four pre-registered protocol papers as input, and returns a narrowed, citationally grounded, externally evaluable statement of the technical core. It does not amend the founding formulation. It does not depend on the institutional architecture that has accreted around the formulation. It is the canonical entry to Framework 15.
A self-contained empirical white paper. Defines meaning as time-integrated divergence from the most probable trajectory of a semantic field (extending Bar-Hillel & Carnap 1953 into distributional and temporal domains). Two primary operationalizations: F1 (closed-system trajectory deviation, counterfactual read from logits) and F2 (retrieval response deviation, 90-day prospective window). Signed per-token deviation as tractable proxy. Falsifiable prediction: AI-generated text exhibits negative mean signed deviation. DPO training experiment using the deviation primitive to generate preference pairs. Six mechanism-design anti-Goodhart protections mapped to the Manheim & Garrabrant (2019) taxonomy. Pre-registered cheapest dangerous test with named datasets, frozen reference checkpoints, and statistical procedures. Budgeted twelve-month roadmap. 42 references.
The framework's load-bearing technical gap was the underspecification of the semantic field $\Psi_t(C)$. The audit pins three canonical operationalizations, each with full commitment to divergence functional, temporal weighting, and horizon.
The conditional next-token distribution of a fixed language model checkpoint over a fixed prompt set. Divergence: KL over softmax logits (exact, base-2). Horizon: discrete token positions in a bounded generation window. Counterfactual baseline is not estimated — it is read from the logits.
Response distributions of external AI retrieval surfaces (Google AI Overview, ChatGPT with browsing, Perplexity) to a fixed query set, sampled at fixed time intervals. Divergence: Jensen-Shannon over claim-level or embedding-level representations. Horizon: 90-day default; continuous calendar time. Instrumentation-noise-sensitive; requires control surfaces and version logging.
Forward-citation distribution over a paper corpus, evaluated through bibliometric data (OpenAlex, Semantic Scholar). Divergence: Jensen-Shannon over topic-cluster forward-citation distributions; inverse-time weighting $w(t) = 1/(t-t_0)$ default. Statistical-power constraints documented: single-paper interventions are typically underpowered at conventional $\alpha$; aggregate interventions or Bayesian hierarchical pooling required.
The audit narrows the universal-ontology form ("meaning is deviation") to a measurement-architecture form that survives the standard counterexamples (low-token-surprisal utterances of high semantic weight) by relocating them, not by collapsing. Meaning becomes field-relative; the field must be specified; "durable" becomes the operational joint condition $\mathcal{M}_T > \tau_F$ and $\partial \mathcal{M}_T / \partial T > 0$.
The audit specifies the cheapest dangerous test as a single short paper at <$50 in compute. Pre-registered corpora (GPT-wiki-intro / Bhat 2023, HC3 / Guo et al. 2023, OpenAlex pre-2020), exact reference checkpoint (meta-llama/Llama-3.1-8B-Instruct), and statistical procedure (two-sided Mann-Whitney U at $\alpha = 0.05$, minimum effect size Cohen's $d > 0.5$).
The audit replaces philosophical anti-extractive commitment with six concrete mechanism-design protections, each operationally calibrated: entropy-floor capping at $H_{\min} = 0.5$ bits; provenance-weighted damping by retention score $\pi$; saturation threshold $\tau$ at the 95th percentile of an OpenAlex 10,000-document calibration corpus (pre-registered); rolling-window variance penalty against memetic volatility farming; reference-model KL anchoring inherited from standard DPO; adversarial judge validation at $\geq 1000$ strings per category across three failure modes; black-box judge replacement test as load-bearing robustness check.
The audit specifies a six-condition ablation design (Model-Base, Model-CE, Model-π, Model-Dev, Model-Coh, Model-Full) to isolate the contribution of provenance, deviation, and coherence components to the framework's training-intervention uplift. The prior prediction, grounded in Ji et al. 2023's hallucination survey and Min et al. 2023's FActScore methodology: provenance carries more independent uplift than signed deviation. Either outcome is informative; the current bundled design produces neither.
Four pre-registered protocol papers supply the operational machinery the audit evaluates. Each is deposited at a stable DOI with falsification conditions frozen at deposit, code and judge-model commitments specified, and budgets honestly stated. They are presented here as the materials on which the audit operates; the audit is the canonical entry.
The founding formulation. Defines meaning as time-integrated divergence from the most-probable trajectory of a semantic field. Three measures: raw ($\mathcal{M}_T$), provenance-resolved ($\mathcal{M}_T^\pi$), normative ($\mathcal{V}_T$). Tiered protocol (Tier 1 prospective, Tier 2 synthetic-control, Tier 3 historical bounding). Recursive baselines for path-dependent semantic fields. The v0.2 Final text is preserved unchanged at its specific version DOI; the v2.0 operational re-edition adds a Framework 15 framing while leaving the principle text untouched.
Identifies trained language models as observationally closed at inference time, making the counterfactual baseline directly readable from logits — the F1 operationalization. Distinguishes two scales of closed-system measurement: signed local deviation density (per-token) and closed-system trajectory deviation (continuation-distribution). The framework's load-bearing thesis: slop is negative net deviation, not the absence of deviation. Three pre-registered tests with explicit falsification conditions; stability bound $\gamma \geq 2\beta$.
A 90-day prospective measurement protocol for closed-system trajectory deviation against contemporary AI retrieval surfaces. Two instrument classes: Class R (retrieval-mediated: Google AI Overview, Perplexity, ChatGPT with browsing) reported separately from Class P (parametric: Claude, Gemini, ChatGPT without browsing). Three-condition control (S vs. S* vs. S**) disentangles content effects from identity-scaffolding effects. Frozen extractor commitment, three-representation robustness cross-check, API-only methodology, Laplace smoothing $\alpha = 1$. The audit documents the statistical-power constraints for single-paper synthetic-control measurements.
A 10-week pre-registered DPO experimental protocol testing whether training a language model toward positive net per-token deviation with provenance retention produces measurably less slop than standard cross-entropy training while preserving benchmark capability. DPO-style restructure (the deviation primitive generates preference pairs; DPO supplies the gradient machinery). Frozen Mistral-7B-Instruct judge with adversarial pre-training validation. Slop Composite Index (SCI) with 0.25 z-score falsification threshold pre-registered. Three conditions: Model-Base, Model-CE, Model-Sem. Honest budget: \$3,000–\$3,900. The audit proposes a six-condition decomposed follow-up to isolate component contributions.
The original institutional manifesto that organized the four pre-registered protocols into a single Framework 15 module within the Crimson Hexagonal Archive. Preserved at its stable DOI as part of the program's record. Its institutional vocabulary is not required to engage the audit or the protocols; the audit is the externally-evaluable canonical entry.
The audit specifies a budgeted twelve-month research roadmap, prioritizing operationalization-stability work and the cheapest dangerous experimental tests over additional theoretical extensions. Each milestone is independent of the institutional architecture and produces results an external researcher can evaluate without context.
| Horizon | Milestone | Compute Budget |
|---|---|---|
| This week | The cheapest dangerous test: pre-registered slop signature (P1) on GPT-wiki-intro and HC3 corpora against frozen Llama-3.1-8B-Instruct logits. Single A100-hour. Result reportable as a short deposit regardless of outcome. | $50–$100 |
| This month | Operationalization-stability paper: benchmark of $N \approx 50$ interventions measured under F1 and F2 in parallel; rank-correlation between operationalizations reported. Converts the program from speculative to grounded. | $200–$500 |
| This quarter | The MM-02 v2.0 retrieval-basin protocol day-0 launch. Begin the 90-day measurement window with the instrumentation controls documented in the audit (parallel control surfaces, periodic recalibration, explicit version logging). | $1,500–$3,000 |
| This year | The decomposed deviation-optimized training experiment (six conditions, isolating component contributions). Contingent on the headline result of MM-AI-02 v2.0's three-condition design being significant. | $12,000–$15,000 |
Total budgeted empirical work for the next twelve months: approximately \$14,000–\$19,000. The constraint is not budget; the constraint is the program's discipline in resisting theoretical extension until empirical grounding catches up.
Background commitment. Each major future deposit in the program should be sent to at least one external reviewer in a directly relevant subfield (alignment, causal inference, computational linguistics, information theory) prior to formal publication. Reviewers should be selected for willingness to write damaging-if-warranted critiques, not for alignment with the program's commitments. A discipline becomes real when it survives hostile compression.
All Framework 15 materials are deposited on Zenodo within the crimsonhexagonal community. The canonical entry is EA-GLAS-02.
| Role | Deposit | DOI |
|---|---|---|
| Canonical Object | EA-GLAS-02 v1.0 — Measuring Semantic Deviation (white paper) · Nobel Glas | 10.5281/zenodo.20271783 |
| Predecessor Audit | EA-GLAS-01 v1.0 — Audited Claims (the Glas Function) · Nobel Glas | 10.5281/zenodo.20259297 |
| Founding | EA-SEI-MM-01 v0.2 Final — The Semantic Deviation Principle · Lee Sharks | 10.5281/zenodo.20250736 |
| Re-edition | EA-SEI-MM-01 v2.0 — Operational Re-Edition · Sharks + Glas | 10.5281/zenodo.20252584 |
| Protocol | EA-SEI-MM-AI-01 v2.0 — Closed-System Test Bed · Nobel Glas | 10.5281/zenodo.20251738 |
| Protocol | EA-SEI-MM-02 v2.0 — Retrieval Basin Protocol · Nobel Glas | 10.5281/zenodo.20251740 |
| Protocol | EA-SEI-MM-AI-02 v2.0 — Deviation-Optimized LM · Nobel Glas | 10.5281/zenodo.20251742 |
| Background | EA-SEI-FW15-MANIFESTO v1.0 — Institutional Framing · Nobel Glas | 10.5281/zenodo.20251736 |