Coherence Scoring Framework - Sensitivity Analysis Study

Scientific Validation of a Universal Coherence Metric

Ring 2 — Canonical Grounding

Ring 3 — Framework Connections


What’s In This Directory

This directory contains the complete methodology, source code, and results for our sensitivity analysis of the coherence scoring framework.

Files

| File | Description |
| --- | --- |
| METHODOLOGY_SENSITIVITY_ANALYSIS.md | Full methodology paper (READ THIS FIRST) |
| sensitivity_analyzer.py | Sensitivity test suite (ablation, topology, labels, adversarial) |
| score_moral_decay.py | Example application: scoring America’s moral decline |
| unified_scorer.py | Core coherence scoring engine |
| rubrics/ | YAML files defining all detection rules |
| results/ | Test outputs and reports |

Quick Start

1. Read the Methodology

Start with METHODOLOGY_SENSITIVITY_ANALYSIS.md to understand:

  • What we’re testing
  • Why we’re testing it this way
  • How to falsify the framework
  • Our approach, documented with full transparency

2. Run the Tests

cd [this_directory]
python sensitivity_analyzer.py

This will:

  • Test all 31 components (ablation)
  • Scramble graph topology
  • Swap theological labels for neutral ones
  • Run adversarial attacks
  • Generate full report

Output: sensitivity_report.txt

3. Score Your Own Document

from unified_scorer import UnifiedCoherenceScorer
 
scorer = UnifiedCoherenceScorer(rubrics_path="rubrics/")
result = scorer.score_document(your_text, "Document Name")
 
print(f"Coherence (chi): {result.chi:.2f}/10")
print(f"Confidence (kappa): {result.kappa:.2%}")
print(f"Robustness (rho): {result.rho:.2%}")

The Core Claim

Most frameworks tune parameters until they get the answer they want.

We don’t.

Instead, we test whether the structure itself is necessary:

  • Ablation: Remove components → does system degrade?
  • Topology: Change connections → does coherence collapse?
  • Labels: Swap names → does math still work?
  • Adversarial: Feed nonsense → does it correctly reject?

If the structure survives these tests, it’s not arbitrary.
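The ablation test above can be sketched in a few lines. This is a minimal illustration, not the actual `sensitivity_analyzer.py` implementation: the `score_with_components` callable and the 10% threshold are stand-ins for whatever scoring hook and criterion the real suite uses.

```python
# Hypothetical sketch of the ablation test: remove one component at a
# time, rescore, and record which removals degrade coherence (chi)
# beyond a threshold. The scorer callable is an illustrative stand-in.

def ablation_test(score_with_components, components, text, threshold=0.10):
    """Return (component, degradation) pairs that are load-bearing."""
    baseline = score_with_components(text, components)
    load_bearing = []
    for c in components:
        reduced = [x for x in components if x != c]
        chi = score_with_components(text, reduced)
        degradation = (baseline - chi) / baseline if baseline else 0.0
        if degradation >= threshold:
            load_bearing.append((c, degradation))
    return load_bearing
```

A component counts as load-bearing only if its removal measurably degrades the score; components whose removal changes nothing are candidates for redundancy.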


Key Features

1. NO Parameter Tuning

We do not adjust weights to fit data. Components are either:

  • Present (1.0)
  • Absent (0.0)

No middle ground. No curve-fitting.
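Binary presence detection can be sketched as follows. The indicator phrases here are invented for illustration; the real detection rules live in the YAML rubrics.

```python
# Minimal sketch of binary component detection: a component is either
# present (1.0) or absent (0.0), with no fractional weights to tune.
# The indicator phrases are illustrative, not the actual rubric rules.

def component_present(text, indicators):
    """Return 1.0 if any indicator phrase appears in the text, else 0.0."""
    lowered = text.lower()
    return 1.0 if any(phrase in lowered for phrase in indicators) else 0.0

score = component_present("Patience and kindness pervade the text.",
                          ["patience", "kindness"])
```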

2. Full Transparency

All detection rules are in human-readable YAML files:

  • fruit_matrix.yaml - 12 Fruits of the Spirit
  • variable_rubric.yaml - 10 Master Equation variables
  • constraint_rubric.yaml - 9 Universal constraints
  • defense_rubric.yaml - Evidence quality metrics

No hidden logic. No black boxes.
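Loading a rubric might look like the sketch below. The YAML schema shown (component name mapped to an indicator list) is an assumption made for illustration; consult the actual files under rubrics/ for the real structure.

```python
# Sketch of loading a human-readable rubric with PyYAML. The schema
# here is hypothetical; the real rubric files may differ.
import yaml

RUBRIC_YAML = """
patience:
  indicators: [patience, long-suffering, endurance]
kindness:
  indicators: [kindness, gentleness]
"""

def load_rubric(yaml_text):
    """Parse a rubric into {component: [indicator, ...]}."""
    data = yaml.safe_load(yaml_text)
    return {name: spec["indicators"] for name, spec in data.items()}

rubric = load_rubric(RUBRIC_YAML)
```

Because the rules are plain YAML, anyone can inspect, edit, or audit them without reading the scoring engine's source.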

3. Falsification-First

We define upfront what would prove us wrong:

  • If <5 components are load-bearing → Framework is redundant
  • If scrambled topology works equally → Structure is arbitrary
  • If neutral labels fail → Framework is semantic, not structural
  • If keyword spam scores high → Framework can be gamed

4. Open for Attack

We invite adversarial testing:

  • Try to game the system
  • Find counterexamples
  • Break the structure
  • Prove it’s no better than random

If you break it, we’ll fix it or admit failure.


Falsification Criteria

The framework FAILS if:

| Test | Failure Condition |
| --- | --- |
| Ablation | <5 components are load-bearing |
| Topology | Random graphs work equally well |
| Labels | Neutral labels cause >10% degradation |
| Adversarial | Keyword spam scores high |
| Null Hypothesis | Random functions perform equally |

The framework PASSES if:

| Test | Success Condition |
| --- | --- |
| Ablation | ≥10 components are load-bearing |
| Topology | Structure changes cause ≥15% degradation |
| Labels | Neutral labels cause <5% change |
| Adversarial | ≥66% of attacks correctly rejected |
| Null Hypothesis | Signal-to-noise ratio > 2.0 |
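The success conditions above can be checked mechanically. This sketch encodes them directly; the result dictionary keys are hypothetical and would need to match whatever fields `sensitivity_analyzer.py` actually emits.

```python
# Sketch encoding the five PASS conditions from the table above.
# The keys of `results` are illustrative, not the real report schema.

def framework_passes(results):
    """Apply all five success conditions; return (passed, failed_tests)."""
    checks = {
        "ablation": results["load_bearing_count"] >= 10,
        "topology": results["topology_degradation"] >= 0.15,
        "labels": results["label_swap_change"] < 0.05,
        "adversarial": results["attack_rejection_rate"] >= 0.66,
        "null_hypothesis": results["signal_to_noise"] > 2.0,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (not failed, failed)
```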

Current Status: Testing in progress


Preliminary Results

Test Document

Sample text describing Master Equation framework (~400 words)

Findings

Ablation (Component Necessity):

  • Load-bearing components: 0/31 (0%)
  • Interpretation: Either the framework is too robust OR the test document is too simple

Topology (Structure Sensitivity):

  • Scrambled mappings: Δχ = +0.06 (+1.2%)
  • Interpretation: The topology change did NOT degrade the score
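For reference, the relative change reported above is the scrambled score minus the baseline, divided by the baseline. The numbers in this sketch are illustrative, not our measured values:

```python
# Sketch of the relative delta-chi computation used to report
# topology sensitivity. Input scores here are illustrative.

def relative_delta_chi(baseline_chi, scrambled_chi):
    """Fractional change in coherence after scrambling the topology."""
    return (scrambled_chi - baseline_chi) / baseline_chi

# A positive result means scrambling did not degrade the score,
# which is the concerning outcome noted in this section.
change = relative_delta_chi(5.0, 5.06)
```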

Label Independence: (In progress)

Adversarial Resistance: (In progress)

Concern

Framework may be TOO INSENSITIVE to structural changes.

Possible Causes:

  1. Test document is too simple
  2. Thresholds (10%, 15%, 5%) are too strict
  3. Framework genuinely is too robust

Next Steps:

  1. Test on diverse corpus (100+ documents)
  2. Include high-coherence documents (physics papers) and low-coherence documents (word salad)
  3. Calibrate thresholds against human expert ratings

Comparison to Other Frameworks

| Feature | Our Framework | Typical Frameworks |
| --- | --- | --- |
| Parameter Tuning | NONE | Extensive |
| Structural Tests | YES | Rare |
| Open Source | YES | Rare |
| Falsifiable | YES | Difficult |
| Adversarial Invited | YES | No |

Key Differentiator: We test structure, not fit.


How to Replicate

System Requirements

  • Python 3.9+
  • numpy and pyyaml (pathlib ships with the standard library)
  • Any modern PC (no GPU needed)

Installation

pip install numpy pyyaml

Run Full Test Suite

python sensitivity_analyzer.py > report.txt

Expected Runtime

  • Ablation tests: ~2 minutes
  • Topology tests: ~1 minute
  • Label swap: ~30 seconds
  • Adversarial: ~1 minute
  • Total: ~5 minutes

Outputs

  • sensitivity_report.txt - Human-readable results
  • sensitivity_analysis_report.json - Machine-readable summary
  • Console output showing test progress

Invitation for Adversarial Testing

We Want You to Break This

Seriously. Try to:

  1. Game the system: Create documents that score high but are nonsense
  2. Find label dependence: Show that neutral labels break the math
  3. Prove structure doesn’t matter: Show random frameworks work equally
  4. Demonstrate curve-fitting: Prove we’re just pattern-matching
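As a starting point for attack #1, a keyword-spam document takes only a few lines to generate. The vocabulary below is an illustrative guess at rubric-adjacent terms; feed the output through the scorer as shown in the Quick Start and check whether it is correctly rejected (it should score low).

```python
# Sketch of a keyword-spam attack: rubric-adjacent words with no
# coherent structure. The vocabulary is illustrative, not drawn from
# the actual rubric files.
import random

SPAM_VOCAB = ["love", "joy", "peace", "patience", "coherence",
              "entropy", "grace", "structure", "truth", "energy"]

def keyword_spam(n_words=400, seed=0):
    """Generate an incoherent document stuffed with scoring keywords."""
    rng = random.Random(seed)
    return " ".join(rng.choice(SPAM_VOCAB) for _ in range(n_words))

attack_doc = keyword_spam()
```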

How to Submit

  1. Create your attack document/test
  2. Run it through the scorer
  3. Document unexpected behavior
  4. Submit to: [contact info]

Future Bounty Program

We plan to reward successful attacks:

  • Prove framework is gameable → $[TBD]
  • Prove label dependence → $[TBD]
  • Prove no better than random → $[TBD]

Philosophical Foundation

Why This Matters

Most frameworks claim to be “scientific” but:

  • Tune parameters until they fit
  • Can’t be proven wrong
  • Hide their logic
  • Don’t invite attack

This is storytelling with equations, not science.

Our Standard

“A theory that cannot be falsified is not scientific.”
— Karl Popper

We define upfront what would prove us wrong, then invite the world to try.

If the framework survives, it earns credibility not through our claims, but through surviving attacks.


Status & Roadmap

Current Status: DRAFT

  • ✅ Methodology documented
  • ✅ Test suite implemented
  • ⏳ Full testing in progress
  • ⏳ Diverse corpus needed
  • ⏳ Expert validation needed

Next Milestones

Phase 1: Complete Testing (Week 1)

  • Run full test suite on 100+ documents
  • Collect preliminary results
  • Identify structural weaknesses

Phase 2: Expert Validation (Month 1)

  • Collect human expert ratings
  • Compare framework scores to expert scores
  • Calibrate thresholds

Phase 3: Adversarial Hardening (Month 2)

  • Invite attacks
  • Fix identified weaknesses
  • Retest

Phase 4: Public Release (Month 3)

  • Publish methodology paper
  • Release open-source code
  • Launch public dashboard

Citation

(To be formatted upon publication)


License

[To be determined - likely MIT or CC-BY for academic use]


Contact

[To be added]


Acknowledgments

This work was developed as part of the Theophysics project, exploring the intersection of physics, theology, and information theory.

We are grateful to all who contribute critiques, attacks, and adversarial tests. Your skepticism makes this stronger.


Last Updated: January 11, 2026
Version: 1.0 (DRAFT)
Repository: [To be added]

Canonical Hub: CANONICAL_INDEX