Coherence Scoring Framework - Sensitivity Analysis Study
Scientific Validation of a Universal Coherence Metric
What’s In This Directory
This directory contains the complete methodology, source code, and results for our sensitivity analysis of the coherence scoring framework.
Files
| File | Description |
|---|---|
| METHODOLOGY_SENSITIVITY_ANALYSIS.md | Full methodology paper (READ THIS FIRST) |
| sensitivity_analyzer.py | Sensitivity test suite (ablation, topology, labels, adversarial) |
| score_moral_decay.py | Example application: scoring America’s moral decline |
| unified_scorer.py | Core coherence scoring engine |
| rubrics/ | YAML files defining all detection rules |
| results/ | Test outputs and reports |
Quick Start
1. Read the Methodology
Start with METHODOLOGY_SENSITIVITY_ANALYSIS.md to understand:
- What we’re testing
- Why we’re testing it this way
- How to falsify the framework
- Full transparency into our approach
2. Run the Tests
```shell
cd [this_directory]
python sensitivity_analyzer.py
```

This will:
- Test all 31 components (ablation)
- Scramble graph topology
- Swap theological labels for neutral ones
- Run adversarial attacks
- Generate full report
Output: sensitivity_report.txt
3. Score Your Own Document
```python
from unified_scorer import UnifiedCoherenceScorer

scorer = UnifiedCoherenceScorer(rubrics_path="rubrics/")
result = scorer.score_document(your_text, "Document Name")

print(f"Coherence (chi): {result.chi:.2f}/10")
print(f"Confidence (kappa): {result.kappa:.2%}")
print(f"Robustness (rho): {result.rho:.2%}")
```

The Core Claim
Most frameworks tune parameters until they get the answer they want.
We don’t.
Instead, we test whether the structure itself is necessary:
- Ablation: Remove components → does system degrade?
- Topology: Change connections → does coherence collapse?
- Labels: Swap names → does math still work?
- Adversarial: Feed nonsense → does it correctly reject?
If the structure survives these tests, it’s not arbitrary.
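The ablation test above can be sketched as a loop that removes one component at a time and measures the relative score drop. The scoring function and the 10% load-bearing threshold below are illustrative assumptions, not the shipped implementation:

```python
def score_with_components(text, components):
    """Toy stand-in for the real scorer: the score is the fraction
    of enabled components whose keyword appears in the text."""
    lowered = text.lower()
    hits = sum(1 for c in components if c in lowered)
    return 10.0 * hits / max(len(components), 1)

def ablation_test(text, components, threshold=0.10):
    """Return the baseline score and the components whose removal
    drops the score by more than `threshold` (the load-bearing ones)."""
    baseline = score_with_components(text, components)
    load_bearing = []
    for comp in components:
        reduced = [c for c in components if c != comp]
        score = score_with_components(text, reduced)
        if baseline > 0 and (baseline - score) / baseline > threshold:
            load_bearing.append(comp)
    return baseline, load_bearing
```

A component counts as load-bearing only if its removal measurably hurts the whole, which is exactly what the 0/31 preliminary result below would fail to show.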
Key Features
1. NO Parameter Tuning
We do not adjust weights to fit data. Components are either:
- Present (1.0)
- Absent (0.0)
No middle ground. No curve-fitting.
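In code, binary weighting is a one-liner; this sketch assumes keyword-style components and is not the actual detection logic:

```python
def component_weights(text, components):
    """Binary scoring: each component is either present (1.0) or
    absent (0.0) in the document; no fractional or tuned weights."""
    lowered = text.lower()
    return {c: 1.0 if c in lowered else 0.0 for c in components}
```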
2. Full Transparency
All detection rules are in human-readable YAML files:
fruit_matrix.yaml- 12 Fruits of the Spiritvariable_rubric.yaml- 10 Master Equation variablesconstraint_rubric.yaml- 9 Universal constraintsdefense_rubric.yaml- Evidence quality metrics
No hidden logic. No black boxes.
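A rubric file of this kind can be loaded with pyyaml in a few lines. The field names and component entries below are illustrative, not the actual schema of fruit_matrix.yaml:

```python
import yaml  # third-party: pip install pyyaml

# Hypothetical rubric shaped like the YAML files described above.
RUBRIC_YAML = """
components:
  - name: love
    keywords: [love, charity]
  - name: joy
    keywords: [joy, gladness]
"""

rubric = yaml.safe_load(RUBRIC_YAML)
for comp in rubric["components"]:
    print(comp["name"], "->", comp["keywords"])
```

Because the rules live in data files rather than code, anyone can audit or modify them without reading the scorer.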
3. Falsification-First
We define upfront what would prove us wrong:
- If <5 components are load-bearing → Framework is redundant
- If scrambled topology works equally → Structure is arbitrary
- If neutral labels fail → Framework is semantic, not structural
- If keyword spam scores high → Framework can be gamed
4. Open for Attack
We invite adversarial testing:
- Try to game the system
- Find counterexamples
- Break the structure
- Prove it’s no better than random
If you break it, we’ll fix it or admit failure.
Falsification Criteria
The framework FAILS if:
| Test | Failure Condition |
|---|---|
| Ablation | <5 components are load-bearing |
| Topology | Random graphs work equally well |
| Labels | Neutral labels cause >10% degradation |
| Adversarial | Keyword spam scores high |
| Null Hypothesis | Random functions perform equally |
The framework PASSES if:
| Test | Success Condition |
|---|---|
| Ablation | ≥10 components are load-bearing |
| Topology | Structure changes cause ≥15% degradation |
| Labels | Neutral labels cause <5% change |
| Adversarial | ≥66% of attacks correctly rejected |
| Null Hypothesis | Signal-to-noise ratio > 2.0 |
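The PASS conditions in the table can be encoded directly as a checklist; the result-dict keys here are assumed names for illustration, not the analyzer's real output fields:

```python
def framework_passes(results):
    """Check the PASS conditions from the table above.
    The keys of `results` are hypothetical field names."""
    return {
        "ablation": results["load_bearing_count"] >= 10,
        "topology": results["topology_degradation"] >= 0.15,
        "labels": results["label_swap_change"] < 0.05,
        "adversarial": results["attack_rejection_rate"] >= 0.66,
        "null_hypothesis": results["signal_to_noise"] > 2.0,
    }
```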
Current Status: Testing in progress
Preliminary Results
Test Document
Sample text describing Master Equation framework (~400 words)
Findings
Ablation (Component Necessity):
- Load-bearing components: 0/31 (0%)
- Interpretation: either the framework is too robust or the test document is too simple
Topology (Structure Sensitivity):
- Scrambled mappings: Δχ = +0.06 (+1.2%)
- Interpretation: the topology change did NOT degrade the score
Label Independence: (In progress)
Adversarial Resistance: (In progress)
Concern
Framework may be TOO INSENSITIVE to structural changes.
Possible Causes:
- Test document is too simple
- Thresholds (10%, 15%, 5%) are too strict
- Framework genuinely is too robust
Next Steps:
- Test on diverse corpus (100+ documents)
- Include high-coherence (physics papers) and low-coherence (word salad)
- Calibrate thresholds against human expert ratings
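Calibration against expert ratings can start with a rank correlation between framework scores and human scores. This sketch uses a plain-Python Spearman rho (no tie handling, no SciPy dependency), which is enough for a first pass:

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation between two equal-length score
    lists, assuming no tied values."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

A high rho between framework scores and expert ratings would justify the thresholds; a low one would support the "too insensitive" concern above.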
Comparison to Other Frameworks
| Feature | Our Framework | Typical Frameworks |
|---|---|---|
| Parameter Tuning | NONE | Extensive |
| Structural Tests | YES | Rare |
| Open Source | YES | Rare |
| Falsifiable | YES | Difficult |
| Adversarial Invited | YES | No |
Key Differentiator: We test structure, not fit.
How to Replicate
System Requirements
- Python 3.9+
- numpy and pyyaml (pathlib ships with the standard library)
- Any modern PC (no GPU needed)
Installation
```shell
pip install numpy pyyaml
```

Run Full Test Suite
```shell
python sensitivity_analyzer.py > report.txt
```

Expected Runtime
- Ablation tests: ~2 minutes
- Topology tests: ~1 minute
- Label swap: ~30 seconds
- Adversarial: ~1 minute
- Total: ~5 minutes
Outputs
- sensitivity_report.txt - Human-readable results
- sensitivity_analysis_report.json - Machine-readable summary
- Console output showing test progress
Invitation for Adversarial Testing
We Want You to Break This
Seriously. Try to:
- Game the system: Create documents that score high but are nonsense
- Find label dependence: Show that neutral labels break the math
- Prove structure doesn’t matter: Show random frameworks work equally
- Demonstrate curve-fitting: Prove we’re just pattern-matching
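One concrete way to attempt the first attack: generate a document that is dense in rubric keywords but carries no coherent argument. If such spam scores high, the adversarial criterion above fails. The term list here is made up for illustration:

```python
import random

def keyword_spam(terms, n_sentences=5, seed=0):
    """Build grammatical nonsense stuffed with rubric keywords:
    each sentence strings together randomly chosen terms."""
    rng = random.Random(seed)
    sentences = []
    for _ in range(n_sentences):
        picks = rng.sample(terms, k=min(3, len(terms)))
        sentences.append("The " + " of ".join(picks) + " converges.")
    return " ".join(sentences)

attack = keyword_spam(["coherence", "entropy", "grace", "topology"])
print(attack)
```

Feeding `attack` to the scorer and recording its chi value is a complete, reproducible attack submission.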
How to Submit
- Create your attack document/test
- Run it through the scorer
- Document unexpected behavior
- Submit to: [contact info]
Future Bounty Program
We plan to reward successful attacks:
- Prove framework is gameable → $[TBD]
- Prove label dependence → $[TBD]
- Prove no better than random → $[TBD]
Philosophical Foundation
Why This Matters
Most frameworks claim to be “scientific” but:
- Tune parameters until they fit
- Can’t be proven wrong
- Hide their logic
- Don’t invite attack
This is storytelling with equations, not science.
Our Standard
“A theory that cannot be falsified is not scientific.”
— Karl Popper
We define upfront what would prove us wrong, then invite the world to try.
If the framework survives, it earns credibility not through our claims, but through surviving attacks.
Status & Roadmap
Current Status: DRAFT
- ✅ Methodology documented
- ✅ Test suite implemented
- ⏳ Full testing in progress
- ⏳ Diverse corpus needed
- ⏳ Expert validation needed
Next Milestones
Phase 1: Complete Testing (Week 1)
- Run full test suite on 100+ documents
- Collect preliminary results
- Identify structural weaknesses
Phase 2: Expert Validation (Month 1)
- Collect human expert ratings
- Compare framework scores to expert scores
- Calibrate thresholds
Phase 3: Adversarial Hardening (Month 2)
- Invite attacks
- Fix identified weaknesses
- Retest
Phase 4: Public Release (Month 3)
- Publish methodology paper
- Release open-source code
- Launch public dashboard
Citation
(To be formatted upon publication)
License
[To be determined - likely MIT or CC-BY for academic use]
Contact
[To be added]
Acknowledgments
This work was developed as part of the Theophysics project, exploring the intersection of physics, theology, and information theory.
We are grateful to all who contribute critiques, attacks, and adversarial tests. Your skepticism makes this stronger.
Last Updated: January 11, 2026
Version: 1.0 (DRAFT)
Repository: [To be added]
Canonical Hub: CANONICAL_INDEX