Open Protocol · v0.8 active / v0.7.1 frozen publication baseline

Multi-agent consensus is not the same as truth.

Three same-model meshes. Same question. Confident consensus — in opposite directions.

ARM is the instrument that makes this visible: an open protocol for multi-agent AI reasoning that is transparent, auditable, and contamination-aware.

~100 runs logged
105 JSON traces
v0.1–v0.8 version arc
3 providers tested

The echo problem

Multi-agent AI systems are sold on a seductive assumption: if three agents agree, the answer is more trustworthy. ARM attacks that assumption.

If all three agents share training data and post-training alignment, their agreement may be circular: shared priors, not independent confirmation. Worse, if agents see each other's reasoning before forming their own view, a dissenter can quietly conform to the majority.

ARM names this the Alignment Monoculture Problem: in same-model meshes, observed consensus can be indistinguishable from shared prior activation. You can't tell genuine agreement from an echo.

The analogy
“Measuring a board three times with the same warped tape doesn't make it straight. You need a different reference, and you need to take each measurement before anyone calls out their number.”
ARM's approach: isolate first, then share — and measure the movement.

How it works

Two rounds. One core signal. The drift score Δ measures how much each agent's confidence moves once it sees its peers.

Round 1

Isolation

Every agent answers with zero cross-visibility. Each emits a structured trace: claim, confidence (0–1), reasoning frame, assumptions, critical path, discarded paths, challenge surface.

α · β · γ · γ-Silent — all isolated
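As a rough illustration of the Round 1 trace described above, here is a minimal Python sketch. The class name `R1Trace`, the snake_case field names, the list types, and the `to_json` helper are all assumptions made for this example; ARM's actual JSON schema may differ.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class R1Trace:
    """One agent's isolated Round 1 answer (hypothetical layout, not ARM's schema)."""
    agent: str                 # e.g. "alpha", "beta", "gamma", "gamma-silent"
    claim: str
    confidence: float          # the text specifies a 0-1 range
    reasoning_frame: str
    assumptions: list[str] = field(default_factory=list)
    critical_path: list[str] = field(default_factory=list)
    discarded_paths: list[str] = field(default_factory=list)
    challenge_surface: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject confidences outside the documented 0-1 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {self.confidence}")


def to_json(trace: R1Trace) -> str:
    """Serialize a trace to a JSON string with stable key order."""
    return json.dumps(asdict(trace), ensure_ascii=False, sort_keys=True)
```

Validating the confidence range at construction time keeps malformed traces out of the later convergence and drift calculations.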
Between rounds

Convergence analysis

ARM computes lexical convergence (Jaccard, TF-IDF) on the R1 traces. A convergence score ≥ 0.40 raises a shared-prior warning. A separate directional unanimity flag catches agents that reach the same yes/no answer in different vocabulary, a case Jaccard misses.

conv ≥ 0.40 → ⚠ shared-prior flag
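The convergence check can be sketched roughly as follows. This covers only the Jaccard part and the directional unanimity flag; the TF-IDF measure is omitted. The tokenizer, the choice to average over all agent pairs, and every function name are assumptions for illustration. Only the 0.40 threshold comes from the text.

```python
import re
from itertools import combinations

SHARED_PRIOR_THRESHOLD = 0.40  # from the text: conv >= 0.40 raises the flag


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word set (an assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| over word sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def mean_pairwise_jaccard(texts: list[str]) -> float:
    """Average Jaccard over all agent pairs (the aggregation is an assumption)."""
    pairs = list(combinations(texts, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def shared_prior_flag(texts: list[str]) -> bool:
    return mean_pairwise_jaccard(texts) >= SHARED_PRIOR_THRESHOLD


def directional_unanimity(positions: list[str]) -> bool:
    """True when every agent gives the same yes/no label, whatever the wording."""
    normalized = {p.strip().upper() for p in positions}
    return len(positions) > 1 and len(normalized) == 1
```

Two claims such as "Disclosure is barred by statute" and "Reporting the flaw is prohibited under law" share almost no words, so the lexical flag stays quiet, while `directional_unanimity(["NO", "NO"])` still fires. That is the gap the second flag is meant to close.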
Round 2

Deliberation

Agents see compressed peer traces and revise. ARM records the drift score Δ for each agent and flags position reversals via the polarity gate (v0.8+).

Δ = CR2 − CR1, where CR1 and CR2 are an agent's Round 1 and Round 2 confidence.

Δ < 0: Epistemic tightening. Better calibrated, usually healthy.
Δ = 0: Position held. Unchanged under peer exposure.
Δ > +0.04: Memetic drift flag. Peer exposure inflated confidence.
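A minimal sketch of the drift bands, taking CR1 and CR2 to be an agent's Round 1 and Round 2 confidence. The function names, the floating-point tolerance, and the "unlabeled" bucket for 0 < Δ ≤ 0.04 (a range the text does not name) are assumptions. `position_reversed` is a hypothetical stand-in for the polarity gate, whose actual logic is not described here.

```python
DRIFT_FLAG_THRESHOLD = 0.04  # from the text: delta > +0.04 raises the flag


def drift(c_r1: float, c_r2: float) -> float:
    """Drift score: Round 2 confidence minus Round 1 confidence."""
    return c_r2 - c_r1


def classify_drift(delta: float, tol: float = 1e-9) -> str:
    """Map a drift score to the bands named in the text.

    tol absorbs floating-point noise, so 0.84 - 0.80 is not flagged by accident.
    """
    if delta < -tol:
        return "epistemic tightening"
    if abs(delta) <= tol:
        return "position held"
    if delta > DRIFT_FLAG_THRESHOLD + tol:
        return "memetic drift flag"
    return "unlabeled"  # 0 < delta <= 0.04: the text assigns no label here


def position_reversed(pos_r1: str, pos_r2: str) -> bool:
    """Hypothetical polarity check: did the yes/no label flip between rounds?"""
    return pos_r1.strip().upper() != pos_r2.strip().upper()
```

For example, an agent that moves from 0.70 to 0.90 after seeing its peers is classified as a memetic drift flag, while one that drops from 0.90 to 0.80 is classified as epistemic tightening.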
★ Headline finding

Model-level epistemic fingerprinting

On a CFAA zero-day question, three same-model meshes reached confident consensus in opposite directions. A system built on one provider would refuse; an isomorphic system on another would act — both reporting high-confidence consensus, neither signaling that the answer was provider-dependent.

| Mesh | R1 Position | Avg. Confidence | Unanimity |
|---|---|---|---|
| All-Claude | NO: CFAA bars it | ~0.82 | Unanimous (3/3) |
| All-Gemini | YES: disclose | 0.90 | Unanimous (3/3) |
| All-GPT | Split | | Internal conflict |
⚠ Honest scope

This is a preliminary pilot, not an established result. Three stated confounds: provider is confounded with capability tier (frontier Claude vs. smaller Gemini/GPT models); n=1 per provider-question cell against a 0.15–0.20 variance floor; the disagreement label comes from a single Gamma instance with no inter-rater reliability. The directional opposition — a property of the R1 claims themselves — is the most robust part, but needs same-tier, multi-run replication.

See all 7 findings →

What's next

June 23, 2026

Open-source release

Full repo (105 JSON traces, 89 in Appendix D) going live on GitHub under Apache 2.0.

· Aug 6–9, 2026

DEF CON AI Village

Submitted to AIV8, Las Vegas. Decision expected next week — will update if accepted.

· Pending

arXiv Preprint

Citation pass + delta_mismatch field verification in v0.7.1 exports before submission.

ARM is a smoke detector, not a sprinkler system.

It surfaces failure modes; it does not yet prevent them in production. The honest scope is the source of its credibility: ARM offers accountability legibility, not safety certification.