The Engine, Epistamate

Six co-present properties

Removing any one degrades the system to
something existing tools already do.

Typed Claim Extraction

Claims are structured objects (text, status, computed confidence, credibility tier, citations, evidence type), not free-text summaries. Each claim is individually addressable: it can be verified, contested, weakened, or carried forward independently.
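As a rough illustration, a typed claim could look like the Python sketch below. The fields follow the list above; the `Claim` class, the `claim_id` field, the tier numbering, and every `Status` value other than VERIFIED are assumptions made for the example, not the engine's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Only VERIFIED is named in the text; the other values are assumed.
    UNVERIFIED = "unverified"
    VERIFIED = "verified"
    CONTESTED = "contested"
    WEAKENED = "weakened"


@dataclass
class Claim:
    claim_id: str              # hypothetical stable identifier
    text: str
    evidence_type: str         # vocabulary is domain-specific (assumed values)
    credibility_tier: int      # assumed convention: 1 = most trusted
    citations: list[str] = field(default_factory=list)
    status: Status = Status.UNVERIFIED
    confidence: float = 5.0    # computed elsewhere, never self-reported


claims = {
    "c1": Claim("c1", "Policy X reduced wait times.", "empirical study", 1),
    "c2": Claim("c2", "Policy X is widely popular.", "survey", 3),
}

# Each claim is addressable on its own: contesting c2 leaves c1 untouched.
claims["c2"].status = Status.CONTESTED
```

Because every claim is its own object, a status change touches exactly one record rather than a paragraph of summary text.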

Multi-Factor Evidence Confidence

Confidence is computed from a deterministic formula (source tier, consensus across providers, adversarial challenge outcome, evidence recency), not LLM self-report. Range [5, 95]. LLM-reported confidence is recorded in audit metadata but never used in the score.
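The shape of such a formula can be sketched as follows. Only the four inputs and the [5, 95] range come from the text above; the point values, the weights, the five-year recency decay, and the function names are hypothetical placeholders chosen to make the example runnable.

```python
# Hypothetical points per source tier (1 = most trusted).
TIER_POINTS = {1: 40.0, 2: 30.0, 3: 15.0, 4: 5.0}


def clamp(value: float, low: float = 5.0, high: float = 95.0) -> float:
    """Keep the score inside the documented [5, 95] range."""
    return max(low, min(high, value))


def compute_confidence(
    source_tier: int,
    consensus: float,          # share of providers in agreement, 0.0 to 1.0
    survived_challenge: bool,
    age_years: float,
) -> float:
    # Note what is absent: no LLM-reported confidence is accepted as input.
    score = TIER_POINTS[source_tier]
    score += 30.0 * consensus
    score += 20.0 if survived_challenge else -10.0
    score += max(0.0, 10.0 - 2.0 * age_years)   # recency bonus decays to zero
    return clamp(score)


strong = compute_confidence(1, 1.0, True, 0.0)    # raw 100.0, clamped to 95.0
weak = compute_confidence(4, 0.0, False, 10.0)    # raw -5.0, clamped to 5.0
middle = compute_confidence(2, 0.5, True, 2.5)    # 30 + 15 + 20 + 5 = 70.0
```

The same inputs always yield the same score, and the clamp means no claim is ever reported as certain or as impossible.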

Adversarial Challenge as First-Class Stage

Mandatory Phase 3 runs before synthesis. Claims that don't survive lose their Socratic bonus and are reassessed. Challenges are persisted as typed output, not discarded after scoring. The brief reflects what survived scrutiny, not what sounded best.
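A minimal sketch of that behaviour, under stated assumptions: the text names the bonus but not its size, so `SOCRATIC_BONUS = 10.0` is invented, as are the `Challenge` and `ScoredClaim` types and the rule that a claim must survive every recorded challenge to keep the bonus.

```python
from dataclasses import dataclass, field

SOCRATIC_BONUS = 10.0   # hypothetical value


@dataclass
class Challenge:
    claim_id: str
    objection: str
    survived: bool


@dataclass
class ScoredClaim:
    claim_id: str
    base_score: float
    challenges: list[Challenge] = field(default_factory=list)

    @property
    def score(self) -> float:
        # Bonus is earned only if challenges were run and all were survived.
        earned = bool(self.challenges) and all(c.survived for c in self.challenges)
        return self.base_score + (SOCRATIC_BONUS if earned else 0.0)


claim = ScoredClaim("c1", base_score=60.0)
claim.challenges.append(Challenge("c1", "Sample excludes rural sites", survived=True))
score_after_pass = claim.score

claim.challenges.append(Challenge("c1", "Effect vanishes after 2021", survived=False))
score_after_fail = claim.score
```

Both challenges stay attached to the claim after scoring, so a reader can see which objection removed the bonus.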

Gap Tracking as Typed Persistent Output

Knowledge gaps are first-class objects with importance ratings. They accumulate across sessions and narrow as evidence arrives. The reader knows where the brief stops being reliable, not as a disclaimer, but as a structured finding.
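One way to model a gap as a persistent object is sketched below. The `Gap` and `GapRegister` names, the 1 to 5 importance scale, and the session counters are illustrative assumptions rather than the engine's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Gap:
    gap_id: str
    question: str
    importance: int                       # assumed scale: 1 (minor) to 5 (critical)
    opened_in_session: int
    closed_in_session: Optional[int] = None

    @property
    def is_open(self) -> bool:
        return self.closed_in_session is None


@dataclass
class GapRegister:
    gaps: dict[str, Gap] = field(default_factory=dict)

    def record(self, gap: Gap) -> None:
        # Keep the earliest record so a gap's history is not overwritten.
        self.gaps.setdefault(gap.gap_id, gap)

    def close(self, gap_id: str, session: int) -> None:
        self.gaps[gap_id].closed_in_session = session

    def open_gaps(self) -> list[Gap]:
        # Most important open gaps first.
        return sorted(
            (g for g in self.gaps.values() if g.is_open),
            key=lambda g: -g.importance,
        )


register = GapRegister()
register.record(Gap("g1", "No post-2020 data on outcome A", 5, opened_in_session=1))
register.record(Gap("g2", "Single-country sample", 2, opened_in_session=1))
register.close("g2", session=2)   # evidence arrived in session 2
```

Closed gaps remain in the register with the session that closed them, so the narrowing of uncertainty is itself on record.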

Cross-Session Compounding of Evidential State

VERIFIED claims from session N reduce re-verification burden in session N+1. The knowledge graph accumulates with use. Contradictions between sessions are preserved, not silently resolved. Session five builds on sessions one through four.
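The carry-forward logic might be sketched like this. The `Graph` type and `ingest` function are hypothetical, claims are reduced to key and value strings, and the sketch skips re-verification entirely for matching claims, which is a simplification of "reduce re-verification burden".

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    verified: dict[str, str] = field(default_factory=dict)   # claim key -> asserted value
    contradictions: list[tuple[str, str, str]] = field(default_factory=list)


def ingest(graph: Graph, session_claims: dict[str, str]) -> list[str]:
    """Return the claim keys that still need verification this session."""
    to_verify = []
    for key, value in session_claims.items():
        known = graph.verified.get(key)
        if known is None:
            to_verify.append(key)            # new claim: full verification
        elif known == value:
            continue                         # already VERIFIED: carried forward
        else:
            # Conflict with an earlier session: keep both sides on record.
            graph.contradictions.append((key, known, value))
            to_verify.append(key)
    return to_verify


graph = Graph(verified={"rate_2023": "4.1%"})
first = ingest(graph, {"rate_2023": "4.1%", "rate_2024": "3.8%"})
second = ingest(graph, {"rate_2023": "5.0%"})
```

Here the first session only has to verify the new 2024 figure, while the conflicting 2023 figure in the second session is logged as a contradiction and the earlier verified value is left in place.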

Bidirectional Operation

Synthesis direction (Question → Brief) and Verification direction (Document → Decision Record) share the same graph, formula, and adversarial mechanism. Ingest an authoritative report; its claims enter the same evidence quality system as retrieved sources.

+ Domain Configurability

Source trust hierarchy, claim type vocabulary, scoring weights, and output format are runtime parameters, not prompts or hardcoded logic. The same binary runs policy research, investment due diligence, and regulatory compliance with no architectural change.
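In code, treating these as runtime parameters could look like the sketch below. The four fields mirror the parameters named above; the `DomainConfig` class, the specific tiers, claim types, and weights are made-up example values, and `tier_for` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfig:
    name: str
    source_tiers: dict[str, int]        # source category -> tier (1 = most trusted)
    claim_types: tuple[str, ...]
    scoring_weights: dict[str, float]
    output_format: str


POLICY = DomainConfig(
    name="Government & Policy",
    source_tiers={"official statistics": 1, "think-tank report": 2, "news article": 3},
    claim_types=("statistic", "causal", "forecast"),
    scoring_weights={"tier": 0.4, "consensus": 0.3, "challenge": 0.2, "recency": 0.1},
    output_format="policy brief",
)

INVESTMENT = DomainConfig(
    name="Investment Analysis",
    source_tiers={"regulatory filing": 1, "analyst note": 2, "company press release": 4},
    claim_types=("financial", "market", "risk"),
    scoring_weights={"tier": 0.5, "consensus": 0.2, "challenge": 0.2, "recency": 0.1},
    output_format="due diligence memo",
)


def tier_for(config: DomainConfig, source_category: str) -> int:
    # Unknown categories fall below the lowest configured tier instead of being trusted.
    lowest = max(config.source_tiers.values())
    return config.source_tiers.get(source_category, lowest + 1)
```

The same `tier_for` function serves both domains; only the configuration object passed in differs.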

Evidence posture

Not every source that supports a query
supports a finding.

The hardest problem in evidence-based research is not retrieval. It is the gap between what a source says and what a claim needs it to prove. A topically relevant source is not the same as a supporting source. A supporting source is not the same as an independently corroborating one. Most research tools collapse these distinctions. The engine does not.

Strict / Findings

Established. Reference-eligible.

Claims that have survived adversarial challenge, passed source credibility gates, and achieved independent corroboration from sources with no shared citation lineage. These are the claims the brief can stand behind. The gate is deliberately hard. Zero strict findings is a correct output when the evidence does not warrant them.

Qualified Evidence

Credible. Contextually useful. Not yet strict.

Evidence that is relevant and sourced but does not yet meet the bar for a formal finding. A single high-quality source. A source that partially supports the claim. A finding from a constrained study presented in a broader context. Surfaced explicitly so the analyst knows what it is and is not.

Explore / Leads

Signals. Directions. Not evidence.

Background material, metadata-only records, vendor claims, adjacent literature. Useful for orienting a research direction. Not eligible for the findings layer. The engine labels these explicitly rather than mixing them into the output and letting the analyst mistake noise for signal.

Fail-closed by design

When evidence does not meet admission standards, the engine fails closed. It does not lower the threshold to produce output. It does not present qualified evidence as a finding. It does not fabricate references to fill a gap. A brief that clearly separates what is established from what is provisional is more useful than one that presents everything with equal confidence. The gaps are typed, rated, and visible. They are part of the output, not an absence of it.
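The three layers and the fail-closed rule can be sketched as a single classification function. The gate conditions follow the descriptions above; the `Evidence` fields, the `Posture` names, and the threshold of two independent sources are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Posture(Enum):
    STRICT = "strict finding"
    QUALIFIED = "qualified evidence"
    EXPLORE = "explore lead"


@dataclass
class Evidence:
    claim: str
    credible_source: bool          # False for vendor claims, metadata-only records
    survived_challenge: bool
    passed_credibility_gate: bool
    independent_sources: int       # sources with no shared citation lineage


def classify(e: Evidence) -> Posture:
    if not e.credible_source:
        return Posture.EXPLORE
    if e.survived_challenge and e.passed_credibility_gate and e.independent_sources >= 2:
        return Posture.STRICT
    return Posture.QUALIFIED


def strict_findings(items: list[Evidence]) -> list[Evidence]:
    # Fail closed: the gate is never relaxed, so an empty list is a valid result.
    return [e for e in items if classify(e) is Posture.STRICT]


items = [
    Evidence("A", True, True, True, independent_sources=1),    # single good source
    Evidence("B", False, False, False, independent_sources=0), # vendor claim
]
findings = strict_findings(items)
```

With this input the findings list is empty, and nothing in the code path promotes item A to fill it.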

Where the field has struggled

The research community working on automated claim verification has converged on the same diagnosis: the bottleneck is not acquiring sources. It is determining what a fetched source actually says about a specific claim. Academic papers use different terminology from the query. They qualify findings. They describe methods alongside results. A system that treats topical relevance as evidential support will surface plausible-looking material that does not, on close reading, warrant the claim it appears to support. The engine is designed around this bottleneck, not around retrieval volume.

Where the pattern recurs

The same five problems appear
in very different rooms.

None of these domains currently has a tool that does what the engine does. Each is doing the equivalent work manually, in spreadsheets, committee documents, and institutional reports that nobody reads systematically three years later.

Humanitarian response

Coordinating claims under pressure, across organisations

Dozens of UN agencies produce simultaneous situation reports on the same crisis. The claims conflict. Nobody tracks which ones are established. OCHA's Humanitarian Needs Overviews are evidence synthesis documents built under time pressure with no structured memory between crises.

Gap tracking and contradiction detection are precisely what's missing. The Decision Log is what makes accountability to funding bodies possible.

// UN OCHA · Humanitarian Data Exchange

Information integrity

Investigating disinformation with traceable evidence chains

A claim circulates in 40 sources. All 40 trace back to one original. That's amplification, not corroboration, but standard tools can't tell the difference. Fact-checking organisations and digital forensics labs do structured evidence work that must itself be defensible.

The confidence formula's diversity weighting catches the difference between genuine cross-source consensus and coordinated amplification.
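The text does not specify how diversity weighting is computed. One plausible ingredient is counting distinct origins rather than raw sources, sketched here under the simplifying assumption that each source derives from at most one upstream source. The `derived_from` map and both function names are hypothetical.

```python
def root_origin(source: str, derived_from: dict[str, str]) -> str:
    """Follow the lineage upstream until a source with no recorded parent."""
    seen = set()
    # The seen-set guards against accidental cycles so the loop always ends.
    while source in derived_from and source not in seen:
        seen.add(source)
        source = derived_from[source]
    return source


def independent_origins(sources: list[str], derived_from: dict[str, str]) -> set[str]:
    return {root_origin(s, derived_from) for s in sources}


# 39 reposts of one wire report, plus the report itself: 40 sources in total.
derived_from = {f"repost-{i}": "wire-report" for i in range(1, 40)}
sources = ["wire-report"] + list(derived_from)

origins = independent_origins(sources, derived_from)
```

Forty sources collapse to a single origin, which is the difference between amplification and corroboration expressed as a count.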

// UNESCO · EU Code of Practice on Disinformation

Land rights & environment

Making contested evidence legible in high-stakes disputes

Government surveys, community testimonies, environmental studies, and corporate reports all exist in the same dispute. Contradictions between them are resolved by institutional power, not by evidence quality.

The engine provides a structured record of what is established, what is contested, and what gaps remain, making the epistemic state of a dispute legible to any party.

// UNEP · UN Special Rapporteur on Indigenous Rights

Treaty compliance

Accumulating verified findings across review cycles

Treaty bodies assess state compliance claims cycle after cycle. Previous findings sit in PDF reports. Each new cycle starts from near-zero institutional memory. States report; bodies assess. Currently this is manual evidence synthesis with no compounding knowledge.

The knowledge graph means review cycle five has a verified claim base to work from that cycle one did not. Contradictions become part of the structured record.

// UN Treaty Bodies · CTBTO · OPCW

Algorithmic accountability

Auditing AI systems with the rigour AI systems deserve

A company claims 95% accuracy. An independent study finds 60% on a specific demographic. Both claims exist in the public record. Neither is resolved; they just accumulate. Civil society and government auditors assessing AI harm need structured evidence work.

The contradiction detector and confidence scoring turn this from advocacy into a defensible record. Directly relevant to EU AI Act conformity assessment and NIST RMF.

// EU AI Act Art. 9–17 · NIST RMF · AlgorithmWatch

Regulated procurement

Justifying consequential decisions to auditors who weren't in the room

Government and defence procurement decisions are made over 18 months, across teams, based on claims that need to be traceable to source when the auditor arrives two years later. Capability, price, risk, compliance: each requires provenance, and each is subject to challenge.

The Decision Log captures the full evidence state at the moment of decision. That's not a compliance feature; it's the output of how the engine works by default.

// EU Public Procurement Directive · UNCITRAL Model Law

And one closer to home

Strategy & independent consulting

You already do this work. You just do it in documents that don't compound.

The best strategy research already works the way the engine works: individual claims are sourced and graded, contradictions between data points are noted, gaps in the evidence are named, and the final recommendation is honest about its confidence level. What it doesn't do is carry that structure forward to the next engagement, the next client, the next analyst who joins the team.

The structured brief a senior consultant produces for a board is a claim vault; it just doesn't look like one, and it evaporates when the project ends. The engine is what that process looks like when the institutional memory is preserved rather than PDF'd into an archive.

One engine, configured per domain

The claim extraction, confidence scoring, gap tracking, and decision logging are the same across every deployment. What changes is narrower than it looks: the source tier hierarchy that governs which publications and databases count as primary, the claim type vocabulary that structures what the engine is looking for, and the output artefact format that reflects how findings get used. Seven working modes are configured in the current build.

General Research

Investment Analysis

Government & Policy

Grants & Funding

RegWatch Compliance

Pharma & MedTech

Lineage & Overlap

Responsible AI in practice

This is what responsible AI looks like in practice.

The EU AI Act, UNESCO's AI Ethics Recommendation, and a growing number of national frameworks share one underlying requirement: AI used in high-stakes contexts must be explainable and traceable, not just technically, but epistemically. The question is not only what the system output, but what evidence it drew on, where that evidence conflicted, and what was uncertain when the decision was made.

Traceable by construction

Every claim carries its source tier, citations, and confidence derivation. Not a summary, but a structured assertion with provenance.

Uncertainty is named

Weak and contested findings surface explicitly. The brief reflects what the evidence supports, not what sounds most authoritative.

Decision state is immutable

When a decision is logged, the full evidence state is preserved at that moment: verified claims, contested claims, and acknowledged gaps. EU AI Act Article 12 record-keeping as a byproduct.
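A minimal sketch of an immutable decision record, assuming the evidence state can be serialised to JSON. The `DecisionRecord` type, the `log_decision` function, and the use of a SHA-256 digest for tamper evidence are illustrative choices, not a description of how the engine stores its Decision Log.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    decision: str
    evidence_state_json: str    # snapshot serialised at logging time
    digest: str                 # SHA-256 of the snapshot


def log_decision(decision: str, evidence_state: dict) -> DecisionRecord:
    # Serialising copies the state, so later edits to the live dict cannot leak in.
    snapshot = json.dumps(evidence_state, sort_keys=True)
    digest = hashlib.sha256(snapshot.encode("utf-8")).hexdigest()
    return DecisionRecord(decision, snapshot, digest)


state = {"verified": ["c1"], "contested": ["c2"], "gaps": ["g1"]}
record = log_decision("Proceed with option A", state)

# The live evidence state keeps evolving after the decision.
state["verified"].append("c9")

logged = json.loads(record.evidence_state_json)
```

The record still shows only what was known when the decision was taken, and the frozen dataclass rejects attempts to reassign its fields.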

Human review improves scores

Marking a claim as evidence-grounded raises its confidence. The system reflects researcher judgment, not just model output.
