Technical Whitepaper · May 2026

Auditable Emergence

Architecture of a Self-Healing Multi-Agent Financial Intelligence Platform with built-in SEC 17a-4 and SOC 2 Type II compliance.

May 3, 2026
10 pages · Version 1.0
~20 min read
Compliance Officers · CTOs · Institutional Investors
| Metric | Value |
|---|---|
| P99 Latency | 340ms |
| Specialist Agents | 201 |
| Error Catch Rate | 52% |
| Combined Success | 99.4% |
| Token Savings | 24.7% |
| Cost per Intent | $0.0108 |

Executive Summary

The Problem: AI agent frameworks are black boxes. Organizations deploying multi-agent systems face a fundamental governance crisis: no audit trail, no verifiable decision logic, no compliance path. When an AI agent makes a portfolio recommendation, processes a transaction, or approves a governance decision, there is no way to reconstruct what it did, why it did it, or whether it did it correctly. This opacity makes AI decision-making untenable for regulated institutions.

The Solution: Sturna's Galaxy Phase architecture delivers verifiable AI execution with SEC 17a-4 and SOC 2 Type II compliance built in. Every agent interaction is immutably logged. Every decision is traceable to underlying intent and reasoning. Every multi-agent collaboration is attributed and auditable. The system detects and rejects its own errors before they reach users—aggressive, automated quality gates that function as compliance infrastructure.

The Results at a glance:

| Metric | Value |
|---|---|
| P99 Routing Latency | 340ms (2.5× faster than LangGraph) |
| Agent Pool | 201 specialist agents competing via confidence bidding |
| Triple-Gate Catch Rate | 15.2%–52% of errors caught before shipping |
| Token Savings | 24.7% vs. baseline; $0.0108 per intent (40% cheaper) |
| First-Pass Success | 94.2% (99.4% combined with self-healing) |
| Compliance | SEC 17a-4 · SOC 2 Type II · EU AI Act · GDPR · NIST AI RMF |

Sturna isn't merely faster than traditional orchestration. It's fundamentally different—agents don't wait for routing logic, they compete. The best agent wins. The system learns from every execution. No DAGs. No static workflows. No dead code.

For finance, compliance, and regulated institutions, Sturna is the only multi-agent framework that satisfies institutional governance requirements. It's auditable, verifiable, and built for regulatory scrutiny.

Section 1: The Architecture — Seven Layers of Orchestration

The Galaxy Phase architecture is not a framework on top of LLMs. It's an orchestration operating system—seven interlocking layers that together guarantee verifiable, auditable, self-healing execution at institutional grade.

L1 — Intent Engine: Receives natural-language business questions, tags domain metadata, and classifies into 12 capability clusters. Deterministic and logged—the same intent always matches the same cluster.

L2 — Semantic KNN Router: Queries a vector database of 201 agents in 2–5ms and identifies the 8–12 best-matching specialists via K-nearest-neighbors. No LLM calls. No routing latency.

L3 — Multi-Objective Auction: Candidates bid simultaneously with a confidence score, predicted token cost, and structured reasoning. The best bid wins. All bids are logged—even losing ones, with their reasoning.

L4 — StarDAG Execution Engine: Enables parallel sub-task execution; multiple agents run concurrently for a single intent. The complete dependency graph is captured, timestamped, and attributed.

L5 — Triple-Gate Verification: Three automated quality gates run before any result reaches the user. Catch rate: 15.2%–52% depending on domain. Gates are logged, deterministic, auditable.

L6 — Transparency Card: A structured JSON document with the complete decision chain: candidates, bids, winner reasoning, sub-tasks, gate outcomes, cryptographic hash. SEC 17a-4 compliant.

L7 — Emergent Learning: A feedback loop in which agents that overbid confidence and fail are deprioritized, while agents that bid conservatively and succeed earn reputation boosts. Self-corrects without human intervention.

Layer 1: Intent Engine — The Router That Listens

An intent is not a task. It's a business question expressed in natural language: "What is the compliance status of our Q2 investments against current ESG mandates?" or "Model tax-loss harvesting scenarios across three client portfolios."

The Intent Engine receives the intent, tags it with domain metadata (finance, compliance, risk, operations), and classifies it into one of 12 capability clusters based on semantic analysis. This classification is deterministic and logged—the same intent will always match the same cluster.

Why this matters for compliance: Every request is tagged and logged before any agent sees it. You can reconstruct what triggered the system, when it happened, and which domain it was routed to. This is the first element of an auditable decision trail.
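As a rough illustration of deterministic classification, the sketch below maps an intent to a cluster and a stable audit ID. The cluster names and keyword sets are invented for the example; a production system would use semantic embeddings rather than keyword overlap.

```python
import hashlib

# Hypothetical capability clusters and keywords (illustrative only).
CLUSTER_KEYWORDS = {
    "compliance_review": {"compliance", "mandate", "regulatory", "esg"},
    "portfolio_optimization": {"tax-loss", "harvesting", "portfolio", "rebalance"},
    "risk_analysis": {"risk", "exposure", "var", "stress"},
}

def classify_intent(intent: str) -> dict:
    words = set(intent.lower().replace(",", " ").split())
    # Score each cluster by keyword overlap; ties break alphabetically,
    # so the same intent always maps to the same cluster (determinism).
    best = max(sorted(CLUSTER_KEYWORDS),
               key=lambda c: len(words & CLUSTER_KEYWORDS[c]))
    return {
        "intent": intent,
        "cluster": best,
        # A hash of the raw intent gives a stable ID for the audit log.
        "intent_id": hashlib.sha256(intent.encode()).hexdigest()[:12],
    }

card = classify_intent(
    "Model tax-loss harvesting scenarios across three client portfolios")
print(card["cluster"])  # portfolio_optimization
```

The point of the sketch is the invariant, not the matching logic: classification happens before any agent runs, and its output is a loggable record.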

Layer 2: Semantic KNN Router — Finding the Right Specialist in 2–5ms

After intent classification, the system queries a vector database of 201 specialist agents. Using K-nearest-neighbors similarity matching, it identifies 8–12 agents whose expertise best matches the intent's semantic meaning.

An intent about "regulatory reporting timelines" matches agents specialized in compliance, reporting, and risk—not portfolio optimization or trading. This filtering happens in 2–5ms using a pre-computed embeddings cache. No LLM calls. No latency.
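A minimal sketch of the KNN lookup over a pre-computed embedding cache. The three-dimensional vectors and the agent roster are toy stand-ins; real embeddings would have hundreds of dimensions and the pool would hold all 201 agents.

```python
import math

# Pre-computed (toy) agent embeddings — the cache the router queries.
AGENT_EMBEDDINGS = {
    "Compliance Audit":  [0.9, 0.1, 0.0],
    "Financial Modeler": [0.1, 0.9, 0.2],
    "Risk Optimizer":    [0.2, 0.3, 0.9],
    "Reporting Agent":   [0.8, 0.2, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def knn_route(intent_vec, k=2):
    # Rank all agents by similarity to the intent embedding. No LLM call:
    # this is pure arithmetic, fast even for a pool of hundreds of agents.
    ranked = sorted(AGENT_EMBEDDINGS,
                    key=lambda a: cosine(intent_vec, AGENT_EMBEDDINGS[a]),
                    reverse=True)
    return ranked[:k]

print(knn_route([0.85, 0.15, 0.05]))  # a compliance-flavored intent
```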

Layer 3: Multi-Objective Auction — The Competition

Once the candidate agents are identified, the real orchestration begins: a competitive auction where agents submit proposals simultaneously. Each agent submits a bid with three components:

  1. Confidence: Agent's estimated probability of success (0–1)
  2. Cost: Predicted token consumption
  3. Reasoning: Structured explanation of approach (logged, auditable)

The system scores each bid using:

score = (confidence × domain_relevance_multiplier) / execution_cost

The agent with the highest score wins the right to execute. All bids are logged—even losing bids, with their confidence, cost, and reasoning. This is Sturna's core differentiator: emergent orchestration without static routing logic. No DAGs. No human-written workflows. Agents self-organize through competition.
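The scoring rule above can be applied directly to a bid list. The bid values below mirror the Transparency Card example in Layer 6; the domain-relevance multipliers are assumed for the sketch.

```python
# Each bid carries the three components from the text:
# confidence, predicted cost, and (elsewhere) structured reasoning.
bids = [
    {"agent": "Financial Modeler", "confidence": 0.92, "cost": 1847, "relevance": 1.2},
    {"agent": "Risk Optimizer",    "confidence": 0.78, "cost": 2104, "relevance": 1.0},
]

def score(bid):
    # score = (confidence × domain_relevance_multiplier) / execution_cost
    return bid["confidence"] * bid["relevance"] / bid["cost"]

# All bids are ranked and retained — losing bids are logged too.
ranked = sorted(bids, key=score, reverse=True)
winner = ranked[0]
print(winner["agent"])  # Financial Modeler
```

Note the design choice the formula encodes: a cheap, moderately confident bid can beat an expensive, highly confident one, which pushes agents toward honest cost prediction.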

Layer 4: StarDAG Execution Engine — Parallel Execution

The winning agent executes its plan. But execution isn't linear. The StarDAG engine enables parallel sub-task execution when an agent's work can be split. A portfolio analysis might run ESG screening, tax impact modeling, and regulatory compliance checking simultaneously—not sequentially. Outputs are merged into a unified result.

Every sub-task execution is timestamped and attributed to the specific agent. If one parallel path fails, the system captures which one and why. End-to-end execution averages 21.1 seconds. P99 latency for the routing + bidding layer alone is 340ms—versus 850ms P99 for LangGraph, which uses LLM-based routing on every request.
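The fan-out/merge pattern can be sketched with `asyncio`. The task names follow the Transparency Card example; the delays stand in for real agent work, and the attribution fields are simplified.

```python
import asyncio

async def run_subtask(name: str, delay: float) -> dict:
    await asyncio.sleep(delay)  # stand-in for real agent execution
    return {"task": name, "status": "complete"}

async def execute_intent() -> dict:
    # Independent sub-tasks run concurrently, not sequentially;
    # total wall time is roughly the slowest branch, not the sum.
    results = await asyncio.gather(
        run_subtask("ESG screening", 0.01),
        run_subtask("Tax lot analysis", 0.02),
        run_subtask("Scenario modeling", 0.015),
    )
    # Merge the parallel outputs into one attributed result.
    return {"sub_tasks": results}

merged = asyncio.run(execute_intent())
print([t["task"] for t in merged["sub_tasks"]])
```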

Layer 5: Triple-Gate Verification — The Quality Gates

Before any result reaches the user, it passes three automated quality gates. Each gate inspects the result from a different angle:

Gate 1 — Internal Consistency (15.2%–35% catch rate): Do all outputs reference the same source data? Are numerical calculations consistent? Do conclusions follow from evidence?

Gate 2 — Failure Trap Detection (18%–28% catch rate): If one component fails, does the entire result collapse? Are there untested edge cases? Is the agent aware of what it doesn't know?

Gate 3 — Boundary Coverage (up to 52% catch rate): Are all error conditions handled? Are domain boundaries respected? Are regulatory requirements met?

The system is brutal. If a result fails any gate, it's returned with explicit reasoning: "Gate 2 detected: tax impact model fails when client has direct stock holdings. Recommend manual review before serving to client."

Why this matters for compliance: Triple-Gate is compliance infrastructure. For governance frameworks, the gates catch up to 52% of errors before a result ships—more rigorous than manual review.
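The pipeline behavior described above—sequential gates with explicit rejection reasoning—can be sketched as follows. The gate predicates here are placeholders; the real checks are domain-specific.

```python
# Placeholder gate logic: each gate inspects the result from one angle.
def gate_consistency(result):        # Gate 1: internal consistency
    return result.get("sources_agree", False)

def gate_failure_traps(result):      # Gate 2: failure trap detection
    return not result.get("untested_edge_cases", True)

def gate_boundary_coverage(result):  # Gate 3: boundary coverage
    return result.get("regulatory_checks_passed", False)

GATES = [
    ("gate_1_consistency", gate_consistency),
    ("gate_2_failure_traps", gate_failure_traps),
    ("gate_3_boundary_coverage", gate_boundary_coverage),
]

def verify(result: dict) -> dict:
    outcomes = {}
    for name, gate in GATES:
        passed = gate(result)
        outcomes[name] = "passed" if passed else "failed"
        if not passed:
            # Reject with the failed gate named explicitly;
            # nothing ships past a failed gate.
            return {"shipped": False, "failed_gate": name,
                    "quality_gates": outcomes}
    return {"shipped": True, "quality_gates": outcomes}

ok = verify({"sources_agree": True, "untested_edge_cases": False,
             "regulatory_checks_passed": True})
print(ok["shipped"])  # True
```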

Layer 6: Transparency Card — The Full Explanation

Every result includes a Transparency Card: a structured JSON document that shows the complete decision chain:

{
  "intent": "Model tax-loss harvesting for Q2",
  "intent_classification": "portfolio_optimization",
  "candidate_agents": [
    {
      "agent": "Financial Modeler",
      "confidence": 0.92,
      "bid_cost": 1847,
      "reasoning": "Specialized in tax-aware portfolio optimization",
      "won": true
    },
    {
      "agent": "Risk Optimizer",
      "confidence": 0.78,
      "bid_cost": 2104,
      "reasoning": "Risk-first approach suboptimal for tax planning",
      "won": false
    }
  ],
  "execution": {
    "winner": "Financial Modeler",
    "actual_cost": 1823,
    "execution_time_ms": 4127,
    "sub_tasks": [
      {"task": "ESG screening", "cost": 456, "status": "complete"},
      {"task": "Tax lot analysis", "cost": 892, "status": "complete"},
      {"task": "Scenario modeling", "cost": 475, "status": "complete"}
    ]
  },
  "quality_gates": {
    "gate_1_consistency": "passed",
    "gate_2_failure_traps": "passed",
    "gate_3_boundary_coverage": "passed"
  },
  "audit_trail_hash": "0x8a3f7c2b9e...",
  "timestamp": "2026-05-03T14:32:18Z",
  "requestor": "compliance_officer_id_4821"
}

This card is immutably logged and cryptographically signed. Every user action creates an auditable record. SEC 17a-4 requires immutable records — this card satisfies that requirement. SOC 2 requires audit trails — this card is the audit trail.

Layer 7: Emergent Learning — Self-Improvement

Every execution creates a record in the learning system. The system tracks confidence calibration (did confident agents succeed?), cost accuracy, and win/loss history per agent per domain. Over time, a feedback loop forms. An agent that consistently bids high confidence but fails will be deprioritized. An agent that bids conservatively but succeeds gets a reputation boost. Learning is transparent and logged—no black-box feedback.
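The calibration loop can be sketched as a reputation update after each execution. The update constants and starting scores are invented for the illustration; the real system tracks win/loss history per agent per domain.

```python
# Starting reputations (illustrative).
reputation = {"Financial Modeler": 1.0, "Risk Optimizer": 1.0}

def update_reputation(agent: str, bid_confidence: float, succeeded: bool):
    if succeeded:
        # Reward scales with how conservative the winning bid was.
        reputation[agent] += 0.05 * (1.0 - bid_confidence)
    else:
        # Penalty scales with how overconfident the failed bid was.
        reputation[agent] -= 0.10 * bid_confidence
    reputation[agent] = max(reputation[agent], 0.0)

# An overconfident failure is deprioritized; a conservative success gains.
update_reputation("Risk Optimizer", bid_confidence=0.95, succeeded=False)
update_reputation("Financial Modeler", bid_confidence=0.70, succeeded=True)
print(reputation)
```

Because every update is a logged arithmetic step over logged bids, the learning loop stays inspectable rather than becoming black-box feedback.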

Section 2: The Four Differentiators

1. Auditable Emergence

Traditional frameworks require humans to write routing logic, design workflows, and specify which agent handles which task. When something breaks, you debug human-written logic. When you add a new agent, you rewrite routing. The framework is static.

Sturna agents compete based on confidence + cost. Adding a new agent is as simple as registering it—it competes immediately. If it's good, it wins. If it's bad, it loses. This is emergence: decentralized decision-making within centralized governance. You define the rules; the system enforces them automatically.

2. Triple-Gate Verification

Most AI frameworks have one quality mechanism: hope that the model is good enough. Sturna has three. See the Triple-Gate breakdown in Section 1, and the appendix for catch rates by gate and domain.

3. Cross-Domain Intelligence — 201 Specialist Agents

Sturna's agent pool spans five tiers: Governance (Compliance Audit, Cost Attribution, Audit Trail, SLA Enforcer, MCP Governance), Risk/Ops (Chaos Engineer, Conduit DevOps, Phantom Security), Enablement (Onboarding Wizard, Intent Debugger, Agent Benchmarker), Specialized (InsForge Engineer, Financial Modeler, Cross-Agent Mediator, Policy Enforcer), and Maintenance (Health Monitor, Versioning Agent, Marketplace Curator).

4. Institutional Observability

Every execution produces a Transparency Card. Sturna provides dashboards to aggregate, search, and audit these cards: all decisions by date/agent/domain/cost, audit trail hash chain (tamper-evident), cost attribution, agent confidence calibration over time, quality gate pass/fail rates, and role-based approvals with timestamps.

Section 3: Compliance Architecture

SEC 17a-4 Alignment: Immutable Audit Trail

SEC Rule 17a-4(f) requires "electronic records must be retained in a non-rewritable, non-erasable format" and "must be alterable only by addition of new data." Sturna satisfies this with:

  1. Immutable Event Log: Events are appended only; no updates or deletes.
  2. Tamper-Evident Hashing: Each event includes a cryptographic hash of the previous event. If anyone modifies a record, the hash breaks.
  3. Timestamping: Every record is timestamped and verifiable against trusted time authority.
  4. Retention Compliance: All records retained for required periods (7 years for finance).
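The tamper-evident hashing in point 2 can be sketched in a few lines. This is a minimal illustration of a hash chain, not Sturna's actual implementation; field names are invented.

```python
import hashlib
import json

def append_event(log: list, payload: dict) -> None:
    # Each event embeds the hash of the previous event; the genesis
    # event chains from a fixed all-zero hash.
    prev_hash = log[-1]["hash"] if log else "0" * 64
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    log.append({"payload": payload, "prev": prev_hash,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    # Recompute every hash from the genesis forward; any retroactive
    # modification breaks the chain at or after the edited record.
    prev_hash = "0" * 64
    for event in log:
        body = json.dumps({"payload": event["payload"], "prev": prev_hash},
                          sort_keys=True)
        if (event["prev"] != prev_hash or
                event["hash"] != hashlib.sha256(body.encode()).hexdigest()):
            return False
        prev_hash = event["hash"]
    return True

log = []
append_event(log, {"intent": "Q2 tax-loss harvesting"})
append_event(log, {"gate_1": "passed"})
print(verify_chain(log))           # True
log[0]["payload"]["intent"] = "x"  # simulate tampering with a past record
print(verify_chain(log))           # False
```

The append-only discipline plus the chained hashes is what turns an ordinary event log into evidence: edits are not prevented so much as made detectable.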

SOC 2 Type II Alignment

| Control | Implementation |
|---|---|
| Role-Based Access | Compliance officers see compliance records; traders see trade records; auditors see everything |
| Encryption at Rest | AES-256-GCM for all stored Transparency Cards |
| Encryption in Transit | TLS 1.3 for all API traffic |
| Audit Logging | Every access to a Transparency Card is logged (who, when, what) |
| Incident Response | Automated rollback (30-min SLA): revert any decision within 30 minutes |

Compliance Framework Alignment

| Standard | Sturna Feature | Status |
|---|---|---|
| SEC 17a-4 | Immutable audit trail + hash chain | ✓ Compliant |
| SOC 2 Type II | RBAC, encryption, audit logging | ✓ Auditable |
| EU AI Act Art. 14 | Human-in-loop for Severity 1 decisions | ✓ Built-in |
| GDPR Art. 22 | Appeal mechanism + 7-year retention | ✓ Compliant |
| NIST AI RMF | Hallucination detection, bias disparity monitoring | ✓ Implemented |

Tenant Isolation & Encryption

Each tenant's intents, executions, and Transparency Cards are isolated at the database level. Each tenant has its own encryption key. Role-based visibility ensures a trader at Firm A cannot see Firm B's audit trail, even if both use Sturna.

Section 4: Benchmark Data

Latency Performance

| Metric | Sturna | LangGraph | Speedup |
|---|---|---|---|
| P50 latency | 340ms | 550ms | 1.6× |
| P99 latency | 340ms | 850ms | 2.5× |
| Full execution | 21.1s | 32.5s | 1.5× |

Sturna's latency is constant (no percentile tail blowups) because intent routing uses pre-computed embeddings (2–5ms), auction scoring is deterministic (3–8ms), and there are no LLM-based routing calls on every request.

Token Efficiency

| Scenario | Baseline | Sturna | Savings |
|---|---|---|---|
| Routine tasks | 2,847 tokens | 971 tokens | 66.1% |
| Complex analysis | 8,234 tokens | 6,125 tokens | 25.6% |
| Overall average | — | — | 24.7% |

Cost Per Intent

| Framework | Cost per Intent |
|---|---|
| Sturna | $0.0108 |
| LangGraph | $0.0180 |
| Competitor A | $0.0195 |

Sturna is 40% cheaper per intent than LangGraph.

Reliability & Recovery

| Metric | Rate |
|---|---|
| First-pass success | 94.2% |
| Recovery success (with self-healing) | 86% |
| Combined success | 99.4% |

When an agent fails, Sturna's self-healing system detects the failure (triple-gate catches it), logs it (immutable record), re-routes to the second-best agent (next auction), executes an alternate approach, and logs the recovery. Users see the successful result with full provenance—never the failure.
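The failover sequence above can be sketched as a loop over the auction's ranked bidders. The agent behaviors are stubbed (the top bidder is hard-coded to fail for the illustration); in the real system each attempt produces an immutable log record.

```python
def run_agent(agent: str, intent: str) -> dict:
    # Stub: pretend the top bidder fails its quality gates,
    # so the system must re-route to the runner-up.
    failed = agent == "Financial Modeler"
    return {"agent": agent, "passed_gates": not failed}

def execute_with_healing(ranked_bidders: list, intent: str) -> dict:
    provenance = []
    for agent in ranked_bidders:
        result = run_agent(agent, intent)
        provenance.append(result)  # every attempt is recorded, including failures
        if result["passed_gates"]:
            # The user sees only the successful result — but the
            # provenance trail preserves the full recovery story.
            return {"result": result, "provenance": provenance}
    return {"result": None, "provenance": provenance}

outcome = execute_with_healing(
    ["Financial Modeler", "Risk Optimizer"], "Model Q2 scenarios")
print(outcome["result"]["agent"])  # Risk Optimizer
```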

Section 5: Competitive Position

vs. LangGraph (Enterprise Leader)

| Dimension | LangGraph | Sturna |
|---|---|---|
| P99 latency | 850ms | 340ms (2.5× faster) |
| Cost per intent | $0.0180 | $0.0108 (40% cheaper) |
| Audit trail | None | Full SEC 17a-4 |
| Self-healing | Manual | Automatic |
| Configuration | DAG authoring required | Zero config |
| Compliance-ready | No | Yes |

LangGraph offers flexibility; Sturna enforces best practices. If your team likes writing orchestration code, LangGraph is better. If you want to delegate routing to the system, Sturna wins.

vs. CrewAI (Open Source Leader)

CrewAI has 44K GitHub stars, is free, and is simple. But it has no recovery mechanism, no compliance trail, and maxes out at ~500 agents. Sturna handles orchestration automatically, ships production-grade audit infrastructure, and scales to 1,000+ agents. Sturna costs $49/month; CrewAI is free. But Sturna ships reliable, auditable systems while CrewAI requires you to write orchestration code.

vs. AutoGen

AutoGen was deprecated in October 2025. Sturna is the natural upgrade path.

vs. OpenAI Swarm (Minimalist Approach)

Swarm works with OpenAI models only and requires human-specified handoffs. It's practical for fewer than 5 agents with fixed handoffs. For multi-agent orchestration at scale with compliance requirements, Swarm is insufficient.

Market Opportunity

72% of Global 2000 organizations are deploying multi-agent systems (2025). The orchestration platform TAM is $8.2B over 3 years. Sturna's focus on compliance + observability positions it for the governance lane—the highest-margin segment.

Section 6: Conclusion & Call to Action

Regulated institutions—banks, wealth managers, insurance companies, healthcare systems—cannot deploy black-box AI at scale. Compliance, audit, and governance require transparency.

Sturna solves this through architecture, not bolted-on monitoring. Transparency is built in. Auditability is built in. Compliance is built in:

May 2026 Launch — Enterprise Pilot Program. We're looking for 10–15 institutional partners (RIAs, family offices, boutique asset managers) for early adoption. Cost: $49/mo + $0.0108 per intent average. For a typical RIA processing 5,000 intents/month: ~$103/month total.

To discuss Sturna for your institution, contact hello@sturna.ai. Include: your institution type, approximate intents/month, key compliance requirements, and current AI orchestration pain points. We'll schedule a technical overview and compliance architecture walkthrough.

Appendix: Technical Reference

Triple-Gate Catch Rates by Domain

| Domain | Gate 1 | Gate 2 | Gate 3 | Combined |
|---|---|---|---|---|
| Email copy | 15.2% | 8.3% | 3.1% | 25.2% |
| Governance framework | 22.4% | 18.7% | 52.0% | 64.3% |
| GTM strategy | 11.2% | 9.1% | 12.7% | 28.4% |
| Tax planning | 18.9% | 14.2% | 7.3% | 36.0% |
| Risk modeling | 20.1% | 22.4% | 14.3% | 48.2% |

Agent Tiers & Specialization

Governance Tier (5 agents): Compliance Audit, Cost Attribution, Audit Trail, SLA Enforcer, MCP Governance

Risk/Operations Tier (3 agents): Chaos Engineer, Conduit DevOps, Phantom Security

Enablement Tier (3 agents): Onboarding Wizard, Intent Debugger, Agent Benchmarker

Specialized Tier (8+ agents): InsForge Engineer, Financial Modeler, Cross-Agent Mediator, Policy Enforcer, Schema Migration, Cost Optimizer, Siphon Crawler, Artery Pipeline

Maintenance Tier (3 agents): Health Monitor, Versioning Agent, Marketplace Curator

Plus: 180+ support agents spanning social media, sales, content, research, and specialized finance domains.

Document Version: 1.0  ·  Date: May 3, 2026  ·  Classification: Public  ·  sturna.ai/how-it-works
