SOC 2 Type II for AI Companies: The Control Failures Auditors Actually Flag
SOC 2 Type II audits test whether security controls operate effectively over time — not just whether they're documented. For AI companies, "over time" now includes a growing list of failure modes that SOC 2 auditors didn't encounter two years ago. Unlogged inference calls. Shared vector indexes without tenant isolation. Hallucination rates that aren't documented anywhere. These gaps are failing audits in 2026. Here's what they look like and how to close them.
What SOC 2 Type II Actually Tests
SOC 2 is built on the AICPA's Trust Services Criteria (TSC), which are organized into five categories: security, availability, processing integrity, confidentiality, and privacy. An AI company processing customer data has exposure across all five — but the failures that are actually appearing in audit reports cluster around three: security (CC6, CC7), processing integrity (PI1), and confidentiality (C1).
Type II specifically tests whether controls were operating effectively over the audit period, typically six or twelve months. A control that's documented but never actually runs — like a logging policy with no log infrastructure — is a Type II failure, not a Type I gap. Auditors sample actual evidence of control operation. If there's no evidence, there's no control.
Enterprise customers requesting SOC 2 Type II reports from AI vendors are increasingly including AI-specific questionnaires. "We have a SOC 2" without an AI-specific security addendum is no longer passing procurement reviews at mid-market and enterprise accounts.
The Six Control Failures Appearing in AI Company Audits
Shared model access without tenant isolation
AI systems where a fine-tuned model was trained on multiple customers' data, or where a retrieval index contains data from multiple tenants without logical separation, fail the CC6.1 logical access controls requirement. Auditors are requesting evidence that Customer A's data cannot appear in Customer B's inference results. "We use row-level security" is not sufficient without evidence of how that security applies to the AI inference layer specifically.
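To make the isolation requirement concrete, here is a minimal sketch of a tenant-scoped retrieval call. The index client, filter syntax, and field names are illustrative assumptions, not any specific vector store's API:

```python
# Sketch: enforcing tenant isolation at the retrieval layer. The index
# object, filter syntax, and field names are hypothetical -- adapt to
# whatever vector store you actually run.
from dataclasses import dataclass

@dataclass
class RetrievalRequest:
    tenant_id: str  # resolved from the authenticated session, never from user input
    query_embedding: list[float]
    top_k: int = 5

def retrieve(index, request: RetrievalRequest):
    """Query a shared index with a mandatory, server-side tenant filter."""
    if not request.tenant_id:
        raise ValueError("tenant_id is required; refusing unscoped retrieval")
    return index.query(
        vector=request.query_embedding,
        top_k=request.top_k,
        filter={"tenant_id": {"$eq": request.tenant_id}},  # the isolation boundary
    )
```

The detail auditors care about is that the filter is mandatory and enforced by the store itself, so no code path can query the shared index unscoped.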
Inference calls not logged
SOC 2 CC7.2 requires monitoring for anomalous activity. If AI inference calls — which process customer data — produce no immutable logs reviewable by the system owner, there is no monitoring. Vendor-side logs that the AI company cannot access don't satisfy this control. Auditors are requesting evidence of actual log entries covering the audit period, including what data was processed and what outputs were produced.
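What a reviewable, tamper-evident log entry looks like varies by stack. One hedged sketch, where the field names and hash-chaining scheme are assumptions rather than a prescribed format:

```python
# Sketch: a hash-chained inference log entry. Each entry commits to its
# predecessor, so any after-the-fact edit breaks the chain and is detectable.
import hashlib
import json
from datetime import datetime, timezone

def make_log_entry(prev_hash: str, tenant_id: str, prompt: str, output: str) -> dict:
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tenant_id": tenant_id,
        # Digests let an auditor confirm what was processed without the
        # log storing raw customer content.
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "prev_hash": prev_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    return entry  # append-only: write once, never update in place
```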
No documented accuracy or hallucination metrics
Processing integrity requires that system processing is complete, valid, accurate, timely, and authorized. For an AI system, accuracy is not optional documentation — it's a core control. Auditors are asking: what is the system's documented accuracy rate? What is the hallucination rate on factual queries? How are outputs validated before delivery to customers? AI companies without documented benchmarks for accuracy and without output verification controls are failing PI1.3.
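As a rough illustration of what PI1.3 evidence can look like, here is a sketch of a benchmark run that produces a dated, persisted result artifact. The test-set format, grading function, and file layout are all assumptions:

```python
# Sketch: a repeatable accuracy benchmark that writes an auditable artifact.
# The grading function and test-set schema are placeholders for your own.
import json
import os
from datetime import datetime, timezone

def run_benchmark(model_call, test_set: list[dict], grade) -> dict:
    """test_set items look like {"query": ..., "expected": ...};
    grade(output, expected) returns True or False."""
    results = [grade(model_call(case["query"]), case["expected"]) for case in test_set]
    report = {
        "run_at": datetime.now(timezone.utc).isoformat(),
        "n_cases": len(results),
        "accuracy": sum(results) / len(results),
    }
    # Persist every run: a single number with no history is weak evidence.
    os.makedirs("benchmarks", exist_ok=True)
    fname = report["run_at"].replace(":", "-")
    with open(f"benchmarks/{fname}.json", "w") as f:
        json.dump(report, f, indent=2)
    return report
```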
Customer data retained in AI vendor subprocessors
If an AI company uses an underlying model provider (OpenAI, Anthropic, Google) to process customer data, that provider is a subprocessor. SOC 2's confidentiality criteria (C1.1, identifying and protecting confidential information; C1.2, disposing of it) require that confidential information — including customer data — is protected from unauthorized disclosure through processing and disposal. Auditors are checking whether AI companies have verified that their model providers do not retain customer prompts for training or logging beyond the scope of the processing agreement.
Customer data in prompts without encryption documentation
CC6.7 requires that data in transit is protected. If customer data is embedded in prompts sent to external model APIs without documented encryption standards — or if the AI company cannot confirm the model provider uses TLS 1.2+ for all API calls — auditors are flagging this as a CC6.7 gap. The gap is usually not in the encryption itself (TLS is standard) but in the documentation that the AI company has verified it end-to-end.
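One cheap piece of end-to-end verification evidence is a recorded check of the TLS version actually negotiated with each provider endpoint. A sketch using Python's standard ssl module, with a hypothetical hostname:

```python
# Sketch: record the negotiated TLS version for a model provider endpoint.
# Run against the endpoints that actually carry customer data and keep
# the dated output as audit evidence.
import socket
import ssl

def check_tls(host: str, port: int = 443) -> str:
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse anything older
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()  # e.g. "TLSv1.3"

print(check_tls("api.example-provider.com"))  # hypothetical endpoint
```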
No incident response plan covering AI output failures
Availability controls require that the system is available for operation and use as committed. For AI systems, this includes availability of accurate outputs — a system that silently starts hallucinating at scale has a different kind of availability problem that traditional incident response plans don't cover. Auditors are asking whether AI companies have monitoring for output quality degradation and a defined incident response process for when it occurs.
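Detection can start simple: a rolling failure-rate alarm over sampled, verified outputs. A sketch, assuming the sampling and grading of outputs happen elsewhere in your pipeline:

```python
# Sketch: a rolling output-quality monitor that raises an incident when the
# sampled failure rate crosses a threshold. Window size and threshold are
# illustrative -- tune them to your traffic and accuracy commitments.
from collections import deque

class OutputQualityMonitor:
    def __init__(self, window: int = 200, threshold: float = 0.05):
        self.results: deque[bool] = deque(maxlen=window)  # True = failed verification
        self.threshold = threshold

    def record(self, failed: bool) -> None:
        self.results.append(failed)
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate > self.threshold:
                self.alert(rate)

    def alert(self, rate: float) -> None:
        # Wire this to your real incident process (pager, ticket, status page).
        print(f"INCIDENT: sampled failure rate {rate:.1%} exceeds {self.threshold:.0%}")
```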
Is your SOC 2 evidence complete for AI controls?
Sturna's SOC 2 readiness assessment covers all six control areas above: logical access, monitoring, processing integrity, confidentiality, data transmission, and availability. Check your gaps before your auditors find them.
Run SOC 2 AI Readiness Assessment →

Not legal advice. For audit preparation, work with a licensed CPA firm.
How to Close These Gaps Before Your Next Audit
Most SOC 2 Type II gaps for AI companies aren't architectural — they're evidence gaps. The infrastructure often exists. The documentation and the monitoring artifacts don't. Here's where to focus:
- Implement immutable inference logging. Every AI call that processes customer data needs a log entry with: timestamp, customer/tenant identifier, what data was processed (not necessarily the full content, but enough to reconstruct for audit), and what output was produced. The log must be WORM-protected — not just retained, but protected against modification (see the storage sketch after this list).
- Document accuracy metrics. Run benchmark evaluations on your AI system against a representative test set. Document the methodology, the results, and how often you re-run them. This is your PI1.3 evidence. Without it, auditors have no basis to assess whether your system meets its accuracy commitments.
- Get tenant isolation evidence in writing. Ask your AI infrastructure provider (or your own engineering team) to produce a written description of how customer data is isolated at the inference layer. "Row-level security" is not sufficient — you need evidence that applies to the vector retrieval and model fine-tuning layers specifically.
- Update your incident response plan. Add a section covering AI-specific incidents: output quality degradation, hallucination events, prompt injection attacks, and model provider breaches. Define detection, response, and customer notification procedures for each.
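For the WORM requirement in the first item above, one common approach is S3 Object Lock in compliance mode. A sketch, assuming a bucket that was created with Object Lock enabled; the bucket and key names are examples:

```python
# Sketch: write a log batch to WORM storage via S3 Object Lock in
# compliance mode. Once written, nobody -- including the root account --
# can delete or shorten the object before the retention date passes.
import json
from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client("s3")

def write_worm_batch(entries: list[dict], bucket: str = "inference-audit-logs") -> str:
    key = f"logs/{datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ')}.jsonl"
    body = "\n".join(json.dumps(e) for e in entries).encode()
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ObjectLockMode="COMPLIANCE",
        # Retention should outlast your audit period; 400 days covers a
        # 12-month window with slack.
        ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=400),
    )
    return key
```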
Get SOC 2 Type II-ready AI infrastructure
Sturna provides verifiable evidence artifacts for every control above: WORM audit logs, documented accuracy benchmarks, tenant isolation documentation, and an AI-specific incident response program. All artifacts are formatted for auditor review.
Reserve Compliance Pilot →

Payments secured by Stripe · No annual contract required