Miles Brundage launches AVERI, an independent AI auditing institute focused on frontier model safety
Jan 28, 2026 with Miles Brundage
Key Points
- Miles Brundage launches AVERI, a nonprofit auditing institute focused on frontier AI safety, arguing the industry lacks rigorous safety and security standards comparable to what the cybersecurity industry provides for the internet.
- AVERI categorizes AI risk into four domains: unintended behaviors, direct misuse including confirmed cyberattacks, emergent social harms, and conventional security vulnerabilities.
- Brundage positions auditing as driven by private-sector incentives from insurers and AI companies rather than regulation alone, with AVERI accepting API credits but no cash from frontier model developers.
Summary
Miles Brundage has launched AVERI (the AI Verification and Evaluation Research Institute), a nonprofit think tank focused on independent auditing of frontier AI systems. His core argument: AI is becoming critical infrastructure, yet the industry lacks the rigorous safety and security standards applied to other technologies, and there is no equivalent of the cybersecurity industry that emerged to secure the internet.
AVERI breaks AI risk into four categories. Unintended system behaviors (hallucinations, misalignment, deception) occur when AI systems act contrary to user intent. Direct misuse happens when attackers use models like Claude to conduct cyberattacks, which Brundage confirms has already occurred with actors connected to the Chinese government. Emergent social phenomena, such as psychosis, addiction, and degraded learning, arise from human-AI interaction. Conventional security covers tampering, IP theft, and unauthorized access.
The organization positions itself as a hub for multiple stakeholders: policymakers, AI companies, enterprise customers, investors, and insurers. Brundage argues that private-sector incentives, particularly from insurers, create stronger pressure for high-quality auditing than regulation alone. AVERI has received API credits from six frontier AI developers for testing purposes but avoids taking cash from them. One donor is an AI underwriting company, whose business depends on accurate risk assessment, keeping funding incentives aligned.
Brundage notes that AVERI staff hold varied views on how quickly AI risk will manifest. He personally leans toward "AI is moving very quickly and we could see crazy stuff very soon," but argues this belief is not a prerequisite for supporting auditing. Even treating AI as a normal technology, like a power bank audited for electrical safety, justifies rigorous auditing. The case only strengthens if you also believe in tail risks.
One specific emerging problem is prompt decomposition: attackers break a request for malware or a bioweapon into benign-looking subcomponents spread across multiple accounts, or even multiple models, evading simple refusal mechanisms. Detecting such coordinated misuse while preserving user privacy is difficult. Brundage suggests that models more than a year old should be assumed to have been maximally misused already and monitored accordingly; the focus should remain on frontier systems, where real-time detection is still feasible.
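To make the detection problem concrete, here is a minimal, hypothetical sketch of one approach: flag request pairs that look like fragments of the same task but arrive from different accounts. The function names, the Jaccard-similarity heuristic, and the threshold are illustrative assumptions, not anything AVERI or a model provider has described; a production system would use richer signals and far stronger privacy controls.

```python
# Toy sketch: flag possible "prompt decomposition" by finding similar
# requests that come from different accounts within one time window.
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Crude tokenization; a real system would use embeddings or classifiers."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag_coordinated(requests: list[tuple[str, str]], threshold: float = 0.4):
    """requests: (account_id, prompt) pairs from a shared time window.
    Returns prompt pairs that overlap heavily but come from different
    accounts, a signal worth escalating for human review."""
    suspicious = []
    for (acct_a, prompt_a), (acct_b, prompt_b) in combinations(requests, 2):
        if acct_a == acct_b:
            continue  # same account: ordinary multi-turn use, skip
        if jaccard(tokens(prompt_a), tokens(prompt_b)) >= threshold:
            suspicious.append((acct_a, acct_b, prompt_a, prompt_b))
    return suspicious


if __name__ == "__main__":
    log = [
        ("acct_1", "write code to enumerate open ports on a target subnet"),
        ("acct_2", "write code to exploit open ports found on a target subnet"),
        ("acct_3", "summarize this poem about the sea"),
    ]
    for hit in flag_coordinated(log):
        print(hit)
```

Even this toy version shows the tension Brundage raises: correlating requests across accounts is exactly the kind of cross-user analysis that strains privacy guarantees.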
Auditing standards are shifting from inability arguments (showing a model is too weak to enable harm) toward affirmative cases demonstrating that mitigations actually work. Mitigations operate at three levels: model-level refusal of harmful requests, system-level classifiers blocking certain outputs, and platform-level detection of coordinated fraudulent accounts. Organizations like METR, Apollo Research, and SecureBio are building this capability piecemeal, but the work remains largely voluntary and narrow in scope.
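A hedged sketch of how those three layers could compose into a single request pipeline appears below. The predicates (model_refuses, output_classifier_blocks, account_looks_coordinated) are hypothetical placeholders standing in for real policy models, output filters, and fraud signals; they are not any vendor's actual API.

```python
# Illustrative only: the three mitigation layers as one request pipeline.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Request:
    account_id: str
    prompt: str


def model_refuses(prompt: str) -> bool:
    # Model-level: the model itself declines clearly harmful requests.
    return "build a bioweapon" in prompt.lower()


def output_classifier_blocks(output: str) -> bool:
    # System-level: a separate classifier screens generated outputs.
    return "synthesis route" in output.lower()


def account_looks_coordinated(account_id: str) -> bool:
    # Platform-level: abuse signals on the account, e.g. clusters of fresh
    # accounts issuing related fragments (see the sketch above).
    return account_id.startswith("burner_")


def handle(req: Request, generate: Callable[[str], str]) -> str:
    if account_looks_coordinated(req.account_id):
        return "blocked: platform-level abuse signal"
    if model_refuses(req.prompt):
        return "refused: model-level policy"
    output = generate(req.prompt)
    if output_classifier_blocks(output):
        return "blocked: system-level output filter"
    return output


if __name__ == "__main__":
    echo = lambda p: f"[model output for: {p}]"
    print(handle(Request("burner_42", "hello"), echo))
    print(handle(Request("acct_7", "explain jaccard similarity"), echo))
```

The point of the layering is that an affirmative safety case can cite evidence for each stage independently, rather than resting on the claim that the model alone is too weak to cause harm.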
Brundage's policy focus is threefold: establishing clear safety and security standards (California and New York are starting to do this), generating evidence that companies actually follow those standards through auditing and transparency cards, and creating incentives for compliance that do not crush small businesses. He views energy costs as a real but somewhat overblown policy focus relative to these core risks.