# OpenAI and Paradigm Introduce 'EVMbench' for AI Agent Benchmarking

*Author: Jack Inabinet*

*Published: Feb 18, 2026*

*Source: https://www.bankless.com/read/news/openai-and-paradigm-introduce-evmbench-for-ai-agent-benchmarking*

---

OpenAI and Paradigm today [introduced](https://openai.com/index/introducing-evmbench/) EVMbench, a benchmark that measures how well AI agents detect, patch, and exploit high-severity Ethereum Virtual Machine (EVM) smart contract vulnerabilities.

### What's the Scoop?

- **New Benchmark:** EVMbench draws on 120 curated vulnerabilities from 40 audits (most sourced from open code audit competitions) and includes several vulnerability scenarios inspired by the security auditing process for the Paradigm-backed Tempo blockchain.
- **Numerical Score:** EVMbench assigns agents a percentage-based performance score intended to capture how well they can audit smart contracts, patch vulnerabilities while preserving functionality, and exploit vulnerable contracts.
- **Limitations:** While the vulnerabilities tested by EVMbench are realistic and high-severity, the benchmark's developers caution that it "does not represent the full difficulty of real-world smart contract security."

### What's the Take?

EVMbench gives the crypto industry a standardized way to measure how well AI can reason about real-world smart contract risk. While no test is perfect, this one establishes a measurable baseline for objectively evaluating emerging crypto-enabled AI agents.

> Introducing EVMbench—a new benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities. [https://t.co/op5zufgAGH](https://t.co/op5zufgAGH)
>
> OpenAI (@OpenAI), [February 18, 2026](https://twitter.com/OpenAI/status/2024193883748651102?ref_src=twsrc%5Etfw)