AI Uncovers Hidden Hack Risk

At a Glance

  • RunSybil’s AI tool, Sybil, found a federated GraphQL flaw exposing confidential customer data last November.
  • Anthropic’s Claude models went from roughly 20% vulnerability detection (Claude Sonnet 4, July 2025) to 30% (Claude Sonnet 4.5, October 2025) on the 1,507-bug CyberGym benchmark.
  • UC Berkeley’s Dawn Song calls the surge in frontier-model cyber skills “an inflection point.”
  • Why it matters: Defenders may soon be outpaced by cheap, autonomous AI attackers unless secure-by-design code and early model sharing become standard.

AI is getting better at spotting zero-day bugs. Last November, Vlad Ionescu and Ariel Herbert-Voss, cofounders of the cybersecurity startup RunSybil, watched their AI engine, Sybil, flag a subtle misconfiguration in a customer’s federated GraphQL deployment. The setup was leaking confidential data across APIs, yet no public record of the weakness existed.

The Discovery

Sybil combines multiple AI models with proprietary techniques to hunt for exploitable weaknesses such as unpatched servers or misconfigured databases. In this case, it traced the leak to how the customer’s GraphQL schema stitched together data paths. The finding required deep reasoning across systems, something the founders say marks a “step change” in model capability.
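The article does not describe the actual flaw, so the following is only a toy illustration of the general idea: in a composed schema, a sensitive field can be guarded on one path and still reachable through another. It is a minimal Python sketch with no GraphQL library, and every type name, field name, and the `requires_auth` flag are invented for the example.

```python
from collections import deque

# Hypothetical composed schema: type -> {field: (target_type, requires_auth)}.
# None of these names come from the article or from any real deployment.
SCHEMA = {
    "Query": {
        "me": ("Customer", True),         # guarded entry point
        "order": ("Order", False),        # public lookup
    },
    "Order": {
        "total": ("Money", False),
        "customer": ("Customer", False),  # cross-service link with no auth check
    },
    "Customer": {
        "email": ("String", False),
        "ssn": ("String", False),
    },
}
SENSITIVE = {("Customer", "email"), ("Customer", "ssn")}


def unguarded_paths(schema, sensitive, root="Query"):
    """Breadth-first search for field paths that reach a sensitive field
    without crossing any auth-guarded edge. Each type is expanded once,
    so only the first path found to each type is reported."""
    found = []
    queue = deque([(root, [])])
    visited = {root}
    while queue:
        type_name, path = queue.popleft()
        for field, (target, requires_auth) in schema.get(type_name, {}).items():
            if requires_auth:
                continue  # protected edge: an unauthenticated caller stops here
            new_path = path + [field]
            if (type_name, field) in sensitive:
                found.append(".".join(new_path))
            if target in schema and target not in visited:
                visited.add(target)
                queue.append((target, new_path))
    return found


leaks = unguarded_paths(SCHEMA, SENSITIVE)
print(leaks)
```

In this toy schema the guarded `me` field looks safe in isolation; the exposure only appears when the `Order` and `Customer` types are considered together, which is the kind of reasoning across systems the founders describe.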

“We scoured the internet, and it didn’t exist,” Herbert-Voss told News Of Fort Worth. “Discovering it was a reasoning step in terms of models’ capabilities.”

RunSybil has since detected the identical flaw in other GraphQL rollouts, each time before any human or public database had logged the issue.

Benchmarking the Boom

Dawn Song, a UC Berkeley computer scientist who works at the intersection of AI and security, tracks this surge through CyberGym, a benchmark she cocreated. The dataset holds 1,507 verified vulnerabilities pulled from 188 open-source projects.

  • July 2025: Anthropic’s Claude Sonnet 4 located roughly 20 percent of bugs.
  • October 2025: Claude Sonnet 4.5 pushed that to 30 percent.
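To put those percentages in absolute terms, the short Python sketch below converts them into approximate bug counts. It assumes the rounded figures quoted here (20 and 30 percent of 1,507), so the counts are estimates, not numbers reported by the benchmark authors.

```python
TOTAL_BUGS = 1507  # CyberGym size as reported in the article


def approx_found(percent: int, total: int = TOTAL_BUGS) -> int:
    """Approximate number of bugs located at a given detection percentage."""
    return round(percent * total / 100)


july_pct, october_pct = 20, 30  # rounded rates quoted in the article

july_found = approx_found(july_pct)        # about 301 bugs
october_found = approx_found(october_pct)  # about 452 bugs

point_gain = october_pct - july_pct     # gain in percentage points
relative_gain = point_gain / july_pct   # gain relative to the July rate

print(july_found, october_found, point_gain, relative_gain)
```

On these rounded inputs, a 10-percentage-point gain is also a 50 percent relative improvement, or roughly 150 more bugs found.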

“AI agents are able to find zero-days, and at very low cost,” Song said.

She attributes the leap to two techniques now baked into frontier models:

  1. Simulated reasoning: breaking complex problems into smaller, testable pieces.
  2. Agentic behavior: letting the model search the web, install tools, and run code autonomously.
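A rough way to picture how the two techniques fit together is a loop that splits a goal into steps and then runs a tool for each one. The Python sketch below is a deliberately simplified stand-in: `plan`, the two tools, and the file names are all hypothetical, and no real model or CyberGym code is involved.

```python
from typing import Callable


# Hypothetical tools. A real agent would search the web, install tools, or run code.
def list_inputs(state: dict) -> str:
    state["inputs"] = ["a.png", "b.png"]
    return "found 2 inputs"


def run_target(state: dict) -> str:
    # Pretend the second input crashes the program under test.
    state["crashes"] = [name for name in state.get("inputs", []) if name == "b.png"]
    return f"{len(state['crashes'])} crash(es)"


TOOLS: dict[str, Callable[[dict], str]] = {
    "list_inputs": list_inputs,
    "run_target": run_target,
}


def plan(goal: str) -> list[str]:
    """Stand-in for the model breaking a goal into small, testable steps."""
    return ["list_inputs", "run_target"]


def run_agent(goal: str, max_steps: int = 10) -> tuple[dict, list[str]]:
    """Run each planned step with its tool and record what the tool reported."""
    state: dict = {}
    log: list[str] = []
    for step in plan(goal)[:max_steps]:
        observation = TOOLS[step](state)      # act without human input
        log.append(f"{step}: {observation}")  # keep the observation for later steps
    return state, log


state, log = run_agent("find a crashing input")
print(log)
```

Here the plan is fixed in advance to keep the example short; an actual agent would typically revise its next step based on each observation.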

The Offense Escalates

The same talents that help defenders also arm attackers. Models can now write exploit scripts, chain together weaknesses, and operate hands-free inside target networks.

“AI can generate actions on a computer and generate code, and those are two things that hackers do,” Herbert-Voss noted. “If those capabilities accelerate, that means offensive security actions will also accelerate.”

Song agrees: “The cyber security capabilities of frontier models have increased drastically in the last few months. This is an inflection point.”

Proposed Shields

Song argues the community needs fresh countermeasures, not just faster patching. She outlines two near-term options:

  • Pre-release model sharing – Give vetted security researchers early access so they can find and fix holes before wide release.
  • Secure-by-design code – Use AI to generate software that is provably safer than average human-written code. Her lab has already demonstrated the approach.

“In the long run we think this secure-by-design approach will really help defenders,” she said.

Key Takeaways

  • Sybil’s GraphQL discovery is one concrete example of AI crossing a reasoning threshold in vulnerability hunting.
  • Benchmark data shows a 10-percentage-point jump in bug-finding success, from roughly 20 to 30 percent, in about three months.
  • Cheap, autonomous AI attackers are no longer theoretical; they are measurable on public benchmarks.
  • Without systemic changes such as early model sharing and secure code generation, defenders risk falling permanently behind.

This account is drawn from Caleb R. Anderson’s AI Lab newsletter for News Of Fort Worth.

Author

  • My name is Caleb R. Anderson, and I’m a Fort Worth–based journalist covering local news and breaking stories that matter most to our community.

    Caleb R. Anderson is a Senior Correspondent at News of Fort Worth, covering city government, urban development, and housing across Tarrant County. A former state accountability reporter, he’s known for deeply sourced stories that show how policy decisions shape everyday life in Fort Worth neighborhoods.
