Written in a personal capacity. Views are my own, not those of any employer.

I’ve been thinking a lot about Claude Mythos and Project Glasswing this week. Anthropic announced a new AI model that is too dangerous to release publicly: it can autonomously discover and exploit high-severity, multi-step vulnerability chains in nearly every major operating system, browser, and other critical software, some of which had gone undetected for over 27 years. Over 50 organizations[1] are now working to remediate what it found, and it’s a massive headache for security engineers across the industry. Jerome Powell and Treasury Secretary Scott Bessent met with the CEOs of most major banks this week to discuss the risks.

Mythos is dangerous, but it’s also the solution. Released into the wild, it would let adversaries compromise systems across computing infrastructure as rapidly and indiscriminately as COVID spread through the world in 2020, or worse. Applied carefully, in controlled doses, it works as a form of inoculation against itself.

The immunity analogy isn’t new. Consumers already call self-replicating malware a “virus,” and many strains are polymorphic, rewriting their own code on the fly to evade detection. Plenty of older machines still on the internet can’t speak modern encryption protocols at all, making HTTPS and TLS impossible. Zero-days turn up constantly in every system we depend on, and holding the line takes a steady stream of patches at every level of the software and firmware stack. Miss those updates and a machine quickly falls to known exploits, many of them public and trivial to weaponize.

The risk used to be bounded. Businesses, hospitals, and governments ran on paper and filing cabinets; a desktop computer was a novelty in many offices. Today a serious breach can take down a nation’s power grid or even launch weapons. A single exploited vulnerability could take control of a commercial airliner in flight, or a fleet of self-driving cars. The peaceful transfer of power in organizations and governments often amounts to little more than the peaceful transfer of user credentials, or of nuclear launch codes. Nearly every significant financial transaction, in traditional banking and in cryptocurrency, happens digitally; it isn’t hard to imagine one nation emptying another’s treasury by hacking. Food systems, energy systems, financial systems, transportation systems: all of it rides on a backbone of highly imperfect, human-made computers.

All of this has me thinking of computing as something like an early biological system. I don’t mean self-replication; I mean competition and adaptation pressure: an advanced species has arrived among isolated protobacteria with limited defenses. Systems exposed to advanced AI in careful measure can develop a form of immunity, though today that is a manual, time-intensive effort run by humans. Meanwhile AI, itself nothing but a computer program, keeps advancing, while traditional software needs an endless drip of patches just to hold its defenses where they are.

Where does this go? What happens when a less responsible organization releases a system with these capabilities into the wild, without regard for the collateral damage? These systems cannot yet hurt us directly[2] the way a predator can, but we rely on computers for survival at scale in ways we never have before. How do we handle an evolutionary arms race unfolding inside our own infrastructure? What are the long-term effects?

When I hit an overlap like this, I look for what the biological sciences already know. I’m not an infectious disease expert, so I’d genuinely love to hear from people in the medical and life sciences communities.

Take vaccination. To eradicate a disease you typically don’t need to vaccinate every individual. Based on the parameters of the disease, mainly its transmissibility (R₀), you can calculate what percentage of the population has to be immune before the disease dies out on its own. That threshold is herd immunity.

As systems like Mythos inoculate software against the threats posed by advanced AI, I keep wondering whether concepts like herd immunity could tell us where to spend remediation effort. The mapping isn’t clean. Software doesn’t come in well-defined individuals the way animals do, and I’m not sure what the unit of “vaccination” even is: a library, a host, a protocol? But most of these attacks chain several vulnerabilities together, which changes the question. Which links appear in the most chains? Patch the choke points that many chains share and you may get the epidemiological effect: enough of the population is immune that an outbreak can’t sustain itself. Do we need to patch every bug? How would we know when we’ve done enough?
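As an added illustration (not part of the original essay), here is the chain choke-point idea above and the 1 − 1/R₀ threshold from the vaccination paragraph applied to a small constructed set of attack chains. The link names are hypothetical placeholders rather than real CVEs, the R₀ value is assumed, and treating the fraction of broken chains as the analogue of the immune fraction is the example's own assumption, not a claim the essay makes.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Constructed sample, not real findings: each chain lists the weaknesses an
# attacker must link end to end, so patching any one link breaks that chain.
# The link names are hypothetical placeholders, not CVE identifiers.
SAMPLE_CHAINS: Dict[str, List[str]] = {
    "chain-A": ["libparse-overflow", "sandbox-escape", "kernel-priv-esc"],
    "chain-B": ["tls-downgrade", "sandbox-escape", "kernel-priv-esc"],
    "chain-C": ["libparse-overflow", "sandbox-escape", "driver-uaf"],
    "chain-D": ["tls-downgrade", "driver-uaf"],
    "chain-E": ["bootloader-unsigned"],
}

def link_frequency(chains: Dict[str, List[str]]) -> Counter:
    """Count in how many distinct chains each link (weakness) appears."""
    counts: Counter = Counter()
    for links in chains.values():
        counts.update(set(links))
    return counts

def herd_immunity_threshold(r0: float) -> float:
    """Classic epidemiological threshold: a fraction 1 - 1/R0 of the population
    must be immune for an outbreak to die out (0 when R0 <= 1)."""
    return 0.0 if r0 <= 1.0 else 1.0 - 1.0 / r0

def greedy_patch_plan(chains: Dict[str, List[str]],
                      target_fraction: float = 1.0) -> List[Tuple[str, int, float]]:
    """Greedy set cover over chains: repeatedly patch the link that breaks the
    most still-intact chains, stopping once target_fraction of all chains is
    broken. Returns (link, chains_broken_this_step, cumulative_fraction)."""
    intact = {name: set(links) for name, links in chains.items() if links}
    total = len(intact)
    plan: List[Tuple[str, int, float]] = []
    broken = 0
    while intact and broken / total < target_fraction:
        counts = Counter(link for links in intact.values() for link in links)
        # Highest count wins; ties resolve alphabetically so runs are repeatable.
        link, hits = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        intact = {n: ls for n, ls in intact.items() if link not in ls}
        broken += hits
        plan.append((link, hits, broken / total))
    return plan

if __name__ == "__main__":
    print(link_frequency(SAMPLE_CHAINS).most_common())
    # Stop at the analogue of the herd-immunity threshold for an assumed R0 of 2.
    print(greedy_patch_plan(SAMPLE_CHAINS, herd_immunity_threshold(2.0)))
    print(greedy_patch_plan(SAMPLE_CHAINS))
```

Greedy set cover is only an approximation of the optimal patch set, and real remediation cost per link is not uniform, so the plan is a prioritisation aid rather than a stopping rule.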

And Mythos will certainly not be the last model of its kind. Future models will be far more capable. We’re bound to find vulnerabilities buried even deeper, and perhaps someday flaws in the mathematics that underpins modern cryptography itself. Like biological evolution, the competition between machine and machine will only accelerate. I’m not sure we’re ready for that.


  1. 12 launch partners including AWS, Apple, Google, Microsoft, CrowdStrike, Cisco, JPMorgan Chase, and Palo Alto Networks, plus over 40 additional organizations that build or maintain critical software infrastructure.

  2. I almost didn’t include this line because many weapons are nearly or fully autonomous, and also Waymo exists, which could definitely become a lethal weapon.