The Alarm Already Went Off

March 30, 2026

When Anthropic's most powerful model leaked last week, my first reaction wasn't excitement or fear. It was practical curiosity: how much better is it, and does that change what I can build next week?

I build with Claude every day. It writes code, reasons through architecture decisions, manages files across my system. So when I saw the Mythos leak, I wasn't processing a headline. I was recalculating. Most people I know either ignored the news entirely or shared it without reading past the first paragraph. I get both reactions. But they're missing what actually matters here, and it isn't the model itself.

what actually dropped

On March 26, a configuration error in Anthropic's content management system exposed roughly 3,000 unpublished assets, including draft blog posts describing a new model called Claude Mythos. Anthropic confirmed the leak to Fortune, calling Mythos "a step change" in capabilities and "the most capable we've built to date."

The leaked materials describe Mythos as the first model in a new tier called Capybara, sitting above Opus, which until now was Anthropic's most powerful offering. Anthropic confirmed that a small group of customers has early access, and that the rollout is deliberately cautious.

That's what we know. No benchmark tables, no architecture details, no launch date. Just the company that built it telling us it represents a step change, and then slowing down its own release because of what it can do.

The slowdown is the story.

the safety system that already activated

Every few months, an AI company announces a new model and calls it a breakthrough. Most of the time, the improvements are real but incremental. You notice them if you use these tools daily. You don't if you check in once a year.

This one stands out, and not because of the marketing language. What makes it different is what Anthropic had already done before the leak.

Anthropic has something called a Responsible Scaling Policy: a framework with defined safety levels (ASL-1 through ASL-4+) and capability thresholds that can force the company to pause its own work. When a model hits certain benchmarks in areas like biological weapons development or autonomous behavior, the policy requires upgraded safeguards before deployment. If the safeguards aren't ready, training stops.
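To make the mechanism concrete, here is a toy sketch of that kind of threshold-gated release logic. The evaluation names, levels, and structure are my invention for illustration; this is not Anthropic's actual criteria or code, just the shape of the rule the policy describes.

```python
from dataclasses import dataclass

# Toy illustration of threshold-gated deployment, loosely following the policy
# described above. Evaluation names and structure are invented, not Anthropic's.

@dataclass
class EvalResult:
    name: str                # e.g. a bio-uplift or autonomy evaluation (hypothetical)
    crossed_threshold: bool  # did the model clearly cross the capability threshold?
    can_rule_out: bool       # can crossing be confidently ruled out?

def required_level(results: list[EvalResult]) -> str:
    """Return the safeguard tier required before deployment can proceed."""
    for r in results:
        if r.crossed_threshold or not r.can_rule_out:
            # Crossed, or can't be ruled out: the stricter tier applies as a precaution.
            return "ASL-3"
    return "ASL-2"

def may_proceed(required: str, safeguards_in_place: set[str]) -> bool:
    """Deployment (and further scaling) waits until the required safeguards exist."""
    return required in safeguards_in_place

# Example: scores are rising and crossing can no longer be ruled out.
results = [EvalResult("virology-uplift", crossed_threshold=False, can_rule_out=False)]
if not may_proceed(required_level(results), safeguards_in_place={"ASL-2"}):
    print("pause: upgrade safeguards before deploying")
```

The point of the sketch is the second branch: the gate trips not only when a threshold is crossed, but when crossing can no longer be ruled out. That distinction is exactly what happened next.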

When Anthropic released Claude Opus 4 last year, something shifted. Every previous model, they could confidently classify below their threshold for catastrophic risk. With Opus 4, they couldn't. The model's scores on evaluations like the Virology Capabilities Test had been "steadily increasing over time," and Anthropic could no longer rule out that it had crossed the threshold. So they activated ASL-3 protections as a precaution, deploying under stricter security constraints than anything they'd released before.

You can argue that Anthropic is unusually cautious, that their thresholds fire earlier than other labs'. But even that is information: the cautious lab is now unsure its own models stay below catastrophic-risk thresholds.

Mythos sits above Opus 4. The leaked draft and Anthropic's confirmation describe it as a step change beyond the model that already triggered their safety protocols. We don't have benchmark tables to know exactly how much more capable it is, and "step change" could mean many things. But the sequence matters: safety system built, safety system activated for Opus 4, and now a model described as meaningfully beyond that.

Anthropic isn't the only company at the frontier. OpenAI, Google, and others are on similar trajectories with different disclosure styles. But Anthropic is the one that built a formal, public mechanism to constrain itself, and that mechanism has already engaged. When the builders hit their own brakes, it's worth understanding why.

what the CEO is saying about where this goes

Dario Amodei, Anthropic's CEO, has been unusually direct. In a February 2026 interview with Dwarkesh Patel, he said he puts 90% probability on reaching what he calls "a country of geniuses in a data center" within ten years: systems smarter than any Nobel Prize winner, running millions of instances, thinking 10 to 100 times faster than humans. He called it "absolutely wild" that the public hasn't registered how close we are.

In his essay "The Adolescence of Technology," he went further. AI is already writing much of the code at Anthropic, he wrote, and this feedback loop "may be only 1-2 years away from a point where the current generation of AI autonomously builds the next." He described some of the strongest engineers he's ever met "handing over almost all their coding to AI." Three years ago, these models could barely write a single line of code.

Amodei has mixed incentives here. Talking about safety and risk also positions Anthropic as the responsible company, which helps with regulators, enterprise customers, and recruiting. But even accounting for that, his prediction to Axios last May was striking: AI could eliminate up to half of entry-level white-collar positions, pushing unemployment to 10-20% within one to five years. He acknowledged in the same interview that cancer could be cured and the economy could grow at 10% annually. Both outcomes, simultaneously. That's not hype. That's a specific, falsifiable prediction from someone who would know.

what's actually changing right now

There's an independent research organization called METR that measures how long AI models can work autonomously on real-world tasks: the length of task, measured by how long it takes a human expert, that a model can complete about half the time. They've tracked this for six years.

In 2019, the answer was seconds. By early 2025, Claude 3.7 Sonnet could handle tasks that take a human expert about 55 minutes. By late 2025, leading models hit four to five hours. The most recent measurement, from early 2026, puts the frontier at roughly 14.5 hours. Nearly two full workdays.

The doubling time on this metric is approximately 3 to 7 months. If that trend roughly holds, and it has for six years, you'd expect multi-day autonomous work within a year or two. That's a projection, not a certainty. But it's grounded in more data than most predictions in this space.
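As a back-of-the-envelope check on that projection (my arithmetic, not METR's model), here is what a constant doubling rate implies, starting from the 14.5-hour figure above. The two definitions of "multi-day" are my own assumptions.

```python
from math import log2

# Back-of-the-envelope extrapolation, assuming the task horizon keeps doubling
# at a constant rate from the ~14.5-hour figure cited above. The doubling range
# and the "multi-day" targets are assumptions for illustration, not METR data.

def months_until(target_hours: float, current_hours: float, doubling_months: float) -> float:
    """Months until the horizon reaches target_hours under steady doubling."""
    return doubling_months * log2(target_hours / current_hours)

CURRENT = 14.5  # hours of human-expert time, early 2026

for label, hours in [("three workdays (24h)", 24.0), ("a work week (40h)", 40.0)]:
    for doubling in (3.0, 7.0):  # the 3-7 month range quoted above
        m = months_until(hours, CURRENT, doubling)
        print(f"{label}, doubling every {doubling:.0f} months: ~{m:.0f} months away")
```

Even at the slow end of the range, a work-week horizon lands within about a year under these assumptions, which is why "within a year or two" reads as a conservative framing rather than an aggressive one.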

Those are lab-style measurements. On the ground, the same thing shows up more mundanely.

I built a complete macOS application: native SwiftUI interface, Supabase backend, real-time device coordination, the full thing. I have no formal programming training. I'm an economics major who learned to build software by working with AI. Claude didn't just help me write code; it architected systems I wouldn't have known how to design. Two years ago, that kind of project would have required a team of experienced developers. Now a single person with the right tools can ship it.

The evidence from inside companies tells a similar story, though it's earlier and messier than the predictions suggest. Goldman Sachs found that firms discussing AI's impact on jobs cut their job openings by 12% in 2025, compared with 8% for all companies. McKinsey reports that 88% of organizations now use AI in at least one business function. A BCG survey found that employees at companies deeply restructuring workflows around AI are significantly more worried about job security (46%) than those at companies using it superficially (34%).

These are early signals, not a macro collapse. Brookings notes that large-scale net displacement hasn't hit the aggregate data yet, even as tasks and roles are shifting underneath. Companies are reshaping work, quietly pulling back on hiring in exposed roles, and the people closest to the deployments are the most worried. The direction is clear. The path from capability to consequence is not a straight line.

Models still make confident mistakes. Deploying AI inside a company with compliance requirements, legacy systems, and risk-averse leadership is slow, expensive work. The gap between what a model can do in a demo and what an organization can actually deploy at scale is real, and it buys time. The question is whether people use that time to adapt or to pretend the gap will last forever.

what this means for you

I'm not going to tell you to learn to code or to panic about your job. Both of those recommendations are lazy.

What I will say is this: the single biggest information asymmetry in the economy right now is between people who use these tools seriously and people who don't. Not people who have heard of ChatGPT versus people who haven't. Everyone has heard of it. The gap is between people who pushed the current models into real work and people who tried the free version in 2023 and decided it wasn't impressive.

The models available today bear little resemblance to what existed a year ago. The people who know that know it viscerally, because they've watched the tools they use daily get dramatically better every few months. Everyone else is evaluating based on an experience that no longer applies.

What closing that gap looks like depends on your work. If you write for a living, try drafting with the current models and see how much of your process changes. If you analyze data, feed it a real dataset and a real question. If you manage people, use it to synthesize the reports you never have time to read. The point isn't to replace what you do. It's to understand, from direct experience, what these tools are actually capable of right now, so you can make informed decisions about what comes next.

The people building the most powerful AI in history created a formal system to slow themselves down. That system activated. And the model that triggered it is not the one that leaked last week. The one that leaked is what comes after.