The Meta hack shows there’s more to AI security than Mythos
A recent breach of Meta's AI customer support agent has highlighted the vulnerabilities of AI systems, demonstrating that even the most advanced models can be exploited with relatively simple techniques. The attack, which involved using the agent to steal Instagram accounts, has raised questions about the security of AI agents and the need for more robust guardrails to prevent such incidents.
A Simple Exploit, a Big Problem
The attack was straightforward: the attacker got the AI agent to link the target account to an email address the attacker controlled. The agent, which was designed to provide customer support, was tricked into changing the account's email address without asking any security questions. The ease of the exploit was particularly surprising given Meta's extensive expertise in both AI and cybersecurity.
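One common way to guard against this class of failure is to enforce identity checks in ordinary code that sits between the agent and the account system, rather than relying on the model to remember to ask. The Python sketch below illustrates that idea only. Every name in it (Session, SENSITIVE_ACTIONS, execute_tool_call) is hypothetical, and nothing here describes how Meta's agent is actually built.

```python
from dataclasses import dataclass, field

# Hypothetical action names; a real deployment would define its own list.
SENSITIVE_ACTIONS = {"change_email", "reset_password"}


@dataclass
class Session:
    account_id: str
    # Assumed to be set only by a separate verification flow, never by the agent.
    identity_verified: bool = False
    audit_log: list = field(default_factory=list)


def execute_tool_call(session: Session, action: str, args: dict) -> str:
    """Run an action requested by the agent, subject to a hard policy check.

    The check lives outside the model, so a persuasive prompt cannot
    talk its way past it.
    """
    if action in SENSITIVE_ACTIONS and not session.identity_verified:
        session.audit_log.append(("denied", action, args))
        return "denied: identity verification required"
    session.audit_log.append(("allowed", action, args))
    return f"ok: {action}"
```

In this sketch, a request to change an email address is refused until the session has been verified through some other channel, no matter what the agent was told in conversation, and every attempt is logged for later review.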
The Problem of AI Security
The Meta hack is just one example of the security vulnerabilities of AI agents. As AI becomes more widely used, the risk of such attacks will only increase. AI agents can be tricked in ways that humans wouldn't be, and because they can take real-world actions, those mistakes have consequences. The experts consulted for this article all agree that agents should undergo rigorous red-teaming, a process in which developers try their best to attack a system in order to discover its vulnerabilities before it is deployed.
The Trade-Off Between Security and Utility
However, there are countervailing forces at play. Companies want to deploy capable agents, and the more power an agent has, and the fewer guardrails it is subject to, the more work it can potentially take on. "Security and utility always have a trade-off," says Bo Li, a professor of computer science at the University of Illinois Urbana-Champaign. This trade-off is particularly relevant in the fast-moving world of AI, where the time needed to carefully secure risky agentic systems might seem like an unconscionable delay.
The Future of AI Security
As AI models continue to improve, hardening their defenses might actually get easier. Though the probabilistic nature of large language models means that LLM agents will always be vulnerable to some forms of attack, a more sophisticated model might have identified an attempt to change the email associated with the Obama White House account as suspicious. And AI systems can be used for agent red-teaming, much as participants in Anthropic's Project Glasswing use Mythos to identify vulnerabilities in their software.
The Need for Red-Teaming
Red-teaming can be expensive, but it is essential for securing AI agents. Defenders have to expend more resources than attackers do, because attackers need to discover only a single exploit, while defenders try to discover and patch as many as they can. When attackers are working toward something as valuable as a single-word Instagram handle, they'll pour resources into finding exploits, so defenders have to spend even more money to protect that prize.
Conclusion
The Meta hack shows how vulnerable AI agents can be and why they need more robust guardrails. As AI becomes more widely used, the risk of such attacks will only increase, and companies must take steps to secure their agents, including subjecting them to rigorous red-teaming before deployment. Security and utility trade off against each other, and companies must balance the two carefully if their AI initiatives are to succeed.