(CSA Images/Getty Images)

Anthropic’s new Claude AI can control your computer, and sometimes it just does whatever it wants to

The company is defending its choice to release the tool to the public before fully understanding how it could be misused.

Jon Keegan

10/22/24 4:28PM

Today generative-AI company Anthropic released an upgraded version of its Claude 3.5 Sonnet model, alongside a new model, Claude 3.5 Haiku.

The surprising new feature of Sonnet is the ability to control your computer — taking and reading screenshots, moving your mouse, clicking on buttons in web pages and typing text. The company is rolling this out as a “public beta” release and admits it is experimental and “at times cumbersome and error-prone,” according to the post announcing the new release.

In a blog post discussing the reasons for developing the feature and what safeguards the company is putting in place, Anthropic said:

“A vast amount of modern work happens via computers. Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren’t possible for the current generation of AI assistants.”

Last week Anthropic’s CEO and cofounder Dario Amodei published a 14,000-word optimistic manifesto on how powerful AI might solve many of the world’s problems by rapidly accelerating scientific discovery, eliminating most diseases, and enabling world peace.

The ability for computers to control themselves is hardly new, but the way Sonnet is implemented is novel. A common example of automated computer control today might be a programmer writing code to control a web browser to scrape content. But Sonnet does not require any code. The user opens the windows of apps or web pages and writes instructions for what the AI agent should do; the agent then analyzes the screen and figures out which elements to interact with to carry out those instructions.
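To make that loop concrete, here is a toy sketch in Python of the observe, decide, act cycle described above. It is not Anthropic's code or API: `FakeScreen`, `decide`, and `run_agent` are hypothetical names invented for illustration, the "screenshot" is a plain dictionary rather than pixels, and a hard-coded rule stands in for the model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # label of the on-screen element to act on
    text: str = ""     # text to enter for a "type" action


@dataclass
class FakeScreen:
    """Toy stand-in for a real desktop: one search box and one button."""
    fields: dict = field(default_factory=lambda: {"search box": ""})
    clicked: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would capture pixels; here the "screenshot" is just state.
        return {"fields": dict(self.fields), "clicked": list(self.clicked)}

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click":
            self.clicked.append(action.target)


def decide(instruction: str, observation: dict) -> Action:
    """Hypothetical stand-in for the model: picks the next step from the screen."""
    if observation["fields"]["search box"] != instruction:
        return Action("type", "search box", instruction)
    if "search button" not in observation["clicked"]:
        return Action("click", "search button")
    return Action("done")


def run_agent(instruction: str, screen: FakeScreen, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until done or out of steps."""
    history = []
    for _ in range(max_steps):
        action = decide(instruction, screen.screenshot())
        if action.kind == "done":
            break
        screen.apply(action)
        history.append(action.kind)
    return history


screen = FakeScreen()
steps = run_agent("Yellowstone photos", screen)
print(steps)  # ['type', 'click']
```

The `max_steps` cap is a design choice of this sketch, not a documented feature: it simply keeps a confused agent from looping forever.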

If turning an experimental AI agent loose on an internet-connected computer sounds dangerous, Anthropic kind of agrees with you. The company said, “For safety reasons we did not allow the model to access the internet during training,” but the beta version allows the agent to access the internet.

Anthropic recently updated its “Responsible Scaling Policy,” which lays out specific risk thresholds and determines how its tools are tested and released. Under this framework, Anthropic gave the upgraded Sonnet a self-assigned grade of “AI Safety Level 2,” which the company describes as showing “early signs of dangerous capabilities” but still considers safe enough to release to the public.

The company is defending its choice to release such a tool to the public before fully understanding how it could be misused, saying it would rather find out what kinds of bad things might happen at this stage than when the model has more dangerous capabilities. “We can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks,” the company wrote.

The potential for misuse of consumer-focused AI tools like Claude is not merely hypothetical. Recently OpenAI released a list of 20 incidents in which state-connected bad actors had used ChatGPT to plan cyberattacks, probe vulnerable infrastructure, and design influence campaigns. And with the US presidential election just two weeks away, Anthropic is aware of the potential for abuse.

“Given the upcoming US elections, we’re on high alert for attempted misuses that could be perceived as undermining public trust in electoral processes,” the company wrote. In the GitHub repository with demo code, the company cautions users that Claude’s computer-use feature “poses unique risks that are distinct from standard API features or chat interfaces. These risks are heightened when using computer use to interact with the internet.” Anthropic also warned, “In some circumstances, Claude will follow commands found in content even if it conflicts with the user’s instructions.”

To protect against any election-related meddling via Sonnet’s new capabilities, Anthropic said it has “put in place measures to monitor when Claude is asked to engage in election-related activity, as well as systems for nudging Claude away from activities like generating and posting content on social media, registering web domains, or interacting with government websites.”

Anthropic said it will not use any screenshots captured while the tool is in use for future model training. But the new technology still appears to surprise its own creators with “amusing” behavior. Anthropic said that at one point in testing, Claude stopped the screen recording, losing all the footage. In a post on X, Anthropic shared footage of Claude’s unexpected behavior, writing “Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park.”

Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost.

Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park. pic.twitter.com/r6Lrx6XPxZ
— Anthropic (@AnthropicAI) October 22, 2024

Chris Stokel-Walker

Bot bias

6/17/26

Companies are getting AI chatbots to smear their competitors

The race to influence AI chatbots is leading some companies to adopt shady competitive tactics.

Tom Jones

6/17/26

Prediction markets have, predictably, been given a boost by the summer of sports

Major platforms like Kalshi and Polymarket have seen huge upticks in users of late, thanks in no small part to what’s felt like a recent sporting smorgasbord, with major competitions across hockey, basketball, and soccer soaking up fans’ time (and spending, clearly) at the outset of summer.

While gaming industry groups may not like it, there’s been a huge change in the methods people are using to put money on the big games, with everyone from fortunate NYC bar owners, to a far less fortunate Spanish supporter, turning to prediction markets to try and turn their sports know-how into cold, hard cash.

According to a new report from Adam Blacker for Apptopia, that shift might have been even more seismic than imagined in the wake of the NBA and NHL finals and with the 2026 World Cup kicking off.

2026 World Cup Reverses Seasonal Lull for Sports Betting Apps


South by Southwest Conference and Festivals

Gold Tesla Cybercabs are piling up, but they’re not picking up passengers yet

Low-volume production started in April. Now people are noticing them more and more in the wild.

Rani Molla

6/15/26

Millie Giles

TAKING ACCOUNTS

6/15/26

Britain announces social media ban for under-16s starting early 2027

The UK government plans to use the same model for the restrictions as Australia — but how successful has that case study been so far?

UK Prime Minister Announces Under-16s Social Media Ban

Jon Keegan

6/15/26

Anthropic pulls Fable and Mythos access worldwide after Trump administration bars their use by foreign nationals

Only days after releasing two versions of its next-gen AI model, Anthropic has disabled them for users worldwide.

Anthropic says it received a Friday night order from the Trump administration to suspend access to the models for any foreign national (anywhere in the world), a group that included some Anthropic employees. In response, the company turned off access for everyone.

Last week, the company released to the public its much-anticipated Claude Fable 5 model (and its restricted version Claude Mythos 5, which is still being tested with trusted partners). Anthropic said in a blog post announcing the action that officials cited national security concerns with the new models, while offering few specific details.

The post said that the government gave the company “verbal evidence of a potential narrow, non-universal jailbreak” of the public Fable 5 model. A jailbreak is a means by which users can evade restrictions built into the code to unlock prohibited functionality. Anthropic downplayed the significance of the attack, and said other major models, such as OpenAI’s GPT-5.5, could also be affected by the technique described.

Fears that these first Mythos-class models could be misused are running high. In May, Anthropic warned the cybersecurity world that Mythos’s advanced cyber capabilities had rapidly discovered thousands of vulnerabilities in ubiquitous software, which led to the decision to restrict the full version of the model to a close group of trusted partners for testing.

This morning, Axios reported that Anthropic technical staff have flown to Washington to meet with White House officials to resolve the issue.

The Wall Street Journal is reporting that the Trump administration’s decision to take action against Anthropic was prompted by discussions that Amazon CEO Andy Jassy had with officials, including Treasury Secretary Scott Bessent. According to the report, Amazon researchers said they had been able to evade some of Fable 5’s security restrictions using specific prompts. Amazon is a major investor in Anthropic.

Anthropic is currently suing the US government to fight the Pentagon’s blacklisting of the company on national security grounds.

Statement on the US government directive to suspend access to Fable 5 and Mythos 5