Let’s run through a hypothetical situation real quick: Say you’re an AI company that has made safety your calling card, and you’re negotiating the use of your technology with the military, which has threatened to punish your business if you don’t abandon your principles. You’d like to maintain your position as the safety-conscious company in the AI space, a stance that has earned you significant goodwill with the general public as you resist government pressure. Is now a good time to announce that you’re rolling back some of your safety protocols and tell the Pentagon that you’re cool with AI launching missiles in certain circumstances?
Anthropic seems to think it is. On Tuesday, the company announced that it was updating its Responsible Scaling Policy, a framework it first introduced in 2023 with the goal of mitigating catastrophic risks associated with AI systems. The company has held the policy up as a differentiator between it and its competitors, a promise that it puts safety first, even at the risk of potentially falling behind other frontier models that exercise less caution.
Previously, Anthropic’s RSP stated, “We will not train or deploy models capable of causing catastrophic harm unless we have implemented safety and security measures that will keep risks below acceptable levels.” Now, the company suggests that commitment may not be worth keeping if it means losing ground. “We felt that it wouldn’t actually help anyone for us to stop training AI models,” Jared Kaplan, Anthropic’s chief science officer, told TIME. “We didn’t really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”
Anthropic does credit its original RSP for incentivizing it to develop stronger safeguards for its models, but has essentially said that because other companies haven’t adopted similar restraints, it needs flexibility that hard red lines don’t offer. “The Responsible Scaling Policy was always planned to be a living document: a policy that had the flexibility to change as AI models become more capable,” the company said in a blog post. Anthropic said it will continue to publish risk reports, but will pursue “nonbinding but publicly-declared” safety goals rather than firm internal standards. A generous reading of that would be a commitment to public accountability. A less charitable read might be that the company knows there is no way for the public to actually enforce these standards, so why bother restraining itself?
Anthropic told the Wall Street Journal that the change to its RSP is unrelated to its ongoing negotiations with the Pentagon, which just yesterday gave the company an ultimatum: loosen its safety guardrails so that the military can use its AI models as it sees fit, or face consequences. But it’s hard not to read the change in that light.
Anthropic has maintained two primary red lines on the use of its technology for military operations: it will not allow its models to be used for mass domestic surveillance or to develop fully autonomous weapons that would operate without human involvement. Defense Secretary Pete Hegseth seems unwilling to accept that, and has threatened to cancel Anthropic’s government contracts, declare Anthropic a “supply chain risk,” and/or invoke the Defense Production Act to force the company to build a model for the military’s desired purposes.
But it appears the company has already been negotiating carveouts that don’t quite cross the red line. On Wednesday, Semafor reported that the Pentagon asked Anthropic in December whether it would allow its model to be used to autonomously launch missiles to shoot down other missiles. Anthropic reportedly told the Pentagon to check back before moving forward with such a use case—though, per Semafor, Anthropic was and remains willing to create a missile defense carveout in its policies.
It’s possible, maybe even likely, that Anthropic was always going to loosen the restrictions it has placed on itself. It’s also possible that change was always going to come this week, regardless of the standoff with the Defense Department over AI safeguards. But given the position Anthropic finds itself in, it does become difficult not to view the situation as the company starting to compromise on its principles.
Gizmodo reached out to Anthropic for more information, but the company did not offer comment prior to publication.