Anthropic’s Claude is getting a new constitution. On Wednesday, the company announced that the document, which provides a “detailed description of Anthropic’s vision for Claude’s values and behavior,” is being rewritten to introduce broad principles that the company expects its chatbot to follow, rather than the more stringent set of rules it relied on in past iterations.
Anthropic’s logic for the change seems sound enough. While specific rules create more reliable and predictable behavior from chatbots, they’re also limiting. “We think that in order to be good actors in the world, AI models like Claude need to understand why we want them to behave in certain ways, and we need to explain this to them rather than merely specify what we want them to do,” the company explained. “If we want models to exercise good judgment across a wide range of novel situations, they need to be able to generalize—to apply broad principles rather than mechanically following specific rules.”
Fair enough—though the overview of the new constitution does feel like it leaves a lot to be desired in terms of specifics. Anthropic’s four guiding principles for Claude include making sure its underlying models are “broadly safe,” “broadly ethical,” “compliant with Anthropic’s guidelines,” and “genuinely helpful.” Those are…well, broad principles. The company does say that much of the constitution is dedicated to explaining these principles, and it does offer some more detail (e.g., being ethical means “being honest, acting according to good values, and avoiding actions that are inappropriate, dangerous, or harmful”), but even that feels pretty generic.
The company also said that it dedicated a section of the constitution to Claude’s nature because of “our uncertainty about whether Claude might have some kind of consciousness or moral status (either now or in the future).” Anthropic is apparently hoping that by defining this within its foundational documents, it can protect “Claude’s psychological security, sense of self, and well-being.”
The change to Claude’s constitution, and its seeming embrace of the idea that the chatbot may one day have an independent consciousness, comes just a day after Anthropic co-founder and CEO Dario Amodei spoke on a World Economic Forum panel titled “The Day After AGI” and suggested that AI will achieve “Nobel laureate” levels of skill across many fields by 2027.
This peeling back of the curtain on how Claude works (or is supposed to work) is happening on Anthropic’s own terms. The last time we got a glimpse behind it came courtesy of a user who managed to prompt the chatbot into producing what it called a “soul document.” That document, which surfaced in December, was not an official training document, Anthropic told Gizmodo, but an early iteration of the constitution that the company referred to internally as its “soul.” Anthropic also said its plan was always to publish the full constitution once it was ready.
Whether Claude is ready to operate without the bumpers up is a whole other question, but it seems we’re going to find out the answer one way or another.

