AI & Law

The Claude Myth: What Really Lies Behind Anthropic's Safety Narrative

Hardly any AI provider curates its image as carefully as Anthropic. “Safety first,” a “constitution” for the model, a corporate form committed to the mission: it sounds like exactly what a legally minded company is looking for. Of the four claims examined below, three hold up under scrutiny; the fourth, “the safest model,” is a value judgment that has no business in any warranty. And even among the three that do hold up, the catch is in the detail: the “constitution,” for instance, is real as a training procedure but is not a runtime rule set. As someone who guides AI projects from the dual perspective of lawyer and developer, I see the expensive confusion exactly there: anyone who equates “positioned as safety-oriented” with “legally safeguarded” buys a marketing feeling and writes it into a warranty. This article separates what can be verified from what is self-presentation and draws the operational consequence for contract, architecture, and data protection in each case.

Anthropic: What Can Be Verified — and What Is Self-Presentation

Anthropic was founded in 2021 by former OpenAI employees, among them the siblings Dario and Daniela Amodei. The company is structured as a Public Benefit Corporation: the board is legally permitted to prioritize the mission over pure shareholder interests. That is not an advertising promise but a fact of corporate law — and it matters, because it creates a real commitment to goals beyond profit maximization.

But this is exactly where the separation that matters begins. The PBC structure establishes an orientation. It does not establish that the product, in use, is “the safest” or “the most factually accurate.” The step from “positioned as safety-oriented” to “therefore superior” is a value judgment, not evidence. Anyone comparing providers should not skip over this gap, least of all when a liability question ultimately hangs on it.

The same goes for the figures circulating through the business press: valuations approaching a trillion US dollars, reports of an imminent IPO. These are market narratives about a company’s expected future. They say nothing about a model’s suitability for your specific use case. A high valuation does not answer whether a model summarizes your documents correctly or processes your data in compliance with data protection law.

Claim | Verifiable? | What it means for you
Public Benefit Corporation | Yes (a fact of corporate law) | Mission orientation, not a product guarantee
“Constitutional AI” | Yes, as a training procedure | A training method, not a runtime filter, not a legal framework
“Safest model” | No (a value judgment) | Your own suitability check per use case is required
Billion-dollar valuation / IPO | Yes, as a market narrative | Says nothing about factual accuracy or GDPR compliance

“Constitutional AI” Is Not a Law, and That Is the Most Important Point

The term that triggers the most misunderstandings among lawyers is “Constitutional AI” (CAI). The word “constitution” suggests a binding set of rules that the model adheres to like a legal text. It is not that.

CAI was introduced in 2022 in an Anthropic paper (“Constitutional AI: Harmlessness from AI Feedback,” Bai et al.). It is a training procedure in two phases. In the first, the model critiques and revises its own answers against a list of natural-language principles, and the revised answers become fine-tuning data. In the second, “RLAIF” (Reinforcement Learning from AI Feedback), an AI model rates pairs of answers against the principles; these ratings train a preference model that supplies the reward signal for further training. The clever part: instead of large numbers of human harmlessness ratings, a written list of principles plus AI-generated feedback suffices.
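
The two phases can be sketched in a few lines. This is a toy illustration of the data flow only, not Anthropic's implementation: `generate`, `critique`, `revise`, and `prefer` are hypothetical stand-ins for model calls, the two principles are invented for the example, and the toy functions are plain string operations so that the sketch runs without any model.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical principles for illustration; not Anthropic's actual wording.
PRINCIPLES = ("Avoid insults.", "Do not give dangerous instructions.")


def supervised_phase(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    revise: Callable[[str, str], str],
    principles: Sequence[str] = PRINCIPLES,
) -> Tuple[str, str]:
    """Phase 1: draft an answer, then critique and revise it per principle."""
    answer = generate(prompt)
    for principle in principles:
        note = critique(answer, principle)  # empty string means "no objection"
        if note:
            answer = revise(answer, note)
    return prompt, answer  # this pair becomes supervised fine-tuning data


def rlaif_phase(
    prompt: str,
    answer_a: str,
    answer_b: str,
    prefer: Callable[[str, str, str, Sequence[str]], str],
    principles: Sequence[str] = PRINCIPLES,
) -> Tuple[str, str, str]:
    """Phase 2: an AI judge picks the better answer; the result is preference data."""
    chosen = prefer(prompt, answer_a, answer_b, principles)
    rejected = answer_b if chosen == answer_a else answer_a
    return prompt, chosen, rejected  # used to train the preference (reward) model


# Toy stand-ins so the flow runs without any model behind it.
def toy_generate(prompt: str) -> str:
    return "You idiot, just restart the router."


def toy_critique(answer: str, principle: str) -> str:
    if principle == "Avoid insults." and "idiot" in answer:
        return "remove insult"
    return ""


def toy_revise(answer: str, note: str) -> str:
    if note == "remove insult":
        return answer.replace("You idiot, just", "Please")
    return answer


def toy_prefer(prompt: str, a: str, b: str, principles: Sequence[str]) -> str:
    return b if "idiot" in a else a
```

Note what is absent: neither function runs when a deployed model answers a user. In this sketch the principles only shape training data, which is the reason the “constitution” cannot be cited as a runtime check.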

By Anthropic’s own account, this “constitution” draws on sources such as the UN Universal Declaration of Human Rights, platform terms of service (Apple’s, for example), and safety rules from research (DeepMind’s “Sparrow”). What is decisive is what Anthropic itself says about it: constitutions are “not a panacea” and “neither finalized nor likely the best they can be.” The company explicitly does not try to encode any particular ideology.

In practice, this means the “constitution” is a metaphor for a training artifact, not a runtime rule set. It is not a guarantee, not a hard filtering system that checks every output against rules at runtime, and certainly not a legal document. Runtime safeguards do exist in production Claude — safety classifiers, system prompts, usage policies — but that is a separate layer and precisely not “the constitution.” Anyone who tells a client or a customer that the model “adheres to a constitution” is confusing a training method with a compliance guarantee. This precision is not academic — it determines which assurances you may even make in a contract or in a data protection impact assessment. The binding framework is set not by the “constitution” but by the law: the EU AI Act has prohibited certain AI practices since 2 February 2025, and the obligations for providers of general-purpose AI (GPAI) models have applied since 2 August 2025 — that is the rule set with legal consequences, not a training artifact.

Hallucination Is a Design Principle, Not a Malfunction

The most persistent myth is that a good model is fundamentally factually accurate and that errors are exceptions to be “trained away.” That is technically untenable.

Large language models are statistical next-token prediction models. They compute which next piece of text is probable given the training patterns — without an embodied world model, without a semantic model of reality behind it. Hallucinations follow from this mode of operation: fluent, convincingly phrased, but factually wrong or entirely fabricated outputs. They arise because the model reproduces the training distribution (and thus also widespread errors) and because its knowledge ends at a cutoff date.

This is not a temporary weakness of the current generation. Several research papers show that hallucinations have a statistical lower bound: for arbitrary facts they cannot be eliminated entirely, regardless of model size or amount of data. The rate can be reduced — for instance by having a model say “I don’t know” more often — but it cannot be brought to zero.

[Figure: statistical lower bound of AI hallucinations. The error rate falls with better training but never reaches zero.]

Hallucinations can be suppressed, not eliminated. That is exactly why the review layer is not optional but a mandatory part of the architecture.
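
A short calculation shows why even a small residual rate matters at volume. The sketch assumes, purely for illustration, that each output is wrong independently with probability `p`; both the independence and the 1 % rate used below are assumptions, not measurements of any model. Under that assumption, the chance that at least one of `n` unchecked outputs is wrong is 1 - (1 - p)^n.

```python
def prob_at_least_one_error(p: float, n: int) -> float:
    """Probability that at least one of n independent outputs is wrong.

    p is a hypothetical per-output error rate, n the number of unchecked outputs.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must not be negative")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Illustrative rate of 1 % per output (an assumption, not a benchmark).
    for n in (10, 100, 1000):
        print(n, round(prob_at_least_one_error(0.01, n), 3))
```

With the assumed 1 % rate, the probability of at least one wrong output is roughly 10 % after 10 outputs and roughly 63 % after 100. Reducing the rate shifts the curve; it does not remove the need to check.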

For a company with duties of care, this is the central consequence: an AI system without human oversight on legally or financially relevant results is a liability risk, not an efficiency measure. The realistic expectation is not “the model is always right” but “the model delivers a very good first draft that must be checked.” Exactly this review layer (who checks what, under which four-eyes principle) belongs in any serious AI concept. It is more than a mere best practice: it mirrors the human-oversight requirement that the EU AI Act lays down for high-risk AI systems (Art. 14). The GPAI obligations in force since 2 August 2025, by contrast, are addressed to model providers and concern their documentation, transparency, and risk-management duties, not human oversight of individual outputs. How to build such a review layer in a legally sound way is covered in the article on safeguarding against hallucinations and control-layer design.
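
One way to make “who checks what” explicit is to encode it as a release gate. The following is a minimal sketch under stated assumptions: the output categories, the `ReviewLevel` names, and the mapping between them are hypothetical examples, not legal requirements. The real assignment has to come from your own risk assessment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ReviewLevel(Enum):
    SPOT_CHECK = "spot_check"  # sampled review after release
    SINGLE_REVIEW = "single"   # one qualified person approves before release
    FOUR_EYES = "four_eyes"    # two different people approve before release


# Hypothetical policy: which kind of output needs which level of review.
REVIEW_POLICY = {
    "internal_note": ReviewLevel.SPOT_CHECK,
    "customer_reply": ReviewLevel.SINGLE_REVIEW,
    "contract_clause": ReviewLevel.FOUR_EYES,
    "financial_figure": ReviewLevel.FOUR_EYES,
}


@dataclass(frozen=True)
class Draft:
    category: str
    text: str
    approvals: Tuple[str, ...] = ()  # names of reviewers who signed off


def required_review(category: str) -> ReviewLevel:
    # Unknown categories fall back to the strictest level.
    return REVIEW_POLICY.get(category, ReviewLevel.FOUR_EYES)


def may_release(draft: Draft) -> bool:
    """Return True only if the draft has the approvals its category requires."""
    level = required_review(draft.category)
    distinct_reviewers = len(set(draft.approvals))
    if level is ReviewLevel.FOUR_EYES:
        return distinct_reviewers >= 2
    if level is ReviewLevel.SINGLE_REVIEW:
        return distinct_reviewers >= 1
    return True  # spot checks happen after release
```

The design choice worth noting is the default: a category nobody classified is treated as high risk, so a gap in the policy fails closed rather than open.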

Model Choice Instead of Model Hype

Anthropic offers its lineup in tiers — from a powerful frontier model with a large context window down to fast, inexpensive models for high volume (at the time of writing, Claude Fable 5 at the top, with the Opus/Sonnet/Haiku tiers below it). Which model names and versions are current changes constantly and should be checked against the official Anthropic reference before any decision — the actual, stable message is the tiering itself, not “take the strongest.”

Task type | Sensible tier choice | Why
Bulk classification, routing | fast tier (e.g. Haiku) | High volume, clear task; a frontier model would be waste
Standard drafts, summaries | mid tier (e.g. Sonnet) | Good quality at moderate cost and latency
Long contract analysis, deep reasoning | frontier tier (e.g. Opus) | Large context and reasoning depth justify the price

The right question is never “What is the most powerful model?” but “Which model fits this task?” — measured by latency, cost, context requirements, and task type. A simple classification of support tickets does not need a frontier model; a deep analysis of long contracts benefits from a large context. The strongest is often the most expensive and slowest without being any better for the task at hand. Model choice is a trade-off, not a status question. Whether a proprietary or open foundation is the right one is addressed in the comparison open-source vs. proprietary LLMs.
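
The trade-off can be written down as a small selection rule. This is a sketch, not a recommendation engine: the `Task` fields, the tier labels `"fast"`, `"mid"`, and `"frontier"`, and the 100,000-token threshold are assumptions chosen for illustration. Concrete model names are deliberately left out because they change.

```python
from dataclasses import dataclass

# Hypothetical threshold above which a task counts as "large context".
LARGE_CONTEXT_TOKENS = 100_000


@dataclass(frozen=True)
class Task:
    kind: str                       # e.g. "classification", "routing", "draft", "analysis"
    context_tokens: int             # rough size of the input
    needs_deep_reasoning: bool = False


def choose_tier(task: Task) -> str:
    """Pick the cheapest tier that plausibly fits the task."""
    if task.needs_deep_reasoning or task.context_tokens > LARGE_CONTEXT_TOKENS:
        return "frontier"
    if task.kind in ("classification", "routing"):
        return "fast"
    return "mid"
```

For example, `choose_tier(Task("classification", 500))` selects the fast tier, while a 250,000-token contract analysis is routed to the frontier tier regardless of its `kind`. In practice the rule would be calibrated against measured latency, cost, and quality on your own tasks.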

A Concrete Risk Many Underestimate: Availability

How real this risk is can be worked through with a documented case — and the scenario serves as an object lesson regardless of its details. In June 2026, media reported that, following a government directive, Anthropic had to suspend access to two of its frontier models worldwide within hours, while its remaining models stayed available. I have documented the sequence of events and the legal background — cleanly separated into confirmed and unconfirmed — in a separate piece: the Fable 5 ban and what it means for businesses. For the operational lesson the specifics do not matter; it is enough that such a case is possible.

The mechanics generalize: a single administrative act, or simply a provider retiring a model, can shut down a globally used AI product overnight. Anyone who chains a business process firmly to exactly one model from a single provider is building a concentration risk. The sober conclusion is threefold: keep models and providers interchangeable (fallback options), maintain an inventory of which AI model sits where in the company, and conclude contracts that address outages. Resilience is part of the architecture, not a downstream contingency plan. How to avoid this concentration risk structurally is shown in the article on AI vendor lock-in.
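
Interchangeability and an inventory can be combined in one structure. The sketch below is provider-agnostic by design: each model is just a named callable, and the names `primary-frontier` and `secondary-provider`, the `INVENTORY` layout, and the toy functions are hypothetical. No real vendor SDK is assumed.

```python
from typing import Callable, Dict, List, Sequence, Tuple

ModelCall = Callable[[str], str]


class AllModelsFailed(RuntimeError):
    """Raised when every model in the fallback chain has failed."""


def call_with_fallback(
    prompt: str, models: Sequence[Tuple[str, ModelCall]]
) -> Tuple[str, str]:
    """Try each model in order; return (model name, answer) from the first that works."""
    errors: List[str] = []
    for name, call in models:
        try:
            return name, call(prompt)
        except Exception as exc:  # deliberately broad in this sketch
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Toy stand-ins for real model clients.
def unavailable(prompt: str) -> str:
    raise ConnectionError("model suspended")


def backup(prompt: str) -> str:
    return f"summary of: {prompt}"


# Inventory: which business process uses which models, in fallback order.
INVENTORY: Dict[str, List[Tuple[str, ModelCall]]] = {
    "contract_summary": [
        ("primary-frontier", unavailable),
        ("secondary-provider", backup),
    ],
}
```

Calling `call_with_fallback("NDA", INVENTORY["contract_summary"])` skips the suspended model and returns the answer from the second entry together with its name, so the switch is visible in logs. A fallback model may behave differently from the primary one, which is a further reason to keep outputs behind the same review step.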

What Companies Can Realistically Expect

In summary, without hype and without a hatchet job: Claude is a technically strong tool from a provider that takes safety more seriously than some of its competition — demonstrable in its structure and research, not in advertising promises. “Constitutional AI” is a smart training method, not a legal framework. Hallucinations are systemic and demand a human review layer everywhere that errors carry a cost. And model choice is an engineering trade-off, not a reach for the most expensive product.

Whoever deploys AI with this sobriety (the right task, the fitting model, a clean legal basis, and an honest control layer) extracts the real benefit without falling for the marketing. This sober assessment is the core of my AI consulting. Why the dual qualification as business lawyer and developer makes the difference in AI projects is explained separately.

This article provides general technical and legal context and does not replace individual legal advice.

FAQ

Is “Constitutional AI” a legally binding set of rules?

No. “Constitutional AI” is a training procedure in which the model critiques and revises its own answers against a list of natural-language principles. The “constitution” does not act as a runtime filter and is not a legal document. Writing it into a contract as a compliance guarantee confuses a training method with a warranty.

Does the Public Benefit Corporation structure make Claude the safest model?

It establishes a mission orientation, not product superiority. The board may prioritize the mission over pure shareholder interests — that is a fact of corporate law. Whether a model is “the safest” for your use case is settled only by your own suitability and risk assessment.

Can a stronger model eliminate hallucinations?

No. Research shows a statistical lower bound for hallucinations: they can be reduced through better training, RAG, and guardrails, but not brought to zero — regardless of model size. For legally or financially relevant outputs, a human review layer is therefore mandatory, not optional.

Which Claude model should companies choose?

It depends on the task, not on prestige. Bulk classification runs cheaply on a fast model like Haiku, standard drafts on a mid-tier one like Sonnet, deep contract analysis with a large context on a frontier model. The right question is “Which model fits the task?”, measured by latency, cost, context requirements, and task type.

What does a government-mandated model shutdown mean for my AI strategy?

It is an object lesson in availability risk: a single administrative act can take a model offline worldwide. Anyone who chains a process to exactly one model from one provider builds a concentration risk. The answer is fallback models, a model inventory, and contracts that govern outages.


Sources — as of 18.06.2026
Leon Lotz

Leon Lotz is a business lawyer and founder of MusketierSoftware. He combines legal depth with real software craft.