EU vs. US LLMs: Data Sovereignty & the CLOUD Act (2025)

You have decided to put a large language model into production. The harder question only arrives now: where does the model run, who can access your data, and which law governs if things go wrong? This is where many mid-market AI projects fail: not on the technology, but on an unresolved sovereignty question. A data center in Frankfurt sounds safe. Whether it actually is depends on something quite different from the flag on the provider’s website.

This guide separates marketing from law. It shows why US models carry a real third-country risk, why an EU data center alone is no firewall, which European and local alternatives genuinely exist today, and the logic by which you should decide. It is written from a dual perspective: the legal obligation first, then the technical implementation.

What does data-sovereign AI mean?

Data-sovereign AI means that you retain full control over where your data is processed, who can access it, and which law applies to the processing. Concretely: your data resides in a legal jurisdiction you trust, no third party can access it outside your control, and no foreign law can compel the provider to hand it over covertly. Sovereignty is therefore not a property of the model, but of the entire processing chain: location, operator, contract, and configuration combined.

That is the core idea of this article: Compliance arises from contract, configuration, location, and control over the provider — not from the brand name or the flag on the data center.

Why US LLMs are a sovereignty risk

US models such as GPT, Claude, or Gemini are technically excellent. Their problem is not quality, but the jurisdiction of their operator. Three levers interlock.

The CLOUD Act: access regardless of storage location

The U.S. CLOUD Act (2018) obligates providers subject to US jurisdiction to hand over data in their possession, custody, or control when ordered to do so by US authorities, irrespective of where in the world the data is stored. A US corporation holding data in a German data center is subject to such an order just as it would be in the United States. The physical location of the hard drive therefore offers no protection; what matters is which jurisdiction the operator is subject to.

Third-country transfer and Schrems II

As soon as personal data flows to a US provider, Chapter V of the GDPR applies (Art. 44–50, transfers to third countries). In its Schrems II ruling (C-311/18, July 16, 2020), the CJEU struck down the then-applicable “Privacy Shield” because US surveillance laws (in particular FISA Section 702) offered no protection equivalent to that of the EU. Since then, every transfer to the US requires justification.

This is the legal heart of the matter. Under Art. 48 GDPR, a judgment or order from a third country’s court or authority that requires the disclosure of personal data may be recognized or enforced only if it is based on an international agreement, such as a mutual legal assistance treaty. A direct CLOUD Act order does not meet that condition. In its joint response with the EDPS (July 2019), the European Data Protection Board (EDPB) made clear that such disclosure is GDPR-compliant only in narrowly limited exceptional cases.

The result: a US provider that receives a CLOUD Act order faces a genuine dilemma — it either breaks US law or it breaks the GDPR. No contract and no EU data center can fully resolve this conflict.

Does an EU data center protect against the CLOUD Act?

No — an EU data center alone does not protect against the CLOUD Act. What matters is not where the data resides, but which jurisdiction the operator is subject to. If a US corporation (or its EU subsidiary controlled by the US parent) operates the data center in Frankfurt, it remains exposed to the CLOUD Act. The flag on the building changes nothing about that.

A real firewall arises only when the operator is subject exclusively to European law and is not controlled by a US parent company — that is, with a “sovereign cloud” in the strict sense, or by running everything on your own premises (on-premise).
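
The test described here can be reduced to a small check. The following is a minimal sketch of that reasoning, not a legal test: the `Operator` fields and the function name `cloud_act_exposed` are hypothetical, and real corporate control structures need case-by-case legal review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Operator:
    """Hypothetical description of who runs the infrastructure."""
    name: str
    jurisdiction: str                          # country code of the operating entity
    parent_jurisdiction: Optional[str] = None  # country code of a controlling parent, if any
    datacenter_country: str = "DE"             # where the servers physically stand


def cloud_act_exposed(op: Operator) -> bool:
    """Simplified rule from the text: exposure follows the operator or its
    controlling parent. The data center location is deliberately ignored."""
    return "US" in (op.jurisdiction, op.parent_jurisdiction)


us_subsidiary = Operator("EU subsidiary of a US provider", "DE", parent_jurisdiction="US")
eu_operator = Operator("EU-only operator", "DE")

print(cloud_act_exposed(us_subsidiary))  # True, although the servers stand in Germany
print(cloud_act_exposed(eu_operator))    # False
```

Note that `datacenter_country` never enters the decision. That is the point of the sketch: both operators run servers in Germany, and only the ownership chain separates them.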

Is the EU-US Data Privacy Framework a safe basis?

The EU-US Data Privacy Framework (DPF) has been in force since the European Commission’s adequacy decision of July 10, 2023. For certified US companies, it enables data transfers without additional safeguards. It is a valid but shaky legal basis — and it expressly does not resolve the CLOUD Act exposure.

As of December 2025, the situation is as follows: the action for annulment brought by Member of Parliament Philippe Latombe was dismissed by the General Court of the EU on September 3, 2025 — so the DPF remains in force for the time being. However, an appeal to the CJEU was lodged in late October 2025 (Case C-703/25 P), which is still pending. In parallel, many data protection advocates see a “Schrems III” on the horizon. An additional concern: the US oversight body PCLOB, which is meant to monitor compliance with the DPF, has effectively lost its quorum since several members were removed in early 2025.

Practical consequence: anyone relying on the DPF as the sole basis for highly sensitive data is building on a foundation with real residual risk and a real chance that the legal regime changes. For non-critical data it can hold up; for trade secrets and special categories of data, it should not be the only pillar.

Note: The DPF/Schrems status is volatile. Re-check the current state before making any far-reaching decision.

The four sovereignty tiers compared

In practice there is no “US vs. EU,” but a spectrum of four tiers. Each has its legitimate place — depending on how much protection your data needs.

Tier	Data control	CLOUD Act residual risk	Typical cost	Effort / latency	Suitable for
1. US cloud API (DPA + Zero Data Retention)	low	high	low (pay-per-token)	low, immediate	public/non-critical data
2. EU region of a US provider (Azure EU, AWS Frankfurt)	medium	medium–high (US parent)	low–medium	low	moderately sensitive data, with a DPA
3. Sovereign cloud (EU operator, EU law)	high	low	medium	medium	personal & confidential data
4. On-premise / local	full	none	high (CapEx)	high, self-maintained	trade secrets, special categories

The table reveals the real trade-off: the higher the sovereignty, the higher the effort and cost — but the lower the residual legal risk. The art lies in not forcing everything onto a single tier, but in deliberately mapping data classes to tiers (see the decision guide below).

[Figure: Four sovereignty tiers for LLM deployment, from the US cloud API through EU regions and sovereign cloud to on-premise, with rising data control and falling CLOUD Act residual risk]

The sovereignty spectrum: with each tier, control over location, access, and applicable law rises — and the CLOUD Act residual risk falls. The right choice maps each data class to the matching tier, rather than squeezing the whole company into one.

Which LLMs are hosted in the EU or are European?

The European model landscape has caught up considerably of late. The most important options:

Mistral AI (France) — powerful models, several of them under an open license (including Apache 2.0) and therefore self-hostable; alongside this, an EU-operated API.
Aleph Alpha (Germany) — the Pharia model family is optimized for German, French, and Spanish and is delivered explicitly for on-premise and sovereign-cloud operation (weights + inference runtime).
Teuken-7B (OpenGPT-X) — an open-source research model trained in 24 official EU languages (Fraunhofer and others), freely available via Hugging Face. The funding project ended in March 2025; the model remains usable.
DeepL, nele.ai, Black Forest Labs and other European specialist providers for specific tasks (translation, assistance, image).
EU-hosted US models via Azure EU, AWS Frankfurt, or Google Cloud DE — powerful, but with the CLOUD Act caveat from Tier 2: an EU region ≠ EU sovereignty as long as the operator is subject to US jurisdiction.

Honestly assessed: the strongest US models still lead on some tasks. For most business applications, however, the gap has become smaller than the marketing suggests — and for many use cases (summarization, classification, enterprise RAG over your own documents) European or local models are entirely sufficient.

On-premise / local LLM — when is it worth it?

What is on-premise AI?

On-premise AI means the model runs on your own hardware within your own network, so the data never leaves the building. With respect to data transfers, this makes the variant compliant by design: there is no third-country transfer, no external operator, and no CLOUD Act exposure. From a legal standpoint, this is the cleanest solution, because the risk is not merely safeguarded but structurally excluded.

What does a local LLM cost?

Honestly: it depends heavily on the desired model. A 7B-to-14B model runs usably on a single server GPU with sufficient VRAM (on the order of a low five-figure acquisition cost plus power and maintenance). Models in the 70B class and above need several high-end GPUs and quickly move into the high five- to six-figure range. On top of that come ongoing costs for operation, updates, and staff. There are no blanket guarantees here — the serious answer is sizing based on model size, number of users, and response-time requirements.
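
A first sizing pass can be done with simple arithmetic: the memory for the weights is roughly the parameter count times the bytes per parameter. The sketch below applies that rule. The function name and the `overhead` factor of 1.2 (for KV cache and runtime buffers) are illustrative assumptions, not measured values. Actual needs grow with context length and the number of concurrent users.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def estimate_vram_gb(params_billion: float, precision: str = "fp16",
                     overhead: float = 1.2) -> float:
    """Rough VRAM need in GB: weight size times an assumed overhead factor.

    overhead=1.2 is an illustrative assumption, not a benchmark result.
    """
    # 1e9 parameters times bytes per parameter, divided by 1e9 bytes per GB
    weights_gb = params_billion * BYTES_PER_PARAM[precision]
    return round(weights_gb * overhead, 1)


for size in (7, 14, 70):
    print(f"{size}B: {estimate_vram_gb(size, 'fp16')} GB at fp16, "
          f"{estimate_vram_gb(size, 'int4')} GB at int4")
```

Under these assumptions a 7B model at fp16 comes to about 16.8 GB and a 70B model to about 168 GB, which matches the split described above between a single server GPU and several high-end GPUs. Quantization to 4 bits cuts the weight memory to a quarter, at some cost in output quality.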

What local LLMs (still) cannot do

On-premise is no cure-all. The honest limits: open models do not reach top performance in every discipline, you bear responsibility for maintenance, security, and updates yourself, and scaling to many concurrent users costs hardware. Anyone who forces everything on-prem simply “because sovereign” may end up paying for control they do not actually need.

The decision guide: which model for which case

Instead of a one-size-fits-all answer, here is the logic I follow in my advisory work. Four axes determine the tier:

Data class — Is the data public/non-critical, personal, or a trade secret or special category (Art. 9 GDPR)?
Use case — Brainstorming and drafting text vs. processing real customer/patient/client data.
Latency & availability — Is an API enough, or must it run offline / with guaranteed availability?
Budget — CapEx-capable (own hardware) or OpEx-oriented (usage-based)?

This almost always leads to a hybrid approach: non-critical tasks run cost-effectively on a (properly configured) cloud API, while business-critical data stays on-prem or in a sovereign cloud. The bridge between them is a data sluice — an upstream gateway that detects personal or confidential content and anonymizes it or prevents it from leaving in the first place. This way you gain model quality without giving up sovereignty.

Rule of thumb: the closer the data sits to the core business and to real people, the further the choice shifts toward Tiers 3–4. Public content can happily stay on Tiers 1–2.
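
The four axes and the rule of thumb can be written down as a small decision function. This is a minimal sketch under stated assumptions: the names `DataClass` and `recommend_tier` are hypothetical, and the fallback to Tier 3 for trade secrets without a hardware budget is one possible reading of the logic above, not a fixed rule.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"      # public or non-critical data
    PERSONAL = "personal"  # personal data
    SECRET = "secret"      # trade secrets, special categories (Art. 9 GDPR)


def recommend_tier(data_class: DataClass, real_subject_data: bool,
                   needs_offline: bool, capex_available: bool) -> int:
    """Return a sovereignty tier (1 to 4) following the rule of thumb."""
    if needs_offline:
        return 4  # offline operation implies local hardware
    if data_class is DataClass.SECRET:
        # Assumption: without a hardware budget, fall back to a sovereign cloud.
        return 4 if capex_available else 3
    if data_class is DataClass.PERSONAL or real_subject_data:
        return 3  # real customer, patient, or client data
    return 1      # public content; Tier 2 is equally acceptable here


print(recommend_tier(DataClass.PUBLIC, False, False, False))   # 1
print(recommend_tier(DataClass.PERSONAL, True, False, False))  # 3
print(recommend_tier(DataClass.SECRET, True, False, True))     # 4
```

In a hybrid setup, a function like this would run once per data class, not once per company, so that each class lands on its own tier.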

This architectural decision (the tier mix, the data sluice, the configuration) is exactly where legal obligation and technical implementation meet. At this strategic crossroads, I both advise and build.

Sovereignty is also AI Act governance

In 2026, data sovereignty is no longer just a GDPR question. The AI Regulation (EU AI Act) requires operators (deployers) of AI systems to ensure, among other things, data governance, transparency, and traceability. The timeline is real and already running: since 2 August 2025, the obligations for providers of general-purpose AI (GPAI) models have applied — evidence that the regulation is being phased into force in concrete steps rather than remaining abstract. Anyone who does not control where their data is processed and who can access it can hardly meet these governance obligations robustly. Data sovereignty thereby becomes a building block of AI Act compliance — not a separate topic. More on this in the overview of the EU AI Act for businesses.

FAQ

Can US LLMs be used in a GDPR-compliant way?

With limitations, yes. For non-critical data, US models can be used lawfully via a clean data processing agreement (DPA), a zero-data-retention configuration, and, where applicable, the DPF. The CLOUD Act residual risk nevertheless remains. For trade secrets and special categories of data, a more sovereign tier is therefore preferable.

Is Azure OpenAI GDPR-compliant?

Azure OpenAI can be configured in a privacy-friendly way via the EU Data Boundary and a DPA; by default, prompts are stored for up to 30 days for abuse detection. Eligible enterprise customers can disable this storage via the zero-data-retention program. Because Microsoft is subject to US jurisdiction, however, the CLOUD Act exposure (Tier 2) remains: an EU region is not full sovereignty.

Do I need a data protection impact assessment (DPIA)?

Often, yes. For AI systems that process personal data at scale or with high risk, Art. 35 GDPR requires a DPIA. Third-country transfer and automated evaluation are classic triggers. When in doubt, it is better to document a brief threshold assessment than to have to demonstrate one after the fact.

What is a data sluice or AI gateway?

An upstream system that filters all inputs to an LLM: it detects personal or confidential content, anonymizes or blocks it, and logs the data flow. This makes it possible to use strong (even external) models without sensitive raw data leaving the building — the technical bridge between model quality and sovereignty.
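
The three steps named here (detect, anonymize or block, log) can be sketched in a few lines. This is a deliberately simplified illustration: the function name `sluice`, the two regular expressions, and the block markers are hypothetical, and pattern matching alone will miss names, addresses, and free-text identifiers that a production gateway would have to catch.

```python
import logging
import re
from typing import Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("data_sluice")

# Deliberately simple illustrative patterns; real detection needs far more.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),  # compact form only
}
BLOCK_MARKERS = ("CONFIDENTIAL", "TRADE SECRET")  # hypothetical document labels


def sluice(prompt: str) -> Optional[str]:
    """Return a redacted prompt, or None if it must not leave the network."""
    upper = prompt.upper()
    if any(marker in upper for marker in BLOCK_MARKERS):
        log.info("blocked prompt: confidential marker found")
        return None
    redacted = prompt
    for label, pattern in PATTERNS.items():
        # Replace every match with a placeholder and count the replacements.
        redacted, count = pattern.subn(f"[{label}]", redacted)
        if count:
            log.info("redacted %d %s value(s)", count, label)
    return redacted


print(sluice("Mail anna.schmidt@example.com, IBAN DE89370400440532013000"))
print(sluice("CONFIDENTIAL: merger plan"))
```

The first call prints the prompt with `[EMAIL]` and `[IBAN]` placeholders, the second prints `None`. Only the redacted string would be forwarded to an external model; the log records what was removed without storing the values themselves.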

What does vendor lock-in mean in AI?

Dependence on a provider whose model, format, or pricing you can only switch away from at great effort. Sovereignty also encompasses this dimension: open models and portable architectures significantly reduce the lock-in risk.

Status and context

As of December 2025. The DPF/Schrems status (the pending CJEU proceedings in Case C-703/25 P) and the AI Act deadlines are volatile; this article is updated in the review cycle. This presentation is not a substitute for legal advice in an individual case — the right sovereignty tier depends on your specific data flows, use cases, and risk appetite.

If you do not want to leave this strategic decision to chance: I make the sovereignty and architecture decision together with you, as a business lawyer who then also builds the solution. You can find more about my background on the “About me” page.

This is general information, not legal advice.

Sources (as of December 15, 2025)

European Commission — Adequacy decision on EU-US data flows (July 10, 2023): https://germany.representation.ec.europa.eu/news/datenverkehr-zwischen-der-eu-und-den-usa-europaische-kommission-erlasst-neuen-2023-07-10_de
Solidaris / PUBLICUS — General Court dismisses the Latombe action (Sept. 3, 2025), DPF remains in force: https://www.solidaris.de/aktuelles/angemessenheitsbeschluss-fuer-die-usa-haelt-der-ueberpruefung-des-eug-stand
datenschutzticker.de — Appeal against the DPF at the CJEU (Case C-703/25 P), dispute back before the CJEU: https://www.datenschutzticker.de/2025/11/streit-ueber-us-angemessenheitsbeschluss-erneut-vor-eugh/
next-levels.de — EU-US Data Privacy Framework: status 2026 (PCLOB quorum): https://next-levels.de/wiki/eu-us-data-privacy-framework
jentis / noyb — Schrems III in preparation: https://www.jentis.com/blog/noyb-will-challenge-the-new-data-privacy-framework
EDPB — Opinion on Art. 48 GDPR & the US CLOUD Act (CMS summary): https://cms.law/de/deu/legal-updates/Positionierung-des-EDSA-zum-CLOUD-Act-Datenuebermittlungen-an-US-Ermittlungsbehoerden-nur-in-engen-Grenzen
datenschutzticker.de — EDPB: data transfer to third-country authorities under Art. 48 GDPR (06/2025): https://www.datenschutzticker.de/2025/06/edsa-datentransfer-an-drittstaatenbehoerden-nach-art-48-dsgvo/
Looming Tech — CLOUD Act / AWS EU Region & GDPR (“an EU region is no firewall”): https://www.looming.tech/post/cloud-act-aws-eu-region-gdpr
kiteworks — EU Data Act, GDPR & US CLOUD Act conflict: https://www.kiteworks.com/gdpr-compliance/eu-data-act-gdpr-cloud-conflict/
Fraunhofer IAIS — OpenGPT-X / Teuken-7B (24 EU languages, open source): https://www.iais.fraunhofer.de/en/industries_and_cross-sector_solutions/cross-sector_solutions/generative-ai/opengpt-x.html
Aleph Alpha Docs — Pharia-1-LLM-7B (on-premise delivery, DE/FR/ES): https://docs.aleph-alpha.com/products/pharia-1-llm/overview/
Meetily — Azure OpenAI Data Retention Policy 2026 (ZDR, 30-day abuse monitoring, region pinning): https://meetily.ai/llm-privacy/azure
CJEU — Schrems II (C-311/18, July 16, 2020): https://curia.europa.eu/juris/liste.jsf?num=C-311/18
Baker McKenzie — GPAI obligations under the EU AI Act apply from Aug. 2, 2025: https://www.bakermckenzie.com/en/insight/publications/2025/08/general-purpose-ai-obligations
