
AI & Law

Privacy in the Prompt: What Really Happens to Your Company Data in ChatGPT & Co.

A single copy-and-paste moment decides whether your company has a data-protection or trade-secret problem: the instant someone pastes a draft contract, an applicant list, or a strategy paper into a chat window, that content leaves your control — transmitted to third-party servers, processed there, logged, and, depending on the plan, reused to train a model. Most of the debate revolves around the wrong question (“Does the AI remember this?”). The right one is: what does the company behind the AI do with what I enter — and on what legal basis?

This article answers that from two perspectives that are usually treated separately: the technical one (where the data actually flows) and the legal one (GDPR, data processing agreement, third-country transfer, trade and professional secrets). The result is not a list of prohibitions but a robust sequence of steps that lets you deploy AI productively and in an audit-proof way.

The Journey of a Prompt — What Technically Happens to Your Input

Before you can answer the legal question, you have to understand where the data actually flows. A prompt typically passes through six stations:

  1. Input — you type or paste text (or a document, an image) into the tool. At this moment, the data has not yet left your control.
  2. Transmission — the input is encrypted (TLS/HTTPS) and sent to the provider. Transport encryption protects against eavesdropping in transit — but not against what the provider does with the data afterward.
  3. Processing/inference — the text is broken into tokens and processed by the model to generate a response.
  4. Logging & retention — the provider generally stores the input for a defined period, for instance to detect abuse or to display your chat history.
  5. Optional model training — depending on the plan, inputs are used to improve the model (more on this shortly).
  6. Storage location/third-country transfer — processing often takes place on servers outside the EU, typically in the United States.

One common misconception is worth clearing up: the model does not “remember” your individual prompt on its own and will not regurgitate it to the next user. The model’s weights are fixed before your input; a single prompt does not change them. The risk lies one layer deeper — with the provider that stores the input, logs it, and, depending on the plan, collects it for a later training run. What matters in data-protection terms is therefore not the model architecture but the processing and retention policy of the company behind it.

[Figure: the six stations of a prompt, from input through transmission, processing, logging, and training to storage in a third country outside the EU.] The actual data-protection risk only arises at stations 4 (logging) and 5 (training), not in the model itself.

Are My Inputs Used to Train the AI Model?

That depends on the plan. On the business tiers of the major providers, inputs are not used for training by default; on the free or purely personal tiers, the opposite is true — there, using inputs for model improvement is the norm, though you can opt out.

Specifically with OpenAI (as of March 2026): data from ChatGPT Enterprise, Team/Business, and the API is, according to the provider, not used for training by default, and you can sign a DPA (Data Processing Agreement — the same instrument as the German AVV, under its English name). With Free and Plus, training on your inputs is the default; you can opt out in the privacy settings.

ChatGPT tier    | DPA available? | Training on inputs (default) | Suitable for company data?
Free / Plus     | no             | yes (opt-out possible)       | no
Team / Business | yes            | no                           | with a policy
Enterprise      | yes            | no                           | yes
API             | yes            | no                           | yes

These values change. Verify them before any tool decision against OpenAI’s current help center or Enterprise Privacy page (sources below). Other providers — Microsoft Copilot, Anthropic Claude, Google Gemini, Mistral (EU) — have their own, differing rules.

The difference between “yes” (Enterprise/API) and “with a policy” (Team/Business) is not about the contract — the data processing agreement/DPA and the training exclusion apply in both cases. It is about control and administration features: Enterprise and the API offer central management, audit tooling, and in part zero data retention, so the data flow can be enforced organization-wide. Team/Business covers the same legal framework but leaves day-to-day enforcement more to an internal usage policy — without one, a gap remains between the contract and actual behavior.
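The table and the yes/with-a-policy distinction can be encoded as a small lookup, for instance as a gate in an internal tool. What follows is a minimal Python sketch, not an OpenAI API: the names TierPolicy, TIERS, and may_process_company_data are hypothetical, and the values are simply copied from the table above (as of March 2026), so they have to be re-checked against the provider's current documentation before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    dpa_available: bool
    trains_on_inputs_by_default: bool
    suitability: str  # "no", "with a policy", or "yes"


# Assumption: values copied from the article's table (as of March 2026).
TIERS = {
    "free": TierPolicy(False, True, "no"),
    "plus": TierPolicy(False, True, "no"),
    "team": TierPolicy(True, False, "with a policy"),
    "business": TierPolicy(True, False, "with a policy"),
    "enterprise": TierPolicy(True, False, "yes"),
    "api": TierPolicy(True, False, "yes"),
}


def may_process_company_data(tier: str, usage_policy_in_place: bool) -> bool:
    """Return True only if the tier is suitable according to the table."""
    policy = TIERS.get(tier.strip().lower())
    if policy is None:
        return False  # unknown tier: deny by default
    if policy.suitability == "yes":
        return True
    if policy.suitability == "with a policy":
        return usage_policy_in_place
    return False
```

Denying unknown tiers by default is a deliberate choice here: a renamed or new plan should force a fresh review rather than silently inherit an old answer.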

Two subtleties that often get lost in practice:

  • “No training” does not mean “no storage.” Even when a provider does not use your inputs for training, it typically retains them for a retention period (e.g., 30 days for abuse detection) and may expose them to staff or sub-processors for review. For especially sensitive data, some providers allow a zero-data-retention configuration — that, not the training toggle alone, is the real lever.
  • Opt-out is not retroactive. If you opt out of training in the Free/Plus settings, it applies from the moment you opt out — inputs that already fed into a training run cannot be recalled. Data minimization up front beats any after-the-fact setting.

Where Does the Data Live — and Does the GDPR Even Apply There?

The GDPR “travels with you”: as soon as you, as a European company, process personal data, it applies regardless of where the server sits. But if the server is outside the EU/EEA — the norm with the major US providers — Chapter V of the GDPR additionally comes into play: the third-country transfer (Art. 44 et seq.).

For transfers to the United States, the EU-U.S. Data Privacy Framework (DPF) is the central basis. The EU Commission recognized it by adequacy decision on July 10, 2023; on September 3, 2025, the General Court of the EU dismissed an action seeking its annulment. As things currently stand, the framework is therefore valid, but it remains under political scrutiny (among other things over concerns about oversight by the US Privacy and Civil Liberties Oversight Board, PCLOB). In practice, this means the US provider must be DPF-certified; otherwise, you need other safeguards (such as standard contractual clauses plus supplementary measures).

A second factor is the US CLOUD Act, which may grant US authorities access to data held by US providers — potentially even data physically located in an EU data center, as long as the provider is subject to US jurisdiction. Exactly how far that access reaches in conflict with the GDPR is legally contested and the subject of ongoing debate; what is undisputed is that an EU region does not fully remove the issue as long as the parent company is subject to US law. This is precisely why genuine EU data-residency options and European models are a risk lever in their own right, not a marketing gimmick — a trade-off I work through in detail in my comparison of EU-hosted and US LLMs.

A Current Special Case: The NYT Retention Order Against OpenAI

In the litigation between The New York Times and OpenAI, a US court ordered in May 2025 that OpenAI must retain chat logs, including ones that would otherwise have been deleted. According to reports, this blanket order was lifted again for new logs from late September 2025; however, data already preserved, as well as data from accounts flagged by the NYT, must still be retained, and OpenAI is appealing. The case illustrates a real tension: a court-mandated retention can collide with the GDPR right to erasure (Art. 17). State of play: this is an ongoing US proceeding, not a permanent feature of the legal situation in Germany, but it is an argument for data minimization.

Not legal advice for your specific case. This article explains the situation in general terms and to the best of our knowledge as of the stated date. It does not replace an individual assessment of your particular circumstances.

Do I Need a Data Processing Agreement? — Processing on Behalf for AI

In short: as a rule, yes. When an AI provider processes personal data on your behalf, it is a processor within the meaning of Art. 28 GDPR — and then a data processing agreement (Auftragsverarbeitungsvertrag, AVV) is mandatory, and it must be in place before the first processing. Without one, the processing is unlawful, no matter how technically secure the tool is.

Important: providers offer the agreement only for their business tiers (with OpenAI, that means Team/Business, Enterprise, API). There is none for Free/Plus — which is one reason these tiers are unsuitable for genuine company data.

The legally sharper question, though, is: controller or processor? If the provider processes solely on your instructions and for your purposes, it is a processor, and the Art. 28 agreement fits. If it pursues its own purposes with the data (product improvement, model training, security analytics at its own discretion), then to that extent it becomes an independent controller or a joint controller (Art. 26), and a processing agreement does not cover precisely that part. This is the most common mistake: a signed DPA suggests safety even though the training or analytics clauses turn the provider into a controller that needs its own legal basis. So read the contract and its annexes, not just the marketing promise.

A practical checklist before the first prompt containing personal data:

  • Data processing agreement/DPA concluded and countersigned (not merely “available”)?
  • List of sub-processors reviewed and approved?
  • Storage location/data residency and third-country safeguard (DPF certification or SCCs) documented?
  • Retention period and deletion concept clarified (keyword: zero data retention)?
  • Training/analytics use contractually excluded — and thus no covert controller role?
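The checklist above can also be kept as a small data structure so that open items are visible before a tool is released for personal data. This is a minimal Python sketch under stated assumptions: PrePromptChecklist, open_items, and ready_for_personal_data are hypothetical names, and a True value only records that someone has checked the point, it does not replace the legal assessment itself.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PrePromptChecklist:
    # One flag per checklist item; everything starts as "not done".
    dpa_countersigned: bool = False
    sub_processors_approved: bool = False
    transfer_safeguard_documented: bool = False
    retention_and_deletion_clarified: bool = False
    training_and_analytics_excluded: bool = False


def open_items(checklist: PrePromptChecklist) -> List[str]:
    """Return the names of all items that are still unchecked."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def ready_for_personal_data(checklist: PrePromptChecklist) -> bool:
    """Only ready when every single item has been confirmed."""
    return not open_items(checklist)
```

A call such as open_items(PrePromptChecklist(dpa_countersigned=True)) lists the four remaining points, which makes the difference between an agreement that is merely "available" and a completed review explicit.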

Shadow AI — The Real Risk

Shadow AI refers to the use of AI tools by employees without the organization’s approval and knowledge — for example, when someone pastes a confidential draft contract into the free personal version of ChatGPT “just to quickly summarize it.” It usually arises not from bad intent but from a gap: the company provides no official, approved tool, so people reach for what they know from private use.

The risk is concrete and multilayered:

  • Data-protection breach: third parties’ personal data ends up with a provider without a legal basis and without a data processing agreement.
  • Loss of trade-secret protection: legal protection under the German Trade Secrets Act (Geschäftsgeheimnisgesetz, GeschGehG, which implements the EU Trade Secrets Directive) requires reasonable confidentiality measures. Anyone entering secrets into a public tool can lose that protection.
  • Breach of professional confidentiality: in law firms, tax advisory practices, or medical practices, entering client or patient data can implicate sec. 203 StGB (German Criminal Code; breach of private secrets).

The scale is documented: in the 2024 Work Trend Index by Microsoft and LinkedIn (31,000 respondents across 31 countries), 78% of employees who use AI at work reported bringing their own, unsanctioned tools (“Bring Your Own AI”) — rising to 80% at small and medium-sized companies. The exact figure varies by survey and definition, but the order of magnitude is consistently high. The takeaway: a ban alone creates no security; it merely creates shadow AI. Security comes from an approved tool plus clear rules. How to rein this in concretely is covered in my dedicated piece on shadow AI in the company.

Data You Should Never Enter Into a Public AI

As a reliable reflex — the following content does not belong, unfiltered, in a public, unvetted AI tool:

  1. Third parties’ personal data without a legal basis (customer names, applicant data, employee data).
  2. Specially protected data under sec. 203 StGB — client, patient, and counseling data.
  3. Trade secrets — strategy papers, costings, source code, unprotected inventions.
  4. Credentials — passwords, API keys, tokens.
  5. Unredacted contracts and documents containing identifiable individuals.
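Part of this list can be caught mechanically before a prompt is sent. The following Python sketch screens text for a few obvious patterns; the function name screen_prompt and the patterns themselves are illustrative assumptions, not a complete detector. It recognizes simple email addresses, credential assignments such as "api_key = ...", and private-key headers, and it will miss names, trade secrets, and anything phrased differently.

```python
import re

# Assumption: deliberately simple, illustrative patterns. Real data-loss
# prevention needs far broader coverage and human review.
PATTERNS = {
    "email address": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "credential assignment": re.compile(
        r"(?i)\b(?:password|api[_-]?key|token|secret)\b\s*[:=]\s*\S+"
    ),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def screen_prompt(text: str) -> list:
    """Return the labels of all patterns found in the text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]
```

An empty result means only that none of these patterns matched, not that the text is safe to send; the screen is a backstop for the list above, not a substitute for it.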

The mandatory reflex beforehand is called anonymization or pseudonymization: remove names, file reference numbers, and identifying details, or replace them with placeholders (e.g., “Person A”, “Company X”, “Date D”), before anything goes into a tool — unless you are using a business tool with a data processing agreement and training switched off. Two pitfalls here: first, pseudonymization is no free pass — pseudonymized data remains personal data as long as the mapping table exists. Second, a mosaic of individually harmless details (industry + region + revenue band + project name) can make a person or company re-identifiable. Anonymization in the legal sense is reached only when re-identification is practically impossible — a higher bar than merely redacting the name.
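The placeholder approach and its first pitfall can be shown in a few lines. The Python sketch below is a minimal illustration, assuming you already have a list of the names to replace; pseudonymize and reidentify are hypothetical helper names. It swaps known names for "Person A", "Person B", and so on, and returns the mapping table. It does nothing about file numbers, dates, or the mosaic effect described above.

```python
import re
from typing import Dict, Iterable, Tuple


def pseudonymize(
    text: str, names: Iterable[str], prefix: str = "Person"
) -> Tuple[str, Dict[str, str]]:
    """Replace each known name with a placeholder and return the mapping."""
    unique = list(dict.fromkeys(names))  # keep order, drop duplicates
    if len(unique) > 26:
        raise ValueError("this sketch only supports up to 26 names")
    mapping = {name: f"{prefix} {chr(ord('A') + i)}" for i, name in enumerate(unique)}
    # Replace longer names first so a full name wins over a contained shorter one.
    for name in sorted(unique, key=len, reverse=True):
        placeholder = mapping[name]
        pattern = rf"\b{re.escape(name)}\b"
        text = re.sub(pattern, lambda _match, p=placeholder: p, text)
    return text, mapping


def reidentify(text: str, mapping: Dict[str, str]) -> str:
    """Reverse the replacement using the mapping table."""
    for name, placeholder in mapping.items():
        text = text.replace(placeholder, name)
    return text
```

The existence of reidentify is the point: as long as the mapping is stored anywhere, the masked text can be turned back into the original, which is why it remains personal data. The mapping therefore belongs inside your own systems and never in the prompt.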

How to Use AI in Your Company in a GDPR-Compliant Way

The GDPR and productive AI use are not mutually exclusive — what matters is the sequence.

  1. Choose a tool tier with a data processing agreement: Enterprise, Team/Business, or the API instead of Free/Plus.
  2. Disable training and verify it — don’t assume; check it in the settings/contracts.
  3. Clarify storage location/data residency — an EU region or a DPF-certified provider; for highly sensitive data, consider European or local models.
  4. Create an AI usage policy — what may go in, what may not, which tool is approved. This is the most effective measure against shadow AI; I show the structure in the AI policy guide.
  5. Staff training: empower employees (also in the spirit of the AI-literacy obligation under Art. 4 EU AI Act, applicable since February 2, 2025).
  6. Check whether a data protection impact assessment (DPIA) is needed (Art. 35 GDPR) — when it applies is covered in the DPIA guide for AI systems.
  7. Involve the works council where co-determination applies (see FAQ).

This very chain, from the law through the policy to the architecture, is where most projects fail, because its links are handled separately: the law firm writes the policy, IT builds the tool, and a gap opens up in between. I work at both ends, legal assessment and technical implementation, and thereby close the gap where policy and architecture would otherwise drift apart.

FAQ

What happens to my data when I enter it into ChatGPT?

The input is transmitted encrypted to the provider’s servers (usually in the US), processed there, and stored for a defined period. Depending on the plan, it may be used to improve the model: on business tiers not by default, on Free/Plus by default (opt-out possible).

Are my inputs used to train the model?

With ChatGPT Enterprise, Team/Business, and the API, inputs are, according to OpenAI, not used for training by default. On the free or personal tiers, training is the default, which you can opt out of in the privacy settings.

Do I need a data processing agreement for ChatGPT?

As a rule, yes, as soon as personal data is processed. The provider is then a processor under Art. 28 GDPR, and the data processing agreement is mandatory — before the first processing. An agreement (AVV/DPA) exists only for the business tiers, not for Free/Plus.

Is the free version or ChatGPT Plus permitted for company data?

As a rule, no. There is no data processing agreement for these tiers, and inputs are used for training by default. They are therefore unsuitable for personal data or trade secrets.

What is shadow AI and why is it a privacy risk?

Shadow AI is the use of AI tools without the organization’s approval. It is risky because personal data leaks out without a legal basis and without a data processing agreement, trade-secret protection can lapse, and in a law firm or practice, sec. 203 StGB can be implicated.

Does the works council have to co-determine AI use?

Often, yes. As soon as an AI tool is capable of monitoring employees’ conduct or performance, a works council co-determination right under sec. 87 BetrVG (German Works Constitution Act) may be triggered. Involvement should happen early, not only after rollout.

Conclusion

Privacy in the prompt is not an abstract risk but a concrete chain of decisions: the right tool tier, a data processing agreement, training off, storage location clarified, policy and training. The greatest danger is not the AI’s technology but the uncontrolled input — shadow AI. Those who keep to the sequence use AI productively and stay audit-proof.


As of March 2026. This article is general information, not legal advice for your specific case. — Leon Lotz, business lawyer (MusketierSoftware).

Sources (as of March 5, 2026)

Leon Lotz is a business lawyer and founder of MusketierSoftware. He combines legal depth with real software craft.