RAG vs. Fine-Tuning for Enterprises: When to Use Which, Costs, Data Protection, Maintenance (As of 2026)
As of January 2026. RAG (Retrieval-Augmented Generation) supplies your knowledge to the AI model only at runtime, from a searchable database; fine-tuning trains knowledge or behavior permanently into the model weights. For most companies, RAG is the faster, cheaper, and, from a data protection standpoint, more manageable path. Fine-tuning pays off mainly for style, format, and domain language. This guide gives you what you need to decide: when to use which, what it costs, how demanding the maintenance is, and why the choice of method is also a GDPR decision that hardly anyone thinks through cleanly.
What is RAG?
In brief: At runtime, RAG adds relevant excerpts from your own documents to the prompt before the language model answers. The model itself stays unchanged.
Picture an open-book exam: while answering, the model is allowed to look things up in your company wiki, contracts, manuals, or tickets. Technically, your documents are split into passages (chunks), each passage is converted into an embedding (a numeric vector), and the vectors are stored in a vector database. When a question comes in, the system retrieves the most relevant passages and hands them to the model as context. The result: up-to-date answers with source citations and a markedly lower risk of hallucination, because the model draws on documented material rather than guessing freely.
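The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the list `index` stands in for a vector database, and all function names and sample documents are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, index: list, k: int = 2) -> list:
    """Return the k passages whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list) -> str:
    """Hand the retrieved passages to the model as numbered, citable context."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the sources below and cite them.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical sample documents (already split into passages).
documents = [
    "Travel expenses must be submitted within 30 days of the trip.",
    "The VPN client is installed via the internal software portal.",
    "Vacation requests need manager approval two weeks in advance.",
]
index = [(doc, embed(doc)) for doc in documents]  # stand-in for the vector DB

question = "When do I have to submit travel expenses?"
passages = retrieve(question, index, k=1)
prompt = build_prompt(question, passages)
```

The model only ever sees `prompt`; swapping a document in `documents` and rebuilding `index` changes the answers without touching the model.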
What is fine-tuning?
In brief: Fine-tuning retrains the model on your example data so that the new knowledge or behavior lives permanently in the model weights.
This is the closed-book exam: the model has “memorized” everything and answers from memory, without looking anything up. Parameter-efficient fine-tuning (PEFT) techniques such as LoRA don’t retrain the entire model; they train only small additional adapter weights while the original weights stay frozen, which substantially lowers cost and compute requirements. Fine-tuning shines when the model needs to reliably hit a specific tone, a fixed answer format, or a specialized vocabulary. It is weak, by contrast, when content changes frequently: every knowledge update means another training run.
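To see why LoRA lowers the training effort, it helps to count parameters. LoRA leaves a weight matrix of shape d_out × d_in frozen and trains two small matrices of shapes d_out × r and r × d_in instead. The sketch below only does this arithmetic; the layer size and rank are illustrative assumptions, not values from a specific model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is retrained."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters of a LoRA adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


# Illustrative assumption: one 4096 x 4096 projection layer, adapter rank 8.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)        # 16,777,216
adapter = lora_params(d_out, d_in, rank)  # 65,536
share = adapter / full                 # 0.00390625, i.e. about 0.39 %
```

For this one layer the adapter trains well under one percent of the parameters; the exact savings for a whole model depend on which layers receive adapters and at what rank.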
What is the difference between RAG and fine-tuning?
The core in one sentence: RAG changes what the model knows; fine-tuning changes how the model thinks and speaks. RAG delivers facts and currency at runtime; fine-tuning imprints behavior, style, and format permanently. The table below contrasts the decisive dimensions.
| Dimension | RAG | Fine-Tuning |
|---|---|---|
| How it works | Knowledge mixed in at runtime from a database | Knowledge/behavior trained permanently into the weights |
| Data currency | instant (just swap the document) | only via another training run |
| Setup effort | medium (vector DB, indexing, retrieval) | high (data preparation, training pipeline) |
| Ongoing costs | medium (storage, embeddings, queries) | lower per query, but re-training has a cost |
| Maintenance | maintain documents/index, no retraining | re-training pipeline + versioning + eval |
| Latency | somewhat higher (retrieval step) | lower (no lookup) |
| Hallucination risk | lower (documented material) | higher for factual questions |
| Source citations | yes, a reference per answer | no |
| Data control/erasability | high — data can be deleted in a targeted way | low — data is bound up in the weights |
| Typical use case | knowledge assistant, company GPT, support | tone, formats, domain language, a fixed task |
When RAG, when fine-tuning, when both?
The rule of thumb that holds in most SME projects: Content, facts, and currency → RAG. Style, format, and behavior → fine-tuning. The two are not mutually exclusive.
Choose RAG when …
- your knowledge is volatile and changes frequently (prices, policies, product data),
- you need source citations (compliance, liability, traceability),
- you want to go live fast,
- you want to make many documents searchable,
- personal data is involved and must remain selectively erasable.
Choose fine-tuning when …
- a fixed tone, style, or strict answer format is required,
- the model must reliably master a domain language,
- low latency at very high query volumes matters,
- the task is stable and the knowledge barely changes.
Combine both (hybrid / RAFT) when …
… you need both current factual knowledge and precise behavior. Approaches such as RAFT (Retrieval-Augmented Fine-Tuning) train a model specifically to handle retrieved context better. This is powerful but more expensive and maintenance-heavy — usually worthwhile only once a pure RAG approach hits its quality ceiling.
What do RAG and fine-tuning cost?
Up front, because the market loves to advertise flat rates: there is no credible blanket price. Costs depend on data volume, quality requirements, model choice (cloud API vs. open-source on-prem), and depth of integration. The ranges below are orders of magnitude from the market — illustrative, not an offer.
| Cost block | RAG | Fine-Tuning |
|---|---|---|
| Setup (project) | order of ~€30,000, higher depending on scope | usually significantly higher (data prep + training) |
| Ongoing | a few hundred euros to ~€2,000/month (infrastructure) | lower per query, but re-training on every update |
| Main cost driver | vector DB, embeddings, query volume | training compute (GPU), data curation, eval |
A back-of-the-envelope example circulating in the industry puts a mid-sized company with around 500 employees at roughly €30,000 setup plus €2,000/month for RAG, while fine-tuning projects run higher depending on depth and require another training run on every document change (ai11.io). Read this as a ballpark, not a price tag: your actual effort may come in below or above it.
The economic punchline: RAG often has more moderate up-front costs but ongoing infrastructure costs; fine-tuning shifts the effort into an expensive training phase and into every later update. With frequently changing knowledge, that quickly backfires.
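The cost structure can be made concrete with simple arithmetic. The sketch below uses the ballpark RAG figures quoted above (€30,000 setup, €2,000/month). The fine-tuning inputs are invented placeholders chosen purely to show the mechanics, not market data; replace them with your own quotes.

```python
def cumulative_cost(
    setup: float,
    monthly: float,
    months: int,
    retrain_cost: float = 0.0,
    retrains_per_year: float = 0.0,
) -> float:
    """Total cost after `months`: setup + running costs + re-training runs."""
    retraining = retrain_cost * retrains_per_year * months / 12
    return setup + monthly * months + retraining


# RAG: ballpark figures from the article, no re-training.
rag_24_months = cumulative_cost(setup=30_000, monthly=2_000, months=24)

# Fine-tuning: HYPOTHETICAL placeholder inputs, for illustration only.
HYPOTHETICAL_FT = dict(setup=60_000, monthly=500, retrain_cost=10_000, retrains_per_year=4)
ft_24_months = cumulative_cost(months=24, **HYPOTHETICAL_FT)
```

The point of the function is the last term: the more often knowledge changes, the larger `retrains_per_year` becomes, and the faster the re-training share dominates the total.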
Open source cheaper? Fine-tuning on small open-source models can lower license/API costs and keep data in-house — but you trade them for GPU, operations, and MLOps effort. “Cheaper” only holds if you realistically account for these downstream costs. The trade-off in detail: Open source vs. proprietary LLMs.
RAG vs. fine-tuning from a data protection (GDPR) perspective — the decisive difference
This is the point that hardly any tech comparison thinks through cleanly: the choice of method is a data protection decision. As soon as personal data is involved, RAG and fine-tuning behave fundamentally differently.

The practical heart of the GDPR question: in the vector database a record is an addressable row you can delete. In the weights of a fine-tuned model, the same information is a distributed statistical pattern — there, an erasure request can force a complete re-training.
Which approach is more GDPR-compliant?
In brief: As a rule, RAG is more manageable from a data protection standpoint, because the reference data sits directly addressable and selectively erasable in the vector database — whereas fine-tuning binds data into the model weights, where it can hardly be removed selectively. But “more manageable” does not mean “automatically compliant”: RAG creates its own obligations too.
Can personal data be deleted again from a fine-tuned model?
Technically, this is the sore point of fine-tuning. Personal data does not sit in plaintext within a language model but is distributed as statistical patterns across numerous weights — which makes access and selective deletion technically difficult. Through so-called memorization, such training data can even be reproducible later on. The German Federal Commissioner for Data Protection (BfDI — Bundesbeauftragter für den Datenschutz und die Informationsfreiheit) describes the safest solution as the regular replacement of the model with one newly trained — without the data to be deleted (BfDI guidance on AI, 22 Dec 2025). In other words: a single erasure request under Art. 17 GDPR can trigger a complete re-training. With RAG, it is usually enough to remove the affected record from the vector DB.
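What “remove the affected record from the vector DB” can look like is sketched below. This is a minimal in-memory stand-in, not the API of any real vector database; the class, the `subject_id` metadata field, and the sample records are assumptions for illustration. The design point is that each stored passage carries a reference to the data subject it concerns, so an erasure request becomes a targeted delete.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    record_id: str
    subject_id: Optional[str]  # hypothetical metadata: whose personal data this is
    text: str
    vector: List[float]


class VectorStore:
    """Tiny in-memory stand-in for a vector database (hypothetical)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def upsert(self, record: Record) -> None:
        self._records[record.record_id] = record

    def erase_subject(self, subject_id: str) -> List[str]:
        """Delete every record tied to one data subject; return the deleted IDs."""
        doomed = [rid for rid, rec in self._records.items() if rec.subject_id == subject_id]
        for rid in doomed:
            del self._records[rid]
        return doomed

    def __len__(self) -> int:
        return len(self._records)


store = VectorStore()
store.upsert(Record("c1", "emp-17", "Performance review, employee 17", [0.1, 0.9]))
store.upsert(Record("c2", None, "General travel policy", [0.8, 0.2]))
store.upsert(Record("c3", "emp-17", "Sick leave note, employee 17", [0.3, 0.7]))

erased = store.erase_subject("emp-17")  # targeted delete, no model re-training
```

In a real deployment the same erasure would also have to reach the source documents and any other copies the system keeps; the sketch covers only the vector store.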
What does the German Data Protection Conference (DSK) say about RAG?
The Datenschutzkonferenz (DSK — the conference of the German data protection supervisory authorities) published an 18-page guidance document on AI systems using RAG on 17 October 2025 (DSK press release, datenschutz.sachsen.de). Three points from it are central to the choice of method:
- Data subject rights in all components. Access, rectification, and erasure must be “implemented at all times in all components of the RAG system” — that is, in the retriever and in the LLM component. Dynamically generated content can make implementation harder (SKW Schwarz, KI-Flash on the DSK guidance).
- RAG does not cure an unlawful model. The DSK makes clear: an unlawfully trained base model remains unlawful, even when deployed within a RAG system. RAG is not a compliance band-aid over a problematic foundation model. This fits the fact that, since 2 August 2025, the obligations for providers of general-purpose AI (GPAI) models under the EU AI Act apply — including technical documentation and a summary of the training data (EU AI Act, implementation timeline). The legal soundness of the base model is thus increasingly subject to scrutiny.
- Purpose limitation and transparency. The modular architecture makes the clear allocation of purposes and the provision of information to data subjects harder, because the origin of the embeddings and how the output came about are often not traceable. The DSK recommends a clear definition and separation of purposes for the sources integrated.
Does a RAG system need a DPIA and a legal basis?
Every processing of personal data needs a legal basis under Art. 6 GDPR, and that also applies to the individual steps of a RAG system (data preparation, provision of the reference documents, output generation). The DSK additionally recommends carrying out a data protection impact assessment (DPIA) under Art. 35 GDPR “taking into account all components of the RAG system” (SKW Schwarz). Whether a DPIA is mandatory depends on the individual case; for extensive or sensitive data holdings it is usually called for. I cover how a DPIA for AI systems is structured in practice in DPIA for AI systems.
Note: This is a general legal assessment as of January 2026, not legal advice or representation in an individual case. The legal situation is in flux; review your specific use case individually.
Maintenance & operations compared
In brief: RAG maintenance means tending documents and the index; fine-tuning maintenance means retraining models. That is the biggest hidden cost difference over time.
With RAG, you keep the knowledge base current: index new documents, re-embed changed ones, remove deleted ones, and monitor retrieval quality. This is a low-to-medium ongoing load and, above all, involves no model retraining. A new manual is available within minutes.
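The index upkeep described above can be planned by comparing content hashes. The following is one possible sketch, assuming documents are identified by a name and the index remembers the hash of each document at embedding time; the function names and the dictionary layout are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    """Fingerprint of a document's content, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(documents: Dict[str, str], indexed_hashes: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare current documents with the index and list the work to do.

    documents:      name -> current text
    indexed_hashes: name -> hash stored when the document was last embedded
    """
    add = sorted(name for name in documents if name not in indexed_hashes)
    reembed = sorted(
        name
        for name, text in documents.items()
        if name in indexed_hashes and content_hash(text) != indexed_hashes[name]
    )
    remove = sorted(name for name in indexed_hashes if name not in documents)
    return {"add": add, "reembed": reembed, "remove": remove}


# Hypothetical state: one changed manual, one unchanged policy, one new and one deleted file.
current_docs = {"manual.pdf": "v2 text", "policy.md": "unchanged", "new.md": "fresh"}
index_state = {
    "manual.pdf": content_hash("v1 text"),
    "policy.md": content_hash("unchanged"),
    "old.md": content_hash("gone"),
}
plan = plan_sync(current_docs, index_state)
```

Only the entries in `plan` need embedding or deletion work; unchanged documents are skipped, which keeps the ongoing load proportional to what actually changed.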
With fine-tuning, every relevant knowledge change means a re-training pipeline: curate data, train, evaluate, version, roll out. This is more demanding and unsuitable for volatile content. Fine-tuning earns back its maintenance costs only if the trained-in behavior stays stable.
How much data does fine-tuning need?
There is no fixed threshold, but fine-tuning lives on quality over quantity: a few hundred to a few thousand carefully curated example pairs often deliver more for a well-defined task (e.g., a fixed answer format) than huge, messy data volumes. If your use case comes down to pure factual knowledge, RAG is almost always the more direct path — there you need no training data, only your documents.
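“Quality over quantity” usually starts with basic curation of the example pairs. The sketch below shows one simple, assumed approach: drop incomplete pairs and exact duplicates, then serialize the rest as JSON Lines. The field names `prompt` and `completion` are placeholders; the format your training tool expects may differ.

```python
import json
from typing import Dict, List


def curate(pairs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only complete, non-duplicate example pairs (whitespace-trimmed)."""
    seen = set()
    kept = []
    for pair in pairs:
        prompt = pair.get("prompt", "").strip()
        completion = pair.get("completion", "").strip()
        if not prompt or not completion:
            continue  # incomplete example: nothing to learn from
        key = (prompt.lower(), completion.lower())
        if key in seen:
            continue  # duplicate (ignoring case): adds volume, not information
        seen.add(key)
        kept.append({"prompt": prompt, "completion": completion})
    return kept


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize one example per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Hypothetical raw examples for a fixed answer format.
raw = [
    {"prompt": "Summarize ticket 1", "completion": "Status: open. Next step: call customer."},
    {"prompt": "summarize ticket 1 ", "completion": "Status: open. Next step: call customer."},
    {"prompt": "Summarize ticket 2", "completion": ""},
]
clean = curate(raw)
jsonl = to_jsonl(clean)
```

Checks like these are only the floor; whether the remaining examples are consistent and representative still needs human review.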
Conclusion & recommendation
For most enterprise knowledge applications — company GPT, knowledge assistant, support — RAG is the right default choice: faster to go live, more predictable on cost, with source citations and — crucially — with data that can be deleted in a targeted way. You add fine-tuning where style, format, or domain language matter. Reach for the hybrid when both are demonstrably necessary — not as a matter of principle.
And the point almost everyone overlooks: the choice of method also sets the course for data protection. Anyone who trains personal data into model weights risks having a single erasure request force a re-training. This decision is best made before the project is built.
Unsure which approach is legally sound and economically right for your use case? Let’s sort it out in an initial conversation — with a business lawyer who then also builds the solution himself. More on this under AI consulting.
FAQ
What is the difference between RAG and fine-tuning? RAG supplies your knowledge to the model at runtime from a database and leaves the model unchanged. Fine-tuning trains knowledge or behavior permanently into the model weights. RAG changes what the model knows; fine-tuning changes how it answers.
Which is cheaper — RAG or fine-tuning? RAG usually has the more moderate up-front costs and, in return, ongoing infrastructure costs, while fine-tuning shifts effort into an expensive training phase and into every later update. Exact figures depend on the individual case; the market examples in circulation are only orders of magnitude.
Can personal data be deleted again from a fine-tuned model? Only with difficulty. The data is distributed as statistical patterns across many weights and, through memorization, is partly reproducible. The BfDI names retraining without the affected data as the safest solution — so in practice an erasure request can trigger a re-training.
Which approach is more GDPR-compliant? As a rule, RAG, because reference data in the vector database can be deleted in a targeted way. But “more manageable” does not automatically mean compliant: RAG too needs a legal basis, purpose limitation, and possibly a DPIA. And according to the DSK, an unlawfully trained base model remains unlawful even with RAG.
Can RAG and fine-tuning be combined? Yes. Hybrid approaches such as RAFT train a model to handle retrieved context better. This combines current factual knowledge with precise behavior but is more expensive and maintenance-heavy — usually worthwhile only once pure RAG hits its quality ceiling.
Sources (as of 24 January 2026)
- DSK — guidance on AI systems using RAG (17 Oct 2025): DSK press release (Saxony)
- Assessment of the DSK RAG guidance: SKW Schwarz — KI-Flash
- BfDI — guidance on AI (22 Dec 2025, PDF): bfdi.bund.de
- EU AI Act — GPAI obligations since 2 Aug 2025, implementation timeline: artificialintelligenceact.eu
- Art. 17 GDPR (right to erasure): datenschutz-grundverordnung.eu
- Market comparison/cost orders of magnitude: ai11.io, zweitag.de