AI hallucinations, EU GDPR, and the importance of cautious optimism
The EU’s data protection law, the GDPR, requires accuracy in processing personal data. However, generative AI—like large language models (LLMs)—may “hallucinate” or reflect information that is false but widely spread. On one hand, inaccuracy may seem like an inherent feature of the technology. On the other hand, some major LLM providers aspire to make their products more accurate and perhaps even to “organize the world’s information and make it universally accessible and useful,” to borrow Google’s famous mission statement. How we choose to think about this matters, not least because it may significantly affect how data protection law will be applied to AI.
Consider the following view, represented by some privacy activists:
1. LLMs can be prompted to produce inaccurate outputs about identifiable individuals.
2. At least some cases of (1) constitute breaches of the GDPR (specifically, its principle of accuracy).
3. LLMs could only be provided lawfully in the EU by making (1) technically impossible.
4. (3) is not currently feasible.
5. Therefore, LLMs are illegal under the GDPR.
Let’s consider this view while reflecting on the value of LLMs from the perspective of EU law and on how legal interpretation may be coloured by the narratives we adopt about the technology. I show that the analogy with how EU data protection law accommodated search engines is also relevant to accuracy under the GDPR. I also suggest that cautious optimism is the right approach to account for the not-yet-realised potential of the technology.
LLMs can be prompted to produce inaccurate outputs about identifiable individuals
It’s possible to prompt LLMs to produce outputs that are arguably related to identifiable individuals but not true about those individuals. Not every instance of this is relevant for the GDPR—as I noted in Should the GDPR Prohibit AI? Notes on EU's AI & GDPR stakeholder event:
… the fact that we can get personal data in the outputs of a model does not necessarily mean the model is processing personal data. Indeed, this can happen for at least two other reasons:
Users might be providing personal data directly in their prompts; and
Models often generate outputs that appear to be personal data—whether true or fictitious—based not on “memorized” information, but on statistical probability.
For example, if asked about “John Smith,” a model might assume the person is male and, if given a job title, might infer a probable birth year based on implied seniority.
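A toy Python sketch may make the second reason (generation from statistical probability) more concrete. It is emphatically not how LLMs work internally; every mapping, name, and number in it is invented for illustration. The point is only that plausible-sounding personal details can be produced from priors alone, without any memorised data about the actual person.

```python
import random

# Toy illustration only: fabricate a plausible-sounding profile from
# statistical priors, with no stored facts about the real person.
# All mappings and numbers below are invented for this example.
SENIORITY_TO_AGE_RANGE = {
    "intern": (20, 25),
    "manager": (30, 45),
    "partner": (40, 60),
}

def guess_profile(name: str, job_title: str) -> dict:
    """Guess attributes from priors alone, the way a model might 'infer' them."""
    low, high = SENIORITY_TO_AGE_RANGE.get(job_title.lower(), (25, 55))
    first_name = name.split()[0]
    return {
        "name": name,
        "assumed_gender": "male" if first_name in {"John", "James"} else "unknown",
        "estimated_birth_year": 2025 - random.randint(low, high),
    }

print(guess_profile("John Smith", "Partner"))
# e.g. {'name': 'John Smith', 'assumed_gender': 'male', 'estimated_birth_year': 1975}
```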
However, I also noted that “there are cases where models output personal data not implied by prompts.” This happens to public figures and others whose “information appears so frequently in training data that it becomes represented in model weights and can be retrieved through specific prompting.” I don’t think I’m a public person in a traditional sense. Still, I appear on the internet in quite a few places, which I’m guessing explains why I can ask LLMs about myself and get results that are arguably related to me without being implied by my prompts.
Are inaccuracies GDPR breaches?
Just because an LLM outputs inaccurate information about someone does not mean a GDPR violation occurred. A contrary interpretation would treat accuracy as an absolute requirement, ignoring that even the rights to privacy and data protection that the GDPR aims to protect are not absolute.
Can a case of an LLM producing inaccurate information ever be a GDPR violation? Conceivably, yes. This could be the case where the LLM service provider fails to take proportionate steps reasonably within their powers and capabilities to reduce that risk. The crucial question is what can be expected from service providers: where should we draw the line?
The standard operating procedure for LLMs, at least those developed by major US and EU players, includes practices for both development and deployment:
automated removal of personal data from training data,
use of synthetic data,
training techniques (e.g., regularization methods to improve generalization and reduce overfitting),
post-training techniques (e.g., reinforcement learning from human feedback (RLHF) to “teach” the model to refuse to answer questions about people who are not public persons),
deployment safeguards (e.g., prompt/output filtering, soliciting user feedback; a minimal sketch of output filtering follows this list),
accountability mechanisms (procedures for implementing the techniques listed above, documentation, employee training, and so on).
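To illustrate the kind of deployment safeguard mentioned in the list, here is a minimal Python sketch of an output filter. It is an assumption about how such a filter might look, not a description of any provider’s actual system: real deployments rely on trained PII classifiers, policy engines, and human review, and all patterns and names below are hypothetical.

```python
import re

# Minimal sketch of a deployment-time output filter (illustrative only).
# Real systems use trained PII classifiers and policy engines; the patterns
# and the suppression list here are hypothetical.

EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical list of non-public individuals who asked not to be discussed.
SUPPRESSED_NAMES = {"jane example", "john sample"}

def filter_output(text: str) -> str:
    """Refuse to discuss suppressed names and redact obvious contact details."""
    lowered = text.lower()
    if any(name in lowered for name in SUPPRESSED_NAMES):
        return "I can't share information about private individuals."
    text = EMAIL_RE.sub("[redacted email]", text)
    text = PHONE_RE.sub("[redacted phone]", text)
    return text

print(filter_output("You can reach Jane Example at jane@example.com."))
# -> "I can't share information about private individuals."
print(filter_output("Their office number is +48 123 456 789."))
# -> "Their office number is [redacted phone]."
```

In practice, checks of this kind would be applied to both user prompts and model outputs, alongside the training-time measures listed above.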
We can hope that those and other techniques will be continuously improved.
American and European AI providers are trying to balance the usefulness of their services against the risk of outputting inaccurate data about real individuals. The legal question is whether, under the GDPR, they strike the right balance.
The value of LLMs under EU law
To answer this question, we need to consider the value of LLMs from the perspective of EU law. Even granting the obvious point that the GDPR principle of accuracy is not absolute, the opponents of LLMs could argue that LLMs bring so little value that even the most restrictive measures to ensure LLM accuracy would be proportionate. Hence, it also matters how we think about LLMs—what narrative or mental model we build. Simplifying, I’ll suggest two such models.
First, we could focus on LLMs as inherently “random” or “probabilistic.” From this perspective, the value of LLMs is in surprising us, providing creative solutions, and even helping us express ourselves. These are not trivial considerations, and they may arguably engage the protection of the fundamental freedom of expression under the EU Charter of Fundamental Rights.
Moreover, adopting this framing could provide a strong argument that the requirement of accuracy does not apply directly to LLM outputs, as they are not the kind of expression meant to be assessed as true or false. They are not precisely opinions or caricatures, but they may call for legal treatment similar to those more traditional, not-truth-oriented forms of expression.
However, it could also be argued that if this is the main benefit LLMs bring, further restrictions on LLM development and deployment could be proportionate. Perhaps this could involve some limitations on what training data can be used (e.g., excluding websites which are challenging to anonymise). Also, perhaps LLM providers should do more to present LLMs as tools that should not be treated as giving factual information. To some extent, this is already being done with “small print” in chat interfaces and even in model outputs:
[Screenshot: output from Google Gemma 2 27B IT containing such a disclaimer]
Perhaps, at some cost to model usefulness, such warnings and refusals could be made more prominent and frequent.
Second, we could see LLMs as tools to “organize the world’s information and make it universally accessible and useful.” I intentionally quoted Google’s mission statement. EU courts have already recognised search engines’ pivotal role in facilitating European freedom of expression and information. This was, in fact, a key reason why the courts showed flexibility in applying EU data protection law to search engines, taking into account the powers and capabilities of their operators.
Users already treat LLMs like search engines, asking for information, and even as search engines, requesting links to relevant internet sources. LLM service providers are responding to this demand, aiming to make their products more accurate and to give users access to more information. Inaccuracy remains an issue for now, but it is being treated as a challenge to be overcome.
From this perspective, even the additional limitations that may seem proportionate on the first framing (thinking of LLMs as chiefly “creativity tools”) are likely to be disproportionate. This is especially the case with limitations that may conflict with accuracy (like limiting training data sources).
In fact, LLM developers may, in some ways, already be doing more than what is required by EU data protection law. There is an argument to be made that removing personally identifiable information from training data makes those tools less useful for access to information. We also do not impose such a requirement on search engines—it is widely accepted, and not seriously questioned under EU data protection law, that you can google people. Yes, we have procedures for “delisting” at the request of the person concerned. Still, even that typically does not involve removing the personal data from search indices (e.g., a “delisted” name may still appear in search results in a website title or a text snippet—in response to a search query that doesn’t include the name).
The importance of cautious optimism
As I argued in The GDPR and GenAI — Part 1: Lawful Bases, much is to be learned from how EU data protection law accommodated search engines. When confronted with search engines, the EU Court of Justice demonstrated remarkable foresight by choosing not to apply data protection law in its strictest interpretation. However, the Court made that determination when the search technology was already quite mature (in 2014).
Today’s LLMs stand at a similar crossroads to where search engines found themselves in the late 1990s—as emerging technologies with enormous potential but also significant uncertainties. If we imagine ourselves in 1998, considering how to regulate search engines, it becomes clear how shortsighted it would have been to impose overly restrictive regulations that might have stunted their development. The benefits we now take for granted from search technology might never have materialized under a more stringent regulatory regime.
The search case teaches us that cautious optimism often serves society better than restrictive scepticism when dealing with transformative technologies.
What does this mean for the value of LLMs and our mental models of what LLMs are for? At first glance, the first mental model I discussed may seem to fit the technology’s current limitations: indeed, LLMs sometimes produce inaccurate outputs. However, the second model reflects how people want to use LLM-powered services (to organise and access information) and the direction in which the technology is developing. Perhaps, even with significant technological improvements, there will always be some inaccuracies, as with the information we can access through search engines. The tools do not need to be perfect to be helpful, even pivotal, for access to information.
We can adopt the second mental model of LLMs’ value if we accept a cautiously optimistic approach. Accordingly, we should apply EU data protection law in a way that fully considers the very significant value the technology is likely to bring from the perspective of fundamental rights. This leads to a legal interpretation that shows flexibility, similar to the flexibility shown with search engines.