16.06.2026

Carina Branco authors the article "Data Pseudonymisation in the AI Era"

For years, data pseudonymisation served as a regulatory grey area that the industry exploited to reduce GDPR friction in the processing of data for artificial intelligence purposes. A recent decision by the French data protection authority (CNIL) and the Council of the European Union's withdrawal of a key proposal from the Digital Omnibus package have closed that door. Carina Branco, Partner at Morais Leitão, attended IAPP AI Governance Global Europe 2026 in Dublin and analyses what has changed and what remains open.

At the end of the first day of the Congress, under the title Pseudonymity and AI — The New Frontier of Responsible Data, Graham Doyle of the Irish DPC, Monisha Varadan of Google, and Claude-Etienne Armingaud of Latournerie Wolfrom set the pace for an audience eager to understand whether the regulatory ground was about to shift beneath our feet.

The Council of the European Union removed from the Digital Omnibus package the proposal for a “relative” definition of personal data, which would have allowed pseudonymised data transferred to third parties without access to the keys to fall outside the scope of the General Data Protection Regulation (GDPR) for the purposes of Artificial Intelligence (AI) training.

The industry had been placing considerable hope in this escape valve. However, only a week before this conference, the French Data Protection Authority (CNIL) had already reinforced the Omnibus withdrawal by sanctioning¹ IQVIA OPERATIONS FRANCE, a subsidiary of the IQVIA group, in proceedings where pseudonymisation had been used as an argument for circumventing the GDPR².

The CNIL’s restricted committee rejected the company’s position essentially on two grounds.

First, CNIL dismissed the arguments based on relative anonymity³, which rest on the premise that pseudonymised data transferred to a recipient without access to the keys, and without realistic or lawful means of reversing the pseudonymisation, can be treated as anonymous. CNIL concluded that IQVIA possessed the resources and means necessary for re-identification. Consequently, the claim to anonymity failed, and the data had to be regarded as personal data throughout the processing chain.

The argument that the “keys” required for re-identification merely had to be inaccessible (kept separate and protected) for the data to continue being handled as anonymous thus lost its persuasive force.

Secondly, it was found that the unique identifiers enabled longitudinal tracking of health journeys and that combining clinical data with publicly available datasets made re-identification possible through AI models and systems readily accessible to IQVIA.
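
The mechanics behind this second ground can be illustrated with a minimal sketch. It is not drawn from the CNIL decision: the records, names, secret key and choice of quasi-identifiers (birth year, postcode, sex) are all hypothetical. It shows how a stable pseudonym keeps one patient's events linked together, and how a join against a public dataset can attach a name to that pseudonym without ever touching the key.

```python
import hashlib
import hmac

SECRET_KEY = b"held-by-the-data-source-only"  # hypothetical key, never shared

def pseudonymise(patient_id: str) -> str:
    """Replace a direct identifier with a keyed hash (stable per patient)."""
    digest = hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

# Hypothetical pseudonymised records: no name, but quasi-identifiers remain.
records = [
    {"pid": pseudonymise("patient-001"), "birth_year": 1961,
     "postcode": "75011", "sex": "F", "event": "insulin prescription"},
    {"pid": pseudonymise("patient-001"), "birth_year": 1961,
     "postcode": "75011", "sex": "F", "event": "sick leave, 10 days"},
    {"pid": pseudonymise("patient-002"), "birth_year": 1987,
     "postcode": "69003", "sex": "M", "event": "asthma inhaler"},
]

# Hypothetical publicly available dataset (e.g. a directory or register).
public = [
    {"name": "A. Example", "birth_year": 1961, "postcode": "75011", "sex": "F"},
    {"name": "B. Sample", "birth_year": 1990, "postcode": "13001", "sex": "M"},
]

QUASI_IDENTIFIERS = ("birth_year", "postcode", "sex")

def link(records, public):
    """Map a pseudonym to a name when exactly one public entry matches."""
    index = {}
    for person in public:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])
    matches = {}
    for rec in records:
        names = index.get(tuple(rec[q] for q in QUASI_IDENTIFIERS), [])
        if len(names) == 1:  # a unique match is a re-identification
            matches[rec["pid"]] = names[0]
    return matches

def journey(records, pid):
    """All events sharing one pseudonym, i.e. the longitudinal health journey."""
    return [r["event"] for r in records if r["pid"] == pid]

reidentified = link(records, public)
for pid, name in reidentified.items():
    print(name, "->", journey(records, pid))
```

In this sketch the linking step never uses the secret key. The match comes entirely from the attributes left in the clear, which is why keeping the key separate does not, on its own, settle the question of identifiability.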

The CNIL decision is not merely another enforcement action; it is a landmark, and four specific aspects explain why.

First, it expressly rejected the attempt to instrumentalise the CJEU’s SRB judgment as a “shield” against the GDPR. IQVIA sought to rely on European case law to move outside the regulatory perimeter, but the restricted committee closed that door with surgical precision.

Secondly, it applied the re-identifiability test rigorously and contextually, assessing the actual means available, the data effectively collected, and the external sources that could realistically be accessed.

Thirdly, it confirmed liability for failure to provide information throughout the entire data-processing chain, regardless of whether processor or joint-controllership arrangements existed. In this case, pharmacies failed to inform customers, and IQVIA was held responsible for that omission.

Fourthly, it sanctioned the absence of privacy by design at software level, notably the lack of multi-factor authentication within the EMR repository, which should have been addressed from the outset and by default.

This decision requires the recalibration of several key elements across all privacy systems:

Privacy notices. All potential categories of recipients, including recipients of pseudonymised data, must be clearly identified.
Data processing agreements. All DPAs must correctly reflect the treatment of pseudonymised data as personal data.
Retention of responsibility despite pseudonymisation. Even where the recipient cannot re-identify data subjects, the controller remains under an obligation to inform them about data sharing because, from the controller’s perspective, the data remains personal data.
Onward transfers. Where a processor or sub-processor possesses reasonable means to re-identify individuals, the data becomes personal data for them as well, exposing the controller who placed the data into circulation.

At this point, it is worth reflecting.

For years, pseudonymisation served as the technical shield that allowed organisations to invoke a grey area in which data could be processed with less regulatory friction. The Council of the EU removed that legislative route, CNIL imposed sanctions, and the panel discussion at 4 p.m. on 3 June in Dublin revealed new perspectives on the issue of re-identification.

CNIL’s reasoning was not directed at any particular technical weakness within IQVIA, which means it extends to any operator relying on similar techniques. This is all the more concerning given that, in France alone, the esante.gouv.fr portal recorded, as of September 2025, 125 authorised health-data repositories⁴ managed by 102 different operators (pharmaceutical software providers, public research bodies, insurance platforms and hospital organisations), all relying on processes and techniques very similar to those employed by IQVIA, including pseudonymisation.

The French regulator is now forcing a complete repositioning of pseudonymisation as a technique. It will remain essential as a security measure, not least because it continues to reduce the attack surface. However, it can no longer, by itself, constitute a sufficient argument for economic operators seeking to move outside the GDPR’s scope. In an AI era where the cost of re-identification has fallen significantly and technology enables it with relative ease, regulatory focus is shifting towards Privacy Enhancing Technologies (PETs), which render re-identification, at least for now, “reasonably unlikely”.

PETs appear to provide the technical answer for securing the desired legal shield under the GDPR because, rather than merely masking identity through the substitution of identifiers, they address the issue at its root by ensuring that identity cannot be reconstructed, even by those with significant computational resources and access to external data sources.

The Dublin panel was clear regarding the criteria that determine whether a PET is sufficient to serve as a benchmark for anonymity. The answer is positive if: (i) re-identification would require prohibitively expensive computational effort; (ii) re-identification is practically impossible; and (iii) the system demonstrates robustness against future inference attacks.

It is with this final criterion that the tension becomes structural. Not only is robustness technically impossible to guarantee indefinitely, but the technological future has never been more difficult to predict. Tomorrow’s models will be more capable than today’s, and no one knows precisely where they are heading. The short history of AI has already shown that there have never been, and still are not, solutions without challenges.

For a time, it was believed that training models without direct access to data (federated learning), or adding noise to make individual extraction impossible (differential privacy), would suffice. These mechanisms were frequently presented as solutions to the re-identification problem in AI contexts.

Today we know they did not deliver on their promises.

Large language models trained using differential privacy revealed an unresolved tension between the privacy guarantees provided through noise injection and the model’s utility. The epsilon parameter (which measures the degree of protection) still lacks a benchmark accepted by regulators. We do know, however, that a low epsilon provides strong protection but degrades model performance, whereas a high epsilon preserves utility while offering only marginal protection.
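
The trade-off governed by epsilon can be made concrete with a small sketch. It does not train a language model: it swaps in the classic Laplace mechanism applied to a simple count query, and the true count of 120 and the three epsilon values are assumptions chosen purely for illustration. For a count, the noise scale is 1/epsilon, so the expected error grows as epsilon shrinks.

```python
import random
import statistics

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Laplace(0, scale) sample: the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the Laplace noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)

def mean_abs_error(epsilon: float, true_count: int = 120,
                   trials: int = 5000, seed: int = 0) -> float:
    """Average distance between the noisy answer and the true answer."""
    rng = random.Random(seed)
    return statistics.fmean(
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )

# Illustrative epsilon values only; no regulatory benchmark is implied.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: mean absolute error is about {mean_abs_error(eps):.2f}")
```

The printed errors should sit close to 1/epsilon: large when protection is strong, negligible when it is weak. Which point on that curve counts as adequate is precisely the question that remains without an agreed answer.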

Federated learning, meanwhile, has proved vulnerable to attacks capable of reconstructing training data from the gradients shared during the learning process. The central server never directly sees the underlying data, yet the gradients it receives contain sufficient information to reconstruct it.
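
Why gradients leak can be seen in the simplest possible case, sketched below under stated assumptions: a single linear layer with a bias, a squared-error loss, and one training example per update. The feature values and target are hypothetical. In that setting the weight gradient is the error multiplied by the input, and the bias gradient is the error itself, so dividing one by the other returns the input exactly. Attacks on deep models and batched updates are more involved and typically iterative; this sketch only shows the underlying principle.

```python
import random

def forward(W, b, x):
    """Linear layer: y = W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def gradients(W, b, x, target):
    """Gradients of 0.5 * ||y - target||^2 for one example.

    dL/db = (y - target) and dL/dW = (y - target) x^T.
    """
    err = [y - t for y, t in zip(forward(W, b, x), target)]
    grad_W = [[e * xj for xj in x] for e in err]
    grad_b = err
    return grad_W, grad_b

def reconstruct_input(grad_W, grad_b):
    """Recover x from shared gradients: row i of grad_W divided by grad_b[i]."""
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if grad_b[i] == 0:
        raise ValueError("zero error: this update carries no signal")
    return [g / grad_b[i] for g in grad_W[i]]

rng = random.Random(0)
private_x = [63.0, 1.0, 7.2, 0.0]   # hypothetical patient features, kept local
target = [1.0, 0.0, 0.0]            # hypothetical label
W = [[rng.uniform(-1, 1) for _ in private_x] for _ in target]
b = [rng.uniform(-1, 1) for _ in target]

# The client shares only the gradients; the observer reconstructs the input.
grad_W, grad_b = gradients(W, b, private_x, target)
recovered = reconstruct_input(grad_W, grad_b)
print([round(v, 6) for v in recovered])
```

The observer in this sketch needs neither the model weights nor the label, only the two gradient tensors that the protocol shares by design.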

Finally, synthetic data generation faces the problem of probable re-identification through inference, known as membership inference. In other words, it may be possible to determine, with significant degrees of confidence, whether a particular individual was included in the original dataset from which the synthetic data was generated.
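
A toy version of such an inference is sketched below. Every element is an assumption made for illustration: the generator is deliberately weak (it resamples training rows and adds small noise), the records are random, and the decision threshold is arbitrary. Real generators and real attacks are considerably more sophisticated, but the intuition carries over: records that were in the training set tend to sit unusually close to some synthetic record.

```python
import random
import statistics

def naive_synthesiser(train, rng, jitter=0.5, factor=5):
    """Deliberately weak, hypothetical generator: resample rows, add noise."""
    return [[v + rng.gauss(0, jitter) for v in rng.choice(train)]
            for _ in range(len(train) * factor)]

def nearest_distance(record, synthetic):
    """Euclidean distance from a record to its closest synthetic neighbour."""
    return min(sum((a - b) ** 2 for a, b in zip(record, s)) ** 0.5
               for s in synthetic)

def draw_person(rng):
    """Hypothetical record: age, weight in kg, a lab value."""
    return [rng.uniform(20, 90), rng.uniform(50, 120), rng.uniform(3, 12)]

rng = random.Random(42)
members = [draw_person(rng) for _ in range(50)]     # used to build the data
outsiders = [draw_person(rng) for _ in range(50)]   # never seen by generator
synthetic = naive_synthesiser(members, rng)

member_scores = [nearest_distance(r, synthetic) for r in members]
outsider_scores = [nearest_distance(r, synthetic) for r in outsiders]

THRESHOLD = 2.0  # assumed cut-off; an attacker would calibrate this

def guess_member(record):
    """Guess 'was in the training set' when a synthetic record is very close."""
    return nearest_distance(record, synthetic) < THRESHOLD

correct = (sum(guess_member(r) for r in members)
           + sum(not guess_member(r) for r in outsiders))
accuracy = correct / (len(members) + len(outsiders))

print("median distance, members:  ", round(statistics.median(member_scores), 2))
print("median distance, outsiders:", round(statistics.median(outsider_scores), 2))
print("attack accuracy:", accuracy)
```

An accuracy well above one half means that the synthetic dataset, although it contains no original row, still reveals who contributed to it.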

PETs may therefore represent the correct direction at present for responding to the rigidity of the regulatory requirement of “robustness against future inference attacks”, just as pseudonymisation once enabled data to circulate without straightforward and accessible re-identification. However, the question remains whether the answer to re-identification challenges can ever be purely technical.

The European Data Protection Board (EDPB) is currently finalising updated guidance on pseudonymised data, but technological progress does not respect regulatory timetables. What is computationally robust today may become trivially reversible tomorrow, and what is considered “reasonably unlikely” today may become routine tomorrow.

The promise of PETs rests, in part, on the same logic that ultimately weakened pseudonymisation: namely, that the technical difficulty of a given moment can be equated with a lasting legal guarantee. By creating a false sense of security, PETs may encourage greater data collection and sharing, thereby both undermining the principle of data minimisation promoted by the GDPR and contributing to increased behavioural risk in the digital environment.

If economic operators adopt PETs with the same mindset through which they previously instrumentalised pseudonymisation, the cycle will repeat itself — only with more sophisticated terminology and a shorter period of regulatory grace.

Putting everything into perspective, what Dublin made unmistakably clear is that the European regulator no longer accepts compliance as a question of drawing a perimeter around decryption keys, shared gradients, or synthetic datasets. The perimeter follows the data and will continue to follow the data regardless of the technique used to disguise it.

The joint position of the European Data Protection Board and the European Data Protection Supervisor in the context of the Digital Omnibus initiative is that the definition of personal data should describe what personal data *is*, rather than what it ceases to be as a result of a particular technique or processing architecture.

The regulatory ecosystem appears to require a change in mindset: privacy is no longer an obstacle to be circumvented, but an attribute to be built into systems by design and by default. The relevant question is no longer, “How can I move outside the GDPR?”, but rather, “How can I ensure that individuals’ identities cannot be reconstructed through the business processes I use?”

The phrase “Your privacy is very important to us” — found in so many privacy notices across all sectors of the economy — may now have acquired a new level of substance and significance.

Final Note

This article was written in collaboration with two artificial intelligence systems — Legora and Claude (Anthropic, Sonnet 4.6).

The irony was not lost on any of us: an article about the limits of pseudonymisation and the risks posed by AI models was itself produced with the assistance of two AI models, through a process that generated interaction data which, somewhere along the line, will probably be pseudonymised.

_________________________________________________________________

¹ EUR 5 million.
² IQVIA OPERATIONS FRANCE operated two data warehouses authorised by CNIL itself: (i) LRX, supplied by approximately 14,000 pharmacies, and (ii) EMR, supplied by several thousand physicians. The data, comprising large volumes of diagnoses, symptoms, allergies, prescriptions and sick-leave records, were linked to a unique identifier for each patient, enabling the tracking of each individual’s healthcare pathway. IQVIA, which had previously positioned itself as the data controller and had even obtained CNIL authorisation to process such data, was subsequently challenged under the GDPR regarding compliance with its information obligations towards data subjects (among other requirements). In response, it advanced the argument that the data had become anonymous (by virtue of pseudonymisation) and that, accordingly, the GDPR no longer applied to them.
³ See the line of reasoning adopted by the CJEU in the SRB judgment (Case C-413/23 P, September 2025), which, although concerning Regulation (EU) 2018/1725 applicable to EU institutions, was expressly recognised by the Court as having equivalent interpretative value. By analogy, it is therefore relevant to controllers and processors in the private sector, particularly for the purposes of Article 13(1)(e) GDPR, concerning the obligation to inform data subjects about the recipients of their personal data.
⁴ According to CNIL’s own records.
