Why Data Localisation Might Be Hurting Your Privacy - A Privacy Engineer's Perspective

Data Localisation: A Privacy Engineer's Nightmare
Right, let's talk about data. Not just any data, but that juicy, personal data that everyone seems to want a piece of. Governments want to control it, companies want to use it, and privacy engineers? Well, we're stuck in the middle, trying to make sure everything's secure and compliant while not completely ruining the user experience. We're stuck between the laws ⚖️, the lawyers' interpretation of the laws, and the reality of the laws of physics!
In reality, it’s a tangled mess of legal complexity, technical inefficiencies, and unintended security risks. As privacy engineers, we find ourselves juggling compliance, security, and business needs, all while trying not to make the user experience unbearable.
This post dives into the real challenges of data localisation, residency, and sovereignty, unpacking how CDNs, hyperscale cloud computing, geopolitics, and privacy-enhancing technologies (PETs) all play a role. We’ll explore why data access matters as much as data storage, how politics and national pride shape data laws, and why blindly enforcing localisation might actually weaken security rather than strengthen it. Most importantly, we’ll discuss practical solutions—from encryption and tokenization to standard contractual clauses and federated identity systems—that can help privacy engineers navigate this chaos while keeping data safe and compliant.
What's the Diff? Localisation vs. Residency vs. Sovereignty
First, let's get our terms straight, because these words are often thrown around like popcorn at a movie theater 🍿, with everyone vaguely understanding what's going on but not really paying attention to the details.
- Data Localisation: This is where the data must be stored and processed within a specific country's borders. Think of it as building a digital fence around the data, saying, "You shall not pass!".
- Data Residency: Similar to localisation, but sometimes a bit more flexible. It means data should be stored within a country, but processing might happen elsewhere under certain conditions. It's like saying, "You can visit other countries, data, but your primary residence is here."
- Data Sovereignty: This is the big one. It's about a nation's right to govern the data of its citizens and inhabitants, no matter where it's stored. It’s the digital equivalent of national pride and the right to make your own rules. InCountry has a good overview of the countries with data sovereignty laws, and there are many.
Now, I am no lawyer, but although some countries have data localisation laws, it is important to look at the details, as they differ and we should not jump to black-and-white conclusions too quickly. For example, the French 🥐 Health Data Host (HDS) certification framework mandates that any service provider hosting personal health data must store this data exclusively within the European Economic Area (EEA). This requirement applies to both the primary host and any subcontractors involved in the storage of such data. Additionally, if the service involves remote access from outside the EEA, that access must be based on an adequacy decision by the European Commission (see the latest list of adequate countries) or other appropriate safeguards. The host is also obligated to inform clients about the data storage locations and disclose any non-EEA regulations that might require unauthorized access to the personal health data.
And if keeping everything in-country is not possible (for example, how many companies have outsourced IT operations to India?), we have the following tools 🔨 to help:
- Standard Contractual Clauses (SCCs): The most common method for EEA-India data transfers. They require your Indian service provider to contractually commit to GDPR-level protections.
- Binding Corporate Rules (BCRs): Used by multinational corporations for intra-group data transfers. They require approval from EU data protection authorities. There are also several derogations for specific situations (Article 49 of GDPR), so watch out for those.
- Data Localization Approach: Instead of transferring data to India, keep it in the EEA and allow controlled access from India (e.g., via secure VPNs or cloud-based remote work environments).
Long story short, read the full text and don't stop at the headline. I am sure we have all been in a meeting discussing whether we can do something, and there is always someone who says... not allowed 🚫, the data cannot leave the country! Well, there are safe ways... you just need to read the law... yes, privacy engineers... read the law!
The Hyperscale Headache: CDNs, Resiliency, and the Need for Speed
Now, here's where things get tricky. We live in a world of hyperscale computing, where data is replicated across the globe to provide resiliency and speedy access (see the image of this post).
- Content Delivery Networks (CDNs): These are essential for delivering cat videos and important information quickly, by caching data on servers closer to the user.
- Web Application Firewall (WAF): This is a security layer designed to filter, monitor, and block malicious traffic targeting web applications. Cloud providers typically distribute WAFs across their global infrastructure, integrating them with Content Delivery Networks (CDNs) to enhance both security and performance. A key challenge with decentralized security enforcement is inconsistency in policy application. If IP blocklists and security rules must be maintained separately across multiple regions (e.g., 35+ locations), it creates operational complexity, increases costs, and leads to gaps in protection. A centrally managed, globally distributed WAF ensures uniform security policies, reducing administrative overhead and enhancing bot protection, threat mitigation, and customer experience. By automating policy enforcement across regions via cloud-native WAF services, organizations achieve stronger, more consistent security while maintaining agility and cost-efficiency.
- Disaster Recovery: Replicating data across multiple regions ensures that if one data centre goes poof, everything keeps running. Phil Venables, in his blog post "Stressed Testing: Practical Operational Resilience", makes clear that resiliency is about a business-service-oriented view of resilience, as opposed to a business function (department / process) or IT-centric view of resilience.
But how do you reconcile this need for global distribution with data localisation laws? It's like trying to fit a square 🟩 peg in a round 🔴 hole, or trying to explain to your grandma what a blockchain is.
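To make the WAF consistency point a bit more concrete, here is a minimal sketch of how you might detect policy drift when each region keeps its own copy of the rules. The policy layout, region names, and rule names are all made up for illustration and are not any cloud provider's API.

```python
import hashlib
import json


def policy_fingerprint(policy: dict) -> str:
    """Return a stable hash of a policy so two copies can be compared cheaply."""
    # Sort the entries so that ordering differences do not count as drift.
    normalised = {
        "blocked_ips": sorted(policy.get("blocked_ips", [])),
        "rules": sorted(policy.get("rules", [])),
    }
    payload = json.dumps(normalised, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def find_drift(baseline: dict, regional: dict) -> list:
    """List the regions whose deployed policy differs from the central baseline."""
    expected = policy_fingerprint(baseline)
    return sorted(
        region
        for region, policy in regional.items()
        if policy_fingerprint(policy) != expected
    )


# Hypothetical central policy and three hypothetical regional copies.
baseline = {
    "blocked_ips": ["203.0.113.7", "198.51.100.23"],
    "rules": ["block-sqli", "rate-limit-login"],
}
regional = {
    "eu-west": {  # same content, different order: not drift
        "blocked_ips": ["198.51.100.23", "203.0.113.7"],
        "rules": ["rate-limit-login", "block-sqli"],
    },
    "ap-south": {  # one blocked IP missing
        "blocked_ips": ["203.0.113.7"],
        "rules": ["block-sqli", "rate-limit-login"],
    },
    "sa-east": {  # one rule missing
        "blocked_ips": ["203.0.113.7", "198.51.100.23"],
        "rules": ["block-sqli"],
    },
}

print(find_drift(baseline, regional))  # ['ap-south', 'sa-east']
```

With three regions this is trivial. With 35 regions and 35 teams, this check becomes something you have to build, run, and maintain yourself, which is exactly the overhead a centrally managed WAF takes away.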
Geopolitics and Data: A Thorny Relationship
Ah, politics. Never far from anything complex. National politics and changing international relations dramatically influence data strategies. Data is the new oil 🛢️?
Nationalism might be one tendency, where countries want their citizens' data stored locally out of a sense of national pride or a desire to keep an eye on things. Sometimes, though, there are economic incentives: building data centres can mean jobs and investment. Some countries use localisation laws to attract these, hoping to become the next Silicon Valley, but the opposite can be true. A study of six developing countries and the EU-28 found that data localisation regulations could reduce GDP by up to 1.7%, investments by up to 4.2%, and exports by 1.7%.
If that was not enough, the weaponization 🔫 of privacy is another byproduct of geopolitical tensions, where data can become a bargaining chip. Restricting data flows can be a way to exert pressure on other nations.
Finally, the "Trumplet" Effect 🎺 (me trying to bypass AI censorship ;-) is a reality: with changing political landscapes and trade wars looming, international agreements on data flows can be thrown into chaos. Remember the whole "safe harbour" agreement debacle? NOYB released an interesting article on how the US cloud could soon be illegal as Trumplets punch the 🥊 first hole in the EU-US data deal, and the UK has ordered Apple 🍎 to open up access to users' data. It's getting nasty, and none of this helps the people on the ground with their privacy, in my view.
The Illusion of Security: Why Localisation Isn't a Silver Bullet
Here's a harsh truth: data localisation doesn't automatically make data more secure. Think of it like this: putting your valuables in a safe doesn't matter if the safe is made of cardboard and the key is under the doormat. Are you really going to run this better than the CSP? Are you really going to be able to drive a consistent pattern in 35 locations, with 35 separate teams (remember, this is where the data is accessed from)? Really? Yes, you could have a bunch of smart folks in your engineering team and automate it, but automations break, the APIs they call change, and it all needs to be maintained. Be realistic: it is just not a sustainable security strategy. Maybe it gives you a warm feeling, but business does not run on warm feelings!
Today's Cloud Service Providers offer dynamic security practices at scale that dramatically improve every customer’s security posture, given that best-of-breed security is mandatory for a successful hyperscale CSP and security must be fully integrated into the design, development, and operations of hyperscale cloud services. Note that there is still a lot in the shared responsibility model that the customer of the CSP needs to do to keep their consumers' data secure and private, but you want to leverage as much of the heavy lifting as you can.
Here are some reasons why data localisation can sometimes make things worse:
- Concentrated Risk: Storing all data in one location makes it a bigger target for cyberattacks.
- Smaller Providers: Local providers might not have the same security expertise or resources as hyperscale cloud providers.
- Insider Threats: The physical location of data does not significantly impact the core security objectives of confidentiality, integrity, and availability. An insider with legitimate access is a risk wherever the data sits.
These are just some of the reasons. I have already given several examples in this post of why this is just not feasible or sustainable. AWS has a great blog post on this topic.
Accessibility Matters: The GDPR Perspective
But here's the rub: even if you do manage to store your data within a specific country, you're not necessarily in the clear. GDPR, for example, focuses not just on where the data is physically located but also on where it is accessible from. Article 44 of the GDPR prohibits the transfer of personal data to third countries unless specific conditions are met, ensuring that the level of protection of individuals guaranteed by the GDPR is not undermined. Additionally, Article 46 outlines appropriate safeguards for such transfers, including standard data protection clauses or binding corporate rules. Therefore, even if data is physically stored within the EEA, accessing it from a non-EEA country like India (a common outsourcing destination) is subject to GDPR's data transfer provisions, necessitating appropriate safeguards to protect the data during and after the transfer.
Think about it:
- Outsourced Operations: How many companies have outsourced their operations to other countries? If your system is in the UK but the support team is in India, is the data really in the UK?
- Global Access: If employees or contractors in other countries can access the data remotely, does localisation even matter?
- Mutual Legal Assistance Treaties (MLATs): These agreements allow governments to request data from companies even if it's stored in another country.
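The point that access location matters as much as storage location can be sketched as a simple gate in front of remote access. This is a heavily simplified illustration of the Article 44 to 46 logic described above. The country sets are small illustrative samples (not the full EEA or adequacy lists), the function name is hypothetical, and the real assessment belongs with your legal team.

```python
from typing import Optional

# Illustrative samples only, NOT complete or authoritative lists.
EEA_SAMPLE = {"DE", "FR", "IE", "NL", "NO", "IS", "LI"}
ADEQUATE_SAMPLE = {"JP", "CH", "NZ"}
RECOGNISED_SAFEGUARDS = {"SCC", "BCR"}


def assess_access(storage_country: str, accessor_country: str,
                  safeguard: Optional[str] = None) -> str:
    """Classify remote access to EEA-stored data by where the accessor sits."""
    if storage_country not in EEA_SAMPLE:
        # This sketch only models the "data stored in the EEA" scenario.
        raise ValueError("sketch only covers data stored in the EEA sample set")
    if accessor_country in EEA_SAMPLE:
        return "no transfer: access stays inside the EEA"
    if accessor_country in ADEQUATE_SAMPLE:
        return "transfer covered by an adequacy decision"
    if safeguard in RECOGNISED_SAFEGUARDS:
        return f"transfer covered by {safeguard}"
    return "transfer needs a safeguard before access is granted"


# Data stored in Germany, support engineer in India, no safeguard in place.
print(assess_access("DE", "IN"))
# Same access once Standard Contractual Clauses are signed.
print(assess_access("DE", "IN", safeguard="SCC"))
```

Notice that `storage_country` never changes between the two calls. What changes the outcome is who is looking at the data and under which safeguard.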
Predictability, Manageability, Disassociability, and the Multi-IDP Mess
Privacy isn't just about security; it's also about control. NIST, in its paper NISTIR 8062, "An Introduction to Privacy Engineering and Risk Management in Federal Systems", calls out the following privacy engineering objectives to help system engineers focus on the types of capabilities a system needs in order to demonstrate that the privacy policies and system privacy requirements have truly been implemented.
- Predictability: ensures that individuals and system operators have a reliable understanding of how personally identifiable information (PII) is handled, minimizing surprises and maintaining trust through transparency, accountability, and consistent privacy practices.
- Manageability: is the ability to administer, update, and enforce privacy controls over personal data, ensuring accuracy, minimization, and compliance while enabling appropriate oversight and governance of information handling. For me this is one of the key elements to consider. If something is hard to manage, it is hard to secure, and then things are just not going to end well. I could quote many laws, like Gall's Law, Le Chatelier’s Principle in systems theory, or even the Second Law of Thermodynamics, which, while a physical law, metaphorically applies to complex systems: as complexity increases, disorder (entropy) also tends to increase unless actively managed. The long and short of it is that when a system becomes more intricate, unforeseen disruptions and inefficiencies increase. Fact.
- Disassociability: refers to a key aspect of privacy-preserving systems, ensuring that a person's identity or activities remain hidden from unnecessary exposure. Unlike confidentiality, which focuses on preventing unauthorized access, disassociability acknowledges that privacy risks can arise even within trusted environments. It enhances privacy by prompting system designers and engineers to identify and eliminate unnecessary points of exposure, ensuring that only essential data is linked to individuals. This principle closely aligns with data minimization, as it encourages reducing the collection and retention of personally identifiable information whenever possible.
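One common way to work towards disassociability is to give each processing purpose its own pseudonym for the same person, so datasets cannot be casually joined back together. Below is a minimal sketch using keyed hashing (HMAC-SHA256) from the Python standard library. The key values and purpose names are placeholders, and this is one possible technique, not something NISTIR 8062 prescribes.

```python
import hashlib
import hmac


def pseudonym(user_id: str, purpose_key: bytes) -> str:
    """Derive a stable pseudonym for one processing purpose using HMAC-SHA256."""
    return hmac.new(purpose_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical per-purpose keys; in practice these would live in a key management service.
ANALYTICS_KEY = b"example-analytics-key"
SUPPORT_KEY = b"example-support-key"

user = "alice@example.com"
analytics_id = pseudonym(user, ANALYTICS_KEY)
support_id = pseudonym(user, SUPPORT_KEY)

# Same user and same purpose always give the same pseudonym,
# so records can still be joined within a purpose.
print(analytics_id == pseudonym(user, ANALYTICS_KEY))  # True

# Different purposes give unrelated pseudonyms, so the two datasets
# cannot be linked on this identifier without access to the keys.
print(analytics_id == support_id)  # False
```

The identifiers are still personal data for whoever holds the keys, so this reduces unnecessary linkage rather than making anyone anonymous.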
Now let's take the example of a company trying to run an IDP for its consumers and customers across 35 countries. Imagine you're the privacy engineer trying to uphold these principles in a world of multiple Identity Providers (IDPs), each with its own rules and regulations, spread across many countries. It's a nightmare! This will lead to, inter alia:
- Inconsistent Policies: Each IDP might have different privacy policies, making it hard to ensure consistent protection.
- Federation Headaches: Federating these IDPs across multiple countries? You're just asking for trouble. Different legal standards for law-enforcement access, data retention, data security, censorship, and other data-related requirements can put a firm into a legal catch-22.
- Lack of Visibility: It's hard to keep track of where data is, who has access, and what they're doing with it.
- Increased Strain on Technology: Many institutions have difficulty identifying all systems that store sensitive information and deploying controls for them.
Privacy Theatre or Genuine Concern?
So, is all this data localisation just "privacy theatre"? Are we weaponizing people's rights for other reasons that care very little about this fundamental right? It's a valid question.
- Economic Protectionism: Could data localisation be a way for countries to protect their local industries and create trade barriers?
- Government Surveillance: Does keeping data within borders make it easier for governments to spy on their own citizens?
- Geopolitical Power Plays: Is data sovereignty just a new form of nationalism, a way for countries to assert their power in the digital world?
PETs to the Rescue? Privacy-Enhancing Technologies
So, is there any hope? Well, maybe. Privacy-Enhancing Technologies (PETs) offer a glimmer of light in this otherwise gloomy landscape.
Here are some PETs that might be useful to consider:
- Data Minimisation: This is the most basic, Mickey Mouse one. "Can we achieve the same with less privacy-invasive methods, or while processing less personal data?" Only collect and retain what you absolutely need. If you don't need it, don't store it. This is the cheapest and best PET of them all, but it requires people to change their ways... hard, is it not?
- Encryption: Encrypting data with keys that only you control means you can store it anywhere, and it's useless to anyone without the key. This is often called "Bring Your Own Key" (BYOK), Customer-Managed Keys (CMK), or Customer-Supplied Encryption Keys. All CSPs have flavours of it: AWS has Customer-Managed Keys (CMK) in AWS KMS, Microsoft Azure has Customer Managed Keys in Azure Key Vault, and Google Cloud Platform (GCP) has Customer-Supplied Encryption Keys (CSEK).
- Tokenization: Replacing sensitive data with non-sensitive tokens allows you to process data without exposing the real stuff.
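- Data Anonymisation: Irreversibly removing the link between the data and the individual. Note that encryption on its own is generally treated as pseudonymisation rather than anonymisation under GDPR, because whoever holds the key can reverse it.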
- Homomorphic Encryption: Performing computations on encrypted data without decrypting it. If I were you, I would focus on the simple solution to complex problems first.
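Two of the simple PETs above, data minimisation and tokenization, fit in a few lines. The sketch below is illustrative only: the `TokenVault` class is a toy in-memory stand-in for a hardened vault service, and the field names and allow-lists are assumptions made up for the example.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault. A real vault would be a hardened, access-controlled service."""

    def __init__(self) -> None:
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to the same token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_urlsafe(16)  # random, carries no information
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        return self._token_to_value[token]


ALLOWED_FIELDS = {"email", "country"}  # hypothetical: what this purpose actually needs
SENSITIVE_FIELDS = {"email"}           # hypothetical: fields replaced by tokens


def minimise_and_tokenize(record: dict, vault: TokenVault) -> dict:
    """Drop fields the purpose does not need, then tokenize the sensitive ones."""
    minimal = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    return {
        k: vault.tokenize(v) if k in SENSITIVE_FIELDS else v
        for k, v in minimal.items()
    }


vault = TokenVault()
record = {"email": "alice@example.com", "country": "FR", "date_of_birth": "1990-01-01"}
safe = minimise_and_tokenize(record, vault)

print(sorted(safe))                      # ['country', 'email']
print(safe["email"].startswith("tok_"))  # True
print(vault.detokenize(safe["email"]))   # alice@example.com
```

In a localisation context, one option is for the vault to stay in the regulated country while only the tokenized record travels to the global systems. Whether that satisfies a given law still depends on reading the law.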
The Way Forward: Pragmatism, Collaboration, and a Bit of Luck
Navigating the world of data localisation is a challenge, but not an insurmountable one. Here's my advice for privacy engineers:
- Embrace Pragmatism: You can't fight every battle. Focus on the most critical data and the most stringent regulations.
- Read the Law: Do not stop at the headline. The law is often flexible, provided the right controls are in place.
- Advocate for Standards: Push for international standards and agreements on data privacy and security.
- PETs: These technologies are the future of privacy, but do not start with the fancy ones. Start with the simple ones!
- Collaborate: Work with legal, security, and business teams to find solutions that work for everyone. Keep talking!
- Keep Learning: The landscape is constantly changing. Stay informed, stay agile, and try to maintain a sense of humour.
Final Thoughts
Data localisation, residency, and sovereignty are complex issues with no easy answers. But by understanding the challenges, embracing new technologies, and working together, we can build a future where data is both secure and accessible, where innovation thrives, and where privacy is respected.