Minimization and contamination

Today is my birthday, and that's because from now on, I no longer identify with my biological age of ██ but with my social-emotional age of 19.

I did notice that my social-emotional age varies from day to day, so I have my birthday more regularly. Tomorrow afternoon, I will probably also move to Moeraswantsstraat 25, just down the road.

The reason for this is a sheet of A4 paper I received at the beginning of some course. On it were all the names, addresses, phone numbers, email addresses, and birthdates of myself and my fellow students in the classroom. Identity data that ended up on that sheet through the infinite internet universe because we all neatly and truthfully filled in all the fields on the web form during registration. I have since destroyed the paper sheet, but there are still ten in circulation, apart from various invisible digital copies in databases, emails, and files. That's the annoying thing about a data breach: you can't fight it afterwards, only beforehand.

I doubt whether this educational institute will send me a bottle of champagne on my biological birthday (gift tip: Bollinger Special Cuvée Brut) and inform me of this joyful event by phone. In short, there is probably no need to ask for all this information, because, even though I keep a sharp eye on the PostNL delivery van, I haven't seen any bottle yet. A data breach should be fought in advance, and in that category, data minimization is the most effective measure. In other words, collect as little data as possible, because if you collect less data, you have less data to secure, and as an organization, you can inherently leak less data. Data minimization is also one of the basic principles in the General Data Protection Regulation (GDPR), which has been in effect since 2018.

You would think that collecting less data would also require less effort, but in practice the opposite turns out to be true. Paradoxically, it is easier to collect more data, because then you can never regret having left something out. Data maximization is the daily practice, a kind of implicit standard, while data minimization should be the standard. After all, other people's data are just other people's data, so matters such as storage limitation and ensuring confidentiality and integrity are simply expensive and complicated, where complicated is another word for expensive, but in the cognitive domain. In short, wherever costs in money and brain capacity stand in the way of desired behavior, legislation arises after a while to restore the balance to the detriment of undesirable behavior. Thus, under the GDPR, a data breach can cost an organization up to 4% of its worldwide annual turnover, a daunting fine.

In practice, though, that rarely happens. For example, the 2016 annual report of the InterContinental Hotels Group contained a brief announcement about a data breach. Annoying. Sorry. You can follow the handling of it in slow motion in every annual report up to and including 2019. In a small aside, they mentioned almost proudly that not a single person had booked one room less as a result of the data breach. Apparently, no hotel guest cares enough for it to show up in the reservations. That seems a strange way of saying sorry, but an annual report is a communication tool aimed at shareholders. That one sentence is meant to pop up a sort of thought bubble above the head of every shareholder with the reassuring text: "our data may not be safe with them, but our pension investments are." Fortunately, there are still regulators who price this kind of digital pollution on behalf of all of us, but here too I read in an annual report that they managed to talk down the initial fine of 100 million pounds considerably. Mitigating circumstances, probably; they are also just victims, after all. In 2022, this hotel chain reported a new data breach.

The consequences of a data breach can be significant because identity data can facilitate fraud. They can also be used by criminals to build trust, for example, by calling people on their mobile phones while they are cooking: “Good evening, madam, this is The Dutch National Bank speaking. We have detected that your bank account has been hacked, and my colleague from the fraud department will assist you shortly to safely transfer your money to our vault account. I will immediately set up an appointment for you at three o’clock tomorrow afternoon at the ING office nearby for further handling. You live at Zwitserlandweg 81, right? Great, then I will look up the nearest office while I connect you to my colleague at the fraud desk, as there is some urgency.”

This colleague from the so-called fraud desk didn't waste any time. Within two minutes, he had my elderly mother search for the computer program AnyDesk via Google, download it, and install it, thereby gaining access to her computer. A big compliment to AnyDesk is in order here, as their customer journey is bizarrely well-optimized. The website and software of AnyDesk make it possible to onboard any digital novice with any laptop at any time within two minutes. Indeed, within two minutes, you are on, pun intended, any desk. Anyway, when my mother had to explain to a surprised bank clerk at the counter the next afternoon that she was there for 'the vault account,' which didn't exist at all, she realized that her money was gone. She only dared to call me at eight o'clock in the evening.

How did these criminals get her name, address, and mobile phone number? Did she become a victim of her own honesty on the order form of, for example, Allekabels.nl, which at that time was not yet known as the largest Dutch data breach ever? That data breach also contained birthdates, which criminals can use to target older victims. Moreover, postal codes provide a wealth of information that becomes even more sparkling when enriched with a large database from the CBS (Statistics Netherlands) website, where all socio-economic indicators are detailed. With this information, criminals can target postal codes with many owner-occupied houses and few welfare recipients, or other indicators of affluence. All this data and possibilities combined can result in a very profitable criminal telemarketing campaign in a very short time.

Should we all start lying defensively on websites that, for unclear reasons, want various details from us? We could make defensive lying acceptable by calling it something else, but perhaps it's better to look at it differently. We don't have to treat data and information as synonyms; we can treat them as separate abstractions, similar to sex and gender: sex is a biological construct, and gender is a social one. In a similar way, we can view information as a personal construct and data as a public one: the first shell around 'me' holds 'information,' which is personal, and around that sits a second shell of 'data,' intended for public consumption.

Since we now view information and data as two separate concepts, we can also treat them separately. From the data-collecting side, we can expect little other than leaks. However, we do have control over what we share, and this somewhat academic distinction between information and data enables us to take measures on our side. Let's start with the first shell: information. For this, we can reuse the principle of data minimization. The logic is similar: what you don't share doesn't need to be protected and therefore can't leak. Unfortunately, collectors often oblige us to fill in various details, and then data minimization is of no use: we must share something. Here we transition from information to data, the second shell, because why should we share our real details on hundreds of data-maximizing websites? If you have to share something anyway, the principle of data contamination comes into play. You don't share your (real) information but (made-up) data, creating an actual separation between personal and public. In other words: minimize where possible (personal shell), contaminate where necessary (public shell).

That is why I am moving to Moeraswantsstraat tomorrow: I plan to order something online. This street doesn't actually exist, which is admittedly inconvenient for deliveries. In any case, Moeraswantsstraat works excellently on sites that have no idea how to perform automatic postcode-street-number checks and pass all input through unverified. If Moeraswantsstraat is rejected, that actually says something positive about the data management of the organization in question. In this way, the savvy cyber citizen gathers information about potentially sloppy data collectors. For organizations that are about to send goods, I must fall back on the Bumblebee delivery boxes at the local Gamma on Houtpantserjufferlaan. Thus, the new abstraction between information and data provides not only extra hygiene but also extra exercise. Then there's my phone number. That's tricky. I could give a non-existent number, but what if they send a text message with the delivery time? Or what if the bank sends one of those handy text messages saying my card is about to expire and I have 3 days to enter my old PIN on the linked website, choose a new PIN, and send back the old card, quickly, before I can't pay anymore? Anyway, perhaps I can equip grandpa's chewed-up hand-me-down phone with the cheapest data-less mobile plan and place it in the living room, making this mobile phone the visible reminder of the distinction between private information and public data.

Ironically enough, despite years of working in cybersecurity, despite reading cynical annual reports, and even despite my mother being cyber-robbed, the necessity of minimizing and contaminating had never really hit home for me. At least, the necessity had, but not in a way that spurred me into action. It was one of those topics that still floated in the wrong quadrant of a matrix with important/not important on one axis and urgent/not urgent on the other. No, it was only upon seeing a paper A4 sheet with my identity details and those of my fellow students that the feedback loop between providing and collecting was short enough to activate my reptilian brain. It was enough to move the topic to the important-and-urgent quadrant, and I metaphorically moved along with it, to my new phantom address, spurred on by a paper data breach, something all the virtual and digital breaches before had never managed to do.