Data breaches, data leaks, data brokers, data data data. It's repeated over and over in the media, but what the hell even is "data"? What can the recipient of that data do with it? Is it a bad thing? Should you be worried about it? That's what this month's column is all about: the data industrial complex and how it can be used against you by ratbag companies.
What is data anyways?
In this context "data" is information you give to the organisation running a service you use, either directly or indirectly. Stuff like your email address, a password, your credit card, your phone number, a driver's licence – that's all direct data. You gave them that information because they need it in order to provide you with a service, free or paid.
Then there's data you don't hand over on purpose, called metadata. It's information that's spewed out of your device during normal activity because computers are snitches, and companies like to hoard that chatter because they think it'll make them more money.
The amount of data a website or app can collect without you knowing is huge, but some common examples include:
- Time & date
- IP address
- Location coordinates
- Device type (e.g. iOS/Android/Mac/Windows)
- Websites or apps you use before and after
Apple has a list of all the data an app from the App Store can collect, and it's pretty wild to see it laid out in a table. Google, Microsoft, Meta and other platforms do just as much of this type of data collection, if not more, but are less transparent than Apple about it.
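To make the snitching a bit more concrete, here's a rough Python sketch of what a server can jot down from a single page request without asking you a thing. The function name, the field names, and the crude device guess are all made up for illustration; real tracking code is far more thorough than this.

```python
from datetime import datetime, timezone


def metadata_from_request(remote_ip, headers, now=None):
    """Build a metadata record from one request (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    user_agent = headers.get("User-Agent", "")

    # Very rough device guess from the User-Agent string.
    device = "unknown"
    for marker, name in (
        ("iPhone", "iOS"),
        ("Android", "Android"),
        ("Macintosh", "Mac"),
        ("Windows", "Windows"),
    ):
        if marker in user_agent:
            device = name
            break

    return {
        "timestamp": now.isoformat(),          # time & date
        "ip_address": remote_ip,               # IP address
        "device_type": device,                 # device type
        "came_from": headers.get("Referer"),   # the page you were on before
    }


# Example request, using a reserved documentation IP address.
record = metadata_from_request(
    "203.0.113.7",
    {
        "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)",
        "Referer": "https://example.com/some-article",
    },
)
print(record)
```

None of those fields were typed in by you. They simply arrive with the request, which is why they're so easy to hoard.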
What can someone do with my data?
On their own, data and metadata don't look so bad. Who cares if a business knows your IP address or the websites you visit? But when you combine all this stuff it gives them an amazing overview of who you are. It's literally the business model of some of the world's richest companies, like Meta and Google.
Most people probably think that's a bit gross, but whatever, it's advertising and hey, I get free email and social media out of it. I've got nothing to hide! Well, it all goes to shit when the government and cops get involved.
These huge repositories of data become pots of gold for police, who can fire off subpoenas to these companies or, in the most dystopian of cases, get direct access via software like Palantir or Auror to view all the data on a suspect. The cross-pollination of data from all of these sources can be built into a mega dossier on people, all at the fingertips of governments and law enforcement and ready to be abused and taken out of context.
Then there's the whole shitshow of a data-hoarding company experiencing a data breach. Hackers often leak it all on the dark web, where it's merged with data from other breaches to make a huge database perfect for identity theft and blackmail.
Take this unfortunately common hypothetical scenario: Company A got hacked in 2021, and hackers got your name, phone, email address, and driver's licence. Company B got hacked in 2023, and hackers got your name, phone, email address, residential address, and passport number.
Because both sets of data are available to buy on the dark web, a fraudster can match the name, phone, and email address, then combine that with the passport details, driver's licence, and residential address. That's enough personal information for some serious identity theft, which neither breach would have enabled on its own.
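If you want to see how little effort that matching takes, here's a minimal Python sketch of the Company A and Company B scenario. Every record, field name, and the `merge_breaches` function are invented for illustration; it's a toy, but the real thing isn't much harder.

```python
# Entirely fake records standing in for two leaked datasets.
breach_a = [  # "Company A", hacked in 2021
    {"name": "Sam Example", "phone": "0400 000 000",
     "email": "sam@example.com", "drivers_licence": "DL-FAKE-123"},
    {"name": "Alex Sample", "phone": "0400 111 111",
     "email": "alex@example.com", "drivers_licence": "DL-FAKE-456"},
]
breach_b = [  # "Company B", hacked in 2023
    {"name": "Sam Example", "phone": "0400 000 000",
     "email": "Sam@Example.com", "address": "1 Fake St",
     "passport_number": "PA-FAKE-789"},
]

# The fields that appear in both breaches and identify the same person.
MATCH_KEYS = ("name", "phone", "email")


def match_key(record):
    """Normalise the shared fields so trivial case differences still match."""
    return tuple(record[k].strip().lower() for k in MATCH_KEYS)


def merge_breaches(first, second):
    """Join two leaked datasets on the shared fields (hypothetical helper)."""
    index = {match_key(r): r for r in first}
    merged = []
    for record in second:
        earlier = index.get(match_key(record))
        if earlier is not None:
            # One combined profile holding everything from both breaches.
            merged.append({**earlier, **record})
    return merged


profiles = merge_breaches(breach_a, breach_b)
print(profiles)
```

The combined profile for Sam holds the licence from the first breach plus the address and passport number from the second, and all it took was a dictionary lookup.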
There's a whole market of data wholesaling done by companies calling themselves data brokers. They buy data from every conceivable source – apps, websites, banks, shops, car infotainment systems, watches, and more – bundle it all up, and sell the lot to whoever wants it – and it's totally legal.
Some famous customers of data brokers are the USA's intelligence agencies. They love using this commercially available data instead of doing their own nefarious spying – that's how thorough the data collected by businesses is these days. The EFF’s excellent Surveillance Self-Defense guide has a section on metadata that explains how groups like governments and intelligence agencies can use metadata to paint a picture of someone’s life without them knowing.
Can't leak data that doesn’t exist
A common analogy in cybersecurity circles is comparing data to toxic waste. It can sit for years, causing no harm until one day you realise it's leaked all over the place and is ruining the environment. Toxic waste is also expensive to handle properly, so it ends up being carelessly dumped to save money.
Applying that logic to data means companies simply shouldn't be collecting that data in the first place, and when they’re done with whatever they do collect, it should be deleted. You can't have a data leak if there's no data to leak! If the minimal amount of data a company does have manages to escape (either legally or illegally), then at least the scope of that data leak is small.
In practical terms that means companies should only collect the bare minimum needed to do their job and have policies to delete data that's no longer necessary for them to do their thing. Despite sounding like common sense, the concept of data minimisation is relatively new and many businesses simply don't think about the consequences of their data collection activities – or worse, data collection is how they make money.
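For the curious, data minimisation can be as boring as two small rules in code: only keep the fields you actually need, and bin records once you no longer need them. This is a hedged sketch rather than anyone's real policy; the field names, the 90-day window, and both function names are assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical: the only fields this imaginary shop needs to post an order.
NEEDED_FIELDS = {"email", "delivery_address"}

# Hypothetical retention window. A real one depends on the business and the law.
RETENTION = timedelta(days=90)


def minimise(submitted):
    """Keep only the fields on the allow-list and drop everything else."""
    return {k: v for k, v in submitted.items() if k in NEEDED_FIELDS}


def purge_expired(records, now):
    """Return only the records still inside the retention window."""
    return [r for r in records if now - r["last_needed"] <= RETENTION]


signup = {
    "email": "sam@example.com",
    "delivery_address": "1 Fake St",
    "date_of_birth": "1990-01-01",   # not needed to post a parcel
    "location": "-33.86,151.21",     # not needed either
}
print(minimise(signup))

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stored = [
    {"email": "old@example.com", "last_needed": now - timedelta(days=400)},
    {"email": "new@example.com", "last_needed": now - timedelta(days=10)},
]
print(purge_expired(stored, now))
```

Anything dropped by `minimise` or removed by `purge_expired` is data that can't turn up in a breach later.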
Can I do anything about it?
We mere cogs in the machine are in an awkward position, relying on the awareness of businesses to do something that's not the status quo and possibly against their own interests.
While there are some technology-based solutions, like using an ad-blocker or a VPN to disguise your location, or meticulously going through your device's settings and turning off features, they're not foolproof and can exclude you from mainstream services everyone else takes for granted. There's no ethical consumption under capitalism, remember?
The best thing you can do is be aware of excessive data collection and of data minimisation, choose businesses that try to reduce the data they collect and store, and lobby for stronger laws to end, or at least reduce, the impact of the entire data collection industrial complex. Digital rights groups working on this include:
- Electronic Frontier Foundation (USA)
- Digital Rights Watch (Australia)
- European Digital Rights (EU)
- Open Rights Group (UK)
Got a tech question for Ada? She wants to hear from you!
Ada answers all your questions about tech, the online world, and staying safe in it. No question is too silly, no hypothetical is too far-fetched! Learn to leverage devices, systems, and platforms to your benefit.