Perversely patient-centric? How data brokers figure out you’re pregnant

Privacy experts say data aggregators should think twice about the predatory practices their data could enable.

Stock image of a pregnant woman using a computer
The worry is not only that data can be harnessed by prosecutors in states where abortion is illegal. (Photo credit: Getty Images).

Companies in the data-brokering industry have been amassing and selling troves of information on pregnant people for years. These adtech actors, unsurprisingly, say they’re on the right side of the law. But with the potential to abuse health data higher than ever in the wake of the Dobbs v. Jackson Women’s Health Organization decision, privacy experts argue it may be time to rein in the most perversely patient-centric of their efforts.

Data aggregation is an industry made up of hundreds of small brokers. Among their wares, many of them sell datasets on expectant parents, from their purchase history to expected date of birth. These vendors are building product offerings that, abortion-rights activists warn, are likely to be weaponized by states with abortion bans. 

The data have become a political issue, with various congressional Democrats attempting to pressure brokers to curb the practice. Politicians have vowed to question the companies about their practices and have introduced bills to exempt reproductive health data from what's allowed to be collected and sold. Even so, the aggregators themselves are largely standing pat.

“As far as I know, there’s no law today that prohibits prenatal mailing lists,” an executive from NextMark, an online directory which hosts marketing email lists from data brokers, told Politico. “If that were to change and this type of data became illegal, we’d work with the providers to remove these listings.” 

Politico found more than 30 listings from brokers, such as Exact Data and PK List Marketing, offering information on expecting parents or selling access to those people through mass email blasts.

About the legality of these practices, Eric Perakslis, chief science and digital officer for the Duke Clinical Research Institute and professor at Duke School of Medicine, said that NextMark was on safe ground. “They’re not wrong about that,” he added.

Nor are the aggregators and brokers solely to blame. Health data typically are sold to them by upstream sources, such as hospitals, tertiary care facilities, labs or MRI centers. These facilities obtain patient data under a business associate agreement, which, under the Health Insurance Portability and Accountability Act (HIPAA), doesn’t prevent them from selling the data so long as it’s de-identified, Perakslis noted.

“When you look at the provenance or chain of custody of the data, an argument a lot of aggregators make is, ‘If it’s that problematic, why are people selling us this data?’” he explained. “‘Shouldn’t they not be selling it if it’s a problem? They’re the ones that collected it.’”

Oftentimes, they’re not even buying health data per se, but instead inferring pregnancy status through other means. 

“A lot of these data brokers aren’t as sophisticated as you think,” said Mark Kapczynski, SVP of strategic partnerships for OneRep, a privacy protection company. “The reality is, they’re really just hustling.” 

Which is to say: aggregators just want to create a pool of data and sell it as quickly as they can. “They don’t really care if it’s 100% accurate or not. That’s not their business,” Kapczynski continued. “As long as they’re close enough, people will buy it. And so that’s what they do.”

Nor are they necessarily buying personal health information, which is regulated under HIPAA. Rather, they may be securing demographic data from people look-up sites (think Spokeo, MyLife, Intelius or Instant Checkmate) or making assumptions about pregnancy status based on publicly available marriage records.

Kapczynski explained how this works: “Every state tracks the records as to when someone should be pregnant, or the typical pregnancy rate in their state. If you look up a state — say, New Mexico — it’s 2.3 years after marriage that someone gives birth. So now I don’t actually need to know if your wife’s pregnant or not. I don’t need to fact-check it. I just need to know when you guys get married.”

As a result, brokers can model when people “should be” pregnant, Kapczynski added. “So in fact, you don’t even need to be accurate. You just need to be close.”
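To show how little information such a model needs, the approach Kapczynski describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any broker's actual method: the function names are hypothetical, the 2.3-year interval is simply the New Mexico figure from his quote rather than a verified statistic, and the 280-day pregnancy length is an assumed round number.

```python
from datetime import date, timedelta

# Illustrative only: the 2.3-year marriage-to-birth interval is the
# New Mexico example from the quote above, not a verified statistic.
AVG_YEARS_MARRIAGE_TO_BIRTH = {"NM": 2.3}

# Assumption: a pregnancy lasts roughly 40 weeks.
GESTATION_DAYS = 280


def modeled_pregnancy_window(marriage_date: date, state: str) -> tuple[date, date]:
    """Return the (start, end) dates during which the model assumes a pregnancy."""
    years = AVG_YEARS_MARRIAGE_TO_BIRTH[state]
    modeled_birth = marriage_date + timedelta(days=round(years * 365.25))
    return modeled_birth - timedelta(days=GESTATION_DAYS), modeled_birth


def flagged_as_expecting(marriage_date: date, state: str, today: date) -> bool:
    """True if today falls inside the modeled window; nothing is fact-checked."""
    start, end = modeled_pregnancy_window(marriage_date, state)
    return start <= today <= end


# Hypothetical record: a couple married in New Mexico on June 1, 2020.
window = modeled_pregnancy_window(date(2020, 6, 1), "NM")
print("Modeled window:", window[0], "to", window[1])
```

The only inputs are a marriage date and a state average. Nothing about the individual's health is consulted, which is why the output can be "close" without being accurate.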

Aggregators can also buy marketing data that identifies buying patterns and use it to infer pregnancy status. This played out in the now-infamous “Target episode,” in which the retailer determined, by analyzing its own transaction records, that a teenager was pregnant and then began marketing to her.

Data brokers, for their part, contend that what they’re providing is a “beneficial resource” for pregnant people — connecting new parents with discounts on staples like diapers and baby formula. But while this information may be useful to marketers, aggregating and triangulating many different data points on parents can have legal consequences.

Take the case in which text messages and search histories were used by law enforcement to enforce abortion laws. Or the one in which digital evidence was leveraged to secure a feticide conviction against a woman for illegally inducing her own abortion.

The worry is not only that data can be harnessed by prosecutors in states where abortion is illegal to identify people who terminate their pregnancies. Prosecutors could also subpoena data on pregnant people in the state and combine it with location data from a different data broker to determine that a person traveled across state lines to an abortion clinic.

Some of the email list purveyors say they wouldn’t allow their lists to be used for campaigns for or against abortion rights. But Perakslis pushed back on the notion that the brokers retain absolute control over their own databases.

He pointed to “the mass of what’s being created and the amount of data that can end up on the dark web. You build something really big and it becomes really attractive to state actors, versus simple criminals. And the data is so untraceable, I don’t even know if these people would know they were hacked.”

From a cyber-risk perspective, aggregated consumer databases can just as easily be used by nefarious actors as they can be used for good. Data aggregators “are not returning any value for their data, but they’re creating risk for people,” Perakslis observed. So even though they aren’t strictly illegal, amassing and trading in pregnancy data raise ethical concerns.

Lawmakers have talked for years about passing some form of data privacy regulation, and two recent proposals would specifically block collection of health-related data. Sen. Ron Wyden introduced the My Body, My Data Act of 2022 to limit reproductive health data collection. A bipartisan bill, the American Data Privacy and Protection Act, passed out of committee on July 20. But the bills have little backing from Democrats and “aren’t expected to gain wider support,” Politico reported.

Placer.ai, a location data firm, once offered data visualizations showing where visitors to Planned Parenthood facilities live. But the firm stopped offering data on abortion clinics after an article in Vice uncovered its practices, as did data broker SafeGraph. Data behemoth Experian did the same in 2016. 

But these examples are not the norm, and other data brokers say they’re not about to change their practices willingly. In the meantime, privacy experts say data aggregators should think twice about the predatory practices their data could enable.

Perakslis stressed the need to consider the current backdrop of the information and culture wars. Just as COVID-19 unleashed an infodemic of false or misleading information, the Dobbs decision freed states to pass laws that threaten to go beyond typical law enforcement in getting people to comply with their anti-abortion agenda.

The possible harms of aggregated data include re-identification, which could lead to embarrassment, harassment and prosecution, as well as to selecting individuals and groups for deeper surveillance via stalkerware. Re-identification can also result in financial and reputational damages to individuals, communities or geographies based upon geotagged linked data. 

“This issue of weaponization of data is not the sole domain of one political party or another or one stance or another,” Kapczynski said. “The polarization has gotten so great that, no matter what your stance is on something, the person who’s opposing you now wants to use data to their advantage.”

He noted that one of OneRep’s healthcare clients, a large hospital chain, signed with the company because employees were being harassed by patients. “These staffers were being tracked down at home because, you know, ‘Don’t give so-and-so the vaccine,’ or ‘Do give them the vaccine, why didn’t you give it to them?’”

Indeed, misuses of intimate personal data can stem from a practice that is not technically illegal. 

“When we know up front there’s a problem, the legality of something is not a good excuse not to act on it,” noted Perakslis. “There are always going to be the unknown-unknowns in cybersecurity. The data aggregators are not paying attention to the known-unknowns or even the known-knowns of why harm could happen.”

This story first appeared on mmm-online.com. 

