Data Brokers Part 1: Unregulated Forensic Analysis — No One's Happy


You have a profile. It sits in a commercial database operated by a company you have likely never heard of, available to any business with a cloud subscription — which is, by now, every sizable business.

You may have felt some of this already. The airfare that rises between your first search and your second. The delivery apps, the Instacart drama, and now sports tickets. There is a growing roster of companies that charge different people different amounts for the identical item, depending on who is asking. If you’re not interested in the mechanics of data brokers, skip to the second part of this series, “Surveillance Pricing.”

You already know some of the fields on a common personal data profile. You know the credit bureaus have your loans and the DMV has your tickets. But what you may not know is that the problem is no longer just how much they hold — it is how freely it moves, how accurately it’s combined, and how little law stands between it and the decisions made about you.

Acxiom, just one such data broker, holds files on roughly 260 million Americans. It organizes them into 162 million households and maintains about 1,500 core attributes on each one, drawn from a catalog of more than ten thousand. [1] Acxiom is a subsidiary of Omnicom, the advertising holding company. Their catalog is not hidden; an older edition sits on the Internet Archive [2] and I urge you to glance at it. When the Senate investigated the industry in 2013, Acxiom held about 3,000 “propensities” per consumer; the catalog now exceeds ten thousand attributes. [3]

Acxiom’s catalog reads less like a marketing database and more like a shadow census of American life combined with a financial dossier and a predictive surveillance system. The company tracks life events such as births, marriages, divorces, and home purchases; family-network attributes such as “Potential Inheritor” and “Adult with Wealthy Parent”; health indicators including diabetes, cholesterol, disability, and prescription-drug behavior; and detailed financial estimates covering income, net worth, home equity, mortgages, lender relationships, and borrowing capacity. It doesn’t just record who people are — it attempts to predict their wealth, health, family connections, and future behavior.

The industry claims that data-driven decisions are fairer than the human judgment they replaced. “Fairer decisions, broader access, and lower costs.” A loan officer can dislike a face, and a model cannot. When Berkeley economists examined the pricing of millions of mortgages, lenders charged Black and Latino borrowers nearly eight basis points more in person, and algorithmic lenders discriminated roughly 40 percent less. [4] When the CFPB monitored Upstart’s underwriting model, the model approved 27 percent more applicants than a traditional scorecard, at 16 percent lower average rates, with the gains spread across every race, ethnicity, and sex segment the Bureau tested. [5]

But notice what the case defends. Six months after the Bureau published Upstart’s results, the Student Borrower Protection Center ran its own test: identical hypothetical applicants, a single field varied — the college attended — and the graduate of Howard University was charged $3,499 more over the life of a five-year loan than the same applicant carrying a degree from NYU. [6] The Bureau asked a portfolio question: across all applicants, were approval rates and prices worse for protected groups than under a traditional scorecard? They were not — the model cleared the bar, and the bar was the old model. The audit asked an applicant’s question: hold everything constant, change only the school, and watch the price. It moved $3,499. Not because Howard costs more — because the model graded each school by the average financial outcomes of its alumni and graded each applicant by their school, and the averages of a historically Black university carry the racial wealth gap inside them.

A model can approve more Black borrowers overall and still surcharge the graduates of historically Black colleges, the penalty dissolving into a favorable mean — and the person paying the surcharge experiences no average. Passing the first test does not absolve the second.
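The difference between the two questions is easy to show in miniature. The sketch below is a toy, not Upstart’s model: the pricing formulas, the school index values, and the applicant pool are all invented for illustration. It runs the portfolio test and the applicant test against the same hypothetical model, and the model passes the first while failing the second.

```python
# Toy illustration only. Every formula and number below is hypothetical.

# Hypothetical index of average alumni financial outcomes, used as a pricing input.
SCHOOL_INDEX = {"School A": 0.9, "School B": 0.6}


def new_model_rate(income, school):
    """Hypothetical new model: the rate falls with income and with the school index."""
    return round(20.0 - income / 10_000 - 5.0 * SCHOOL_INDEX[school], 2)


def old_model_rate(income):
    """Hypothetical traditional scorecard: ignores school, prices higher overall."""
    return round(22.0 - income / 10_000, 2)


def portfolio_test(applicants):
    """Portfolio question: per group, is the new average rate no worse than the old?"""
    passed = {}
    for group in sorted({a["group"] for a in applicants}):
        members = [a for a in applicants if a["group"] == group]
        new_avg = sum(new_model_rate(a["income"], a["school"]) for a in members) / len(members)
        old_avg = sum(old_model_rate(a["income"]) for a in members) / len(members)
        passed[group] = new_avg <= old_avg
    return passed


def applicant_test(income, school_x, school_y):
    """Applicant question: same applicant, only the school changes. Returns the rate gap."""
    return round(new_model_rate(income, school_x) - new_model_rate(income, school_y), 2)


applicants = [
    {"group": "group 1", "income": 50_000, "school": "School A"},
    {"group": "group 1", "income": 70_000, "school": "School A"},
    {"group": "group 2", "income": 50_000, "school": "School B"},
    {"group": "group 2", "income": 70_000, "school": "School B"},
]

print(portfolio_test(applicants))  # every group does better than under the old model
print(applicant_test(50_000, "School B", "School A"))  # yet the school alone moves the rate
```

Both groups come out ahead of the old scorecard on average, so the portfolio test passes. The applicant test still shows a positive gap for the same income, because the school index is priced directly.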

The ecosystem

Acxiom is not an outlier. It is one registrant among many: California’s data broker registry — the only meaningful public census of the industry, because brokers must register there or face penalties — passed 575 companies this February, up from 459 a year earlier, and registration is required only of firms whose business is selling data about people they have no relationship with. [7] The number is a floor: when the Privacy Rights Clearinghouse and the Electronic Frontier Foundation merged all five state registries in 2025, they found about 750 unique broker groups. [8]

Every ad-supported page you load broadcasts your advertising ID, location, and IP address to every company eligible to bid on the ad slot. Your car reports home continuously; when Mozilla reviewed 25 car brands in 2023, all 25 failed on privacy — the first time in the guide’s history an entire category failed — because 84 percent share data with brokers, 76 percent reserve the right to sell it, and the margins on a data feed are pure profit on a car already sold. Nissan’s privacy policy listed the collection of sexual activity and genetic data. [9]

Equifax’s Work Number receives payroll records directly from employers and covers most of the American workforce, while its National Consumer Telecom & Utility Exchange aggregates phone and utility payment behavior into lending models. Most people are covered by both.

All of that arrives as fragments: a record tied to an email here, to an advertising ID there, to a Social Security number somewhere else. The companies that matter most in this industry are the ones that put the fragments together. The mechanism is simple enough: a website you visit runs LiveRamp’s tracking tag; the tag fires while you are logged in, and your hashed email and your browser cookie arrive at LiveRamp together, so LiveRamp records the match. Later your grocery store sends LiveRamp the phone number and email from your loyalty program, and it records that match too. Repeat across hundreds of transactions and billions of records over a decade, and the browser, the email, and the phone number are one person under a single persistent identifier. [10]
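One common way to model this kind of stitching is a union-find structure, in which every recorded match merges two identifiers into the same set. The sketch below is an illustration under that assumption; it is not LiveRamp’s implementation, and the class name, identifier formats, and sample values are hypothetical.

```python
import hashlib


def hash_email(email):
    """Normalize and hash an email address (SHA-256), as a stand-in for a hashed email."""
    return "email:" + hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


class IdentityGraph:
    """Union-find over identifiers; each connected set is treated as one person."""

    def __init__(self):
        self.parent = {}

    def _find(self, ident):
        self.parent.setdefault(ident, ident)
        # Walk up to the root, shortening the path along the way.
        while self.parent[ident] != ident:
            self.parent[ident] = self.parent[self.parent[ident]]
            ident = self.parent[ident]
        return ident

    def record_match(self, a, b):
        """Two identifiers were seen together, so merge their sets."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)


graph = IdentityGraph()
email = hash_email("pat@example.com")

# Match 1: the tracking tag fires while the user is logged in.
graph.record_match("cookie:abc123", email)
# Match 2: the grocery loyalty program supplies phone number and email together.
graph.record_match("phone:+15555550100", email)

# The cookie and the phone number were never seen together, but they now resolve together.
print(graph.same_person("cookie:abc123", "phone:+15555550100"))
```

The point of the structure is transitivity: no single transaction links the browser to the phone number, but two unrelated matches through the same email do.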

Three of these identity graphs (LiveRamp’s, TransUnion’s, and Experian’s) cover, between them, essentially every American adult with a phone or an email address. The precise figures are the vendors’ own: TransUnion’s marketing claims coverage of 98 percent of the U.S. population, with a given identifier staying correctly attached to its person 99.5 percent of the time. [11] The infrastructure is valuable enough that the two largest advertising holding companies on earth are now assembling competing versions: Publicis Groupe is currently acquiring LiveRamp for $2.2 billion, while Omnicom, which owns Acxiom, is building a rival graph and plans to exit LiveRamp’s by 2028. [10]

The last piece is distribution, and it no longer looks like anyone’s mental image of a data sale. The profiles sit in ordinary cloud infrastructure: Snowflake’s data marketplace lists more than 3,400 datasets from over 820 providers, and among them is Acxiom itself, whose InfoBase audiences are a listing any Snowflake customer can request like any other software subscription; Experian and Epsilon sell through Amazon’s equivalent. [12] When two companies want to combine what they know, they use what the industry calls a clean room: imagine a sealed room with a clerk inside. An insurer slides its customer list under the door; a broker (or the service) makes its profiles available inside; the clerk matches the two, answers the insurer’s question — a score, a flag, a yes or no — and slides the answer back out. Neither company sees the other’s raw records, nothing is copied, and there is no breach to disclose (as long as the clean-room outputs have not leaked attributes). A clean-room subscription, a contract, and the broker’s approval give a company with a cloud account a queryable profile of every American household.
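The clerk analogy maps onto a small amount of code. This is a minimal sketch, assuming both parties key their records on a SHA-256 hash of the normalized email address; the class, the field name risk_score, and the sample records are hypothetical and do not reflect any vendor’s actual clean-room product.

```python
import hashlib


def hash_email(email):
    """Normalize and hash an email so the two parties can match without sharing it in the clear."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


class CleanRoom:
    """Holds the broker's profiles privately and answers only narrow questions."""

    def __init__(self, broker_profiles):
        # Maps hashed email -> attribute dict. Callers never receive this mapping.
        self._profiles = dict(broker_profiles)

    def flag_customers(self, hashed_customers, threshold):
        """Return a yes/no flag per matched customer, never the underlying attributes."""
        return {
            key: self._profiles[key]["risk_score"] >= threshold
            for key in hashed_customers
            if key in self._profiles
        }


# Hypothetical broker side: profiles keyed by hashed email.
room = CleanRoom({
    hash_email("pat@example.com"): {"risk_score": 0.82, "net_worth_band": "C"},
    hash_email("sam@example.com"): {"risk_score": 0.31, "net_worth_band": "A"},
})

# Hypothetical insurer side: the hashed customer list slid under the door.
customers = [hash_email("Pat@Example.com "), hash_email("lee@example.com")]
answers = room.flag_customers(customers, threshold=0.5)
print(list(answers.values()))  # one flag for the one customer the broker could match
```

Even in this toy, the answer is not nothing: a yes/no flag discloses one fact derived from the profile, which is the caveat in the text about outputs leaking attributes.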

How the predictions are made

The file is the raw material. The product is the prediction, and the way predictions are manufactured is worth understanding, because it explains both why they work and how they fail.

A scoring model is built by taking millions of historical records — people’s attributes, paired with how things turned out: who paid, who was evicted, who crashed, who died — and fitting a pattern to them. The dominant tool is a decision-tree model: thousands of small branching tests, each asking yes/no questions of the file — more than two addresses in the last three years? utility payment history thinner than average? first credit account opened recently? — with each round of trees correcting the errors of the last, until the model reliably sorts the historical records into good and bad outcomes. Run your file through the finished model and out comes a number. [13]
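The “each round correcting the errors of the last” step is the core of gradient boosting, and a stripped-down version fits in a few dozen lines. The sketch below is an illustration only: it uses single-question stumps and a squared-error objective rather than the deeper trees and tuned losses a production library would use, and the six records, the feature names, and the outcomes are invented.

```python
# Toy gradient boosting with one-split "stumps". Data and feature names are hypothetical.

def fit_stump(rows, residuals):
    """Find the yes/no question whose two answers best predict the current errors."""
    best = None
    for feature in rows[0]:
        yes = [r for row, r in zip(rows, residuals) if row[feature]]
        no = [r for row, r in zip(rows, residuals) if not row[feature]]
        if not yes or not no:
            continue  # a question everyone answers the same way splits nothing
        yes_mean, no_mean = sum(yes) / len(yes), sum(no) / len(no)
        error = sum((r - yes_mean) ** 2 for r in yes) + sum((r - no_mean) ** 2 for r in no)
        if best is None or error < best[0]:
            best = (error, feature, yes_mean, no_mean)
    return None if best is None else best[1:]


def fit_boosted(rows, outcomes, rounds=20, learning_rate=0.5):
    """Each round fits a stump to the errors left over by all previous rounds."""
    base = sum(outcomes) / len(outcomes)
    predictions = [base] * len(rows)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(outcomes, predictions)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        feature, yes_value, no_value = stump
        stumps.append(stump)
        predictions = [
            p + learning_rate * (yes_value if row[feature] else no_value)
            for row, p in zip(rows, predictions)
        ]
    return base, stumps, learning_rate


def score(model, row):
    """Run one file through the finished model; out comes a number."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (yes_value if row[feature] else no_value)
        for feature, yes_value, no_value in stumps
    )


# Hypothetical historical records: attributes paired with how things turned out (1 = defaulted).
rows = [
    {"many_addresses": True, "thin_utility_history": True, "new_credit": True},
    {"many_addresses": True, "thin_utility_history": False, "new_credit": True},
    {"many_addresses": False, "thin_utility_history": True, "new_credit": False},
    {"many_addresses": False, "thin_utility_history": False, "new_credit": False},
    {"many_addresses": True, "thin_utility_history": True, "new_credit": False},
    {"many_addresses": False, "thin_utility_history": False, "new_credit": True},
]
outcomes = [1, 1, 0, 0, 1, 0]

model = fit_boosted(rows, outcomes)
risky = {"many_addresses": True, "thin_utility_history": False, "new_credit": False}
safe = {"many_addresses": False, "thin_utility_history": True, "new_credit": True}
print(score(model, risky), score(model, safe))
```

Note what the toy makes visible: the model never sees whether the new applicant paid anything. It sees that the applicant answers the address question the way past defaulters did.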

The technique has a lineage. Fair and Isaac sold the first statistical scorecards in 1958; the logistic-regression scorecard then governed underwriting for half a century, not because nothing better existed but because a regulator could read one. [14] The decision-tree ensembles that displaced it in the 2010s are machine learning in the proper sense — the machine, not an analyst, discovers the branching tests — and they won because they extract more signal from thousands of columns while remaining just explainable enough to defend. The selection criterion in regulated lending has never been the most accurate model; it has been the most accurate model that can survive an adverse-action audit. And that constraint is now dissolving, because the explanation itself has been automated: post-hoc tools generate audit-ready rationales for models no human can read, neural networks trained on raw bank-transaction streams are moving into production, and the vendors’ newest pitch is generative AI that ingests the whole file at once.

Two things follow from this construction. First, the score is not a finding about you. It is a summary of how people who resemble you on paper have behaved. A model can only weigh the columns it was built with, and it weighs them by resemblance, not by your record. Second, the inputs have exploded. A traditional credit scorecard weighed about 30 attributes. Zest AI, which builds underwriting models for banks, extracts thousands of features per applicant from raw bureau data. Upstart’s lending model evaluates 2,500 variables, including where you went to school and what you studied, and roughly 90 percent of its loans are fully automated, with no human involvement. [13] Your purchase history existed in 2010; nobody was running a model against it then. The data is old, but the model is new. The decision built on it did not exist as a category five years ago, but it is surging now.

And when the law requires an explanation, the model supplies one — sort of. A denied applicant receives a “reason code”: a one-line category, selected by ranking which columns moved the score most. “Insufficient credit history.” “Income insufficient for amount requested.” The explanation is generated by the same machinery it purports to explain, and federal guidance has had to remind lenders that a sample-form phrase is not a reason. “These complex algorithms sometimes rely on data that are harvested from consumer surveillance or data not typically found in a consumer’s credit file or credit application.” [15]
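The ranking step can be sketched with a deliberately simple stand-in. The example below assumes a linear scorecard, where each column’s contribution is its weight times the applicant’s distance from a baseline applicant; tree ensembles need attribution tools to produce comparable numbers, and nothing here reflects a specific lender’s method. The weights, the baseline, the column names, and the third reason phrase are hypothetical.

```python
# Hypothetical linear scorecard; weights, baseline, and columns are invented for illustration.
WEIGHTS = {
    "years_of_credit_history": 10.0,
    "income_to_loan_ratio": 25.0,
    "on_time_payment_share": 40.0,
}
BASELINE = {
    "years_of_credit_history": 8.0,
    "income_to_loan_ratio": 3.0,
    "on_time_payment_share": 0.95,
}
# One canned phrase per column. The third phrase is a made-up example.
REASON_TEXT = {
    "years_of_credit_history": "Insufficient credit history",
    "income_to_loan_ratio": "Income insufficient for amount requested",
    "on_time_payment_share": "Delinquent payment history",
}


def reason_codes(applicant, top_n=2):
    """Rank columns by how far they pulled the score down and return their canned phrases."""
    contributions = {
        column: WEIGHTS[column] * (applicant[column] - BASELINE[column])
        for column in WEIGHTS
    }
    ranked = sorted(contributions, key=contributions.get)  # most negative first
    return [REASON_TEXT[c] for c in ranked[:top_n] if contributions[c] < 0]


applicant = {
    "years_of_credit_history": 1.0,
    "income_to_loan_ratio": 2.5,
    "on_time_payment_share": 0.97,
}
print(reason_codes(applicant))
```

The output is a list of phrases chosen from a fixed menu, one per column. Whatever the model actually learned about the applicant is compressed into whichever menu item sits next to the largest negative number.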

The decisions

The file does not exist for advertising. It exists to make decisions, and the decisions are automated — you see only the outcome. The deposit the utility company requires before turning on your power; the credit limit that drops after a model rereads your file; whether your résumé reaches a human or is filtered out by a screening vendor; the premium on a life-insurance policy priced from a mortality score built on shopping data; whether a retailer accepts your return or a fraud model quietly declines it; how long you wait on hold, depending on what a lifetime-value score says you are worth. None of these announce themselves as decisions.

Healthcare is where the profile and the most consequential decisions converge. The insurers are data-broker customers. When ProPublica investigated in 2018, insurers were buying race, education, marital status, net worth, and bill-payment histories from brokers and feeding them into models predicting the health costs of individual members; LexisNexis marketed 442 non-medical attributes for the purpose. [16]

Where that data flows is itself instructive, because it maps the regulation in reverse. Broker data does not appear in claim denials, not out of restraint but because a denial must survive an appeal on medical-necessity grounds, and a denial that cited your shopping history would not. It cannot set your health premium either; the ACA requires insurers to take every applicant and lets them price on only age, geography, family size, and tobacco. So the data flows to the decisions that face no review: which members get flagged for “care management,” who is marketed which plan, and, one market over in life insurance, where individual underwriting is still legal, straight into the decision of whether a company will take the risk of you, and at what price. That is what LexisNexis’s mortality score, built from non-medical data on 50 million Americans, is for. It is marketed as a “non-FCRA underwriting accelerator”: an accelerator because it lets the insurer skip the medical exam and read your file instead, non-FCRA because it is positioned outside the one law (the Fair Credit Reporting Act) that would let you see and dispute what the file says. [17]

Do these quiet models fail? Almost nothing about them can be independently checked: they trigger no adverse-action notices, no audits, no discovery. But the one time academics were allowed inside one (an Optum algorithm used by hospitals to flag patients for extra care) they found it predicting healthcare spending as a proxy for medical need. Black patients spend less, not because they are healthier but because they have less access to care, so at the same risk score they were significantly sicker, and the bias cut the number of Black patients flagged for extra care by more than half. [18] The system did the discriminating, and these systems generate no denials to appeal and no records to discover. The absence of discoverable records is not evidence of innocence.

Unprotected by design

The Fair Credit Reporting Act, passed in 1970, is the only federal law that gives you the right to see and contest the information used in consequential decisions about you. It applies to “consumer reports” from “consumer reporting agencies,” and every major product in the broker ecosystem argues that it is neither. LexisNexis attaches terms to its non-FCRA products stating they “may not be used as a factor in determining eligibility for credit, insurance, or employment”; whether buyers honor the disclaimer is monitored by no one. [19] The product positioning is the legal strategy. HIPAA covers your doctor, but not your apps, your wearable, your data broker, or any health inference drawn from your thousands of data points. The state privacy laws now on the books in more than twenty states give you rights to the data: you can demand deletion of your personal information, but the scoring model trained on it, the derived attribute, and the downstream decision are not covered by the same right. You have rights over the inputs and none over the inferences, so the inferences persist after the data is deleted.

The pipeline was engineered so that no single actor bears responsibility. The app collects, disclosed in generic terms of service. The broker aggregates. The scoring vendor draws the inference — not a “consumer report,” so FCRA does not apply. The insurer prices from the score, an approved rating factor. Each stage is legal, and every company in the chain offers the same defense: we provide information; the decision is made by the landlord — or the insurer, the employer, the health plan, your sports ticket vendor, your grocery store, etc. A federal court finally rejected that posture in Mobley v. Workday, ruling that an AI hiring vendor could be liable as an “agent” of the employers it served. [20] One rejection, in one sector.

When Texas filed the first enforcement action under any state comprehensive privacy law — against Allstate’s subsidiary Arity, which embedded an SDK in third-party apps to harvest driving data from millions of consumers and feed it into insurance pricing — the case was dismissed on jurisdiction. A consolidated private class action survived dismissal in March 2026, but the claims are still proceeding; no one has been held accountable yet. [21]

When the regulatory apparatus has tried to act, the action has been reversed. In December 2024, the CFPB proposed a rule that would have reclassified data brokers under FCRA — subjecting them to the same accuracy, dispute, and permissible-purpose requirements as credit bureaus. Five months later, acting director Russell Vought withdrew the rule entirely. [22] The one agency with plausible authority to close the gap chose to widen it.

What opting out costs

Consider what it would take for a person to opt out of this system entirely. They could hold no bank accounts, opt out of every data-logging program, belong to no rewards programs, install almost no apps, and block every tracking surface. They could carry the legal minimum of car insurance with no collision coverage. Generally, where an industry requires participation in its data pipeline, they could decline the industry.

This would be a significant amount of work. It would cost time, convenience, rewards, discounts, and entire product categories. And here is the honest assessment of what it would buy: the profile would exist anyway, and it would still be vast. Its fields would still be filled in from payroll records, public records, location data, and the loyalty data of the people who live with or near them. Opting out reaches one broker among hundreds. And the hundreds are not independent: most of the smaller brokers and people-search sites license their records downstream from the same few aggregators, so a profile deleted from a reseller regrows from the source, while the source, where deletion would actually propagate, is precisely the layer with no consumer-facing surface to petition.

But notice what this exercise proves. If a person who treats privacy as a discipline cannot exit the file, the file is not a consequence of personal carelessness, and the remedy is not personal diligence. The problem is not yours to solve — which raises the question of whose job it was.

Alone among nations

The Fair Credit Reporting Act made the United States arguably the world’s leader in data rights: see your file, dispute it, be told when it is used against you. Then the United States stopped, and the rest of the developed world did not.

The European Union built a general data-protection regime and, under the GDPR, rights that attach to any processing of personal data — access, deletion, objection to profiling, human review of significant automated decisions. The United Kingdom kept an equivalent after leaving. Canada, Japan, South Korea, Brazil, Australia, even India and China have enacted baseline national data-protection laws of varying strength.

The United States, alone among major economies, has none. What Americans have instead is a credit statute from 1970, a medical statute from 1996 that covers the doctor but not the data, and a patchwork of state laws — and the industry has read all of it more carefully than Congress has. The EU, to pick one concrete line, banned emotion-recognition AI in workplaces outright in 2024; the same products run legally today in all fifty states, monitoring the emotion and tone of call-center agents at MetLife and Humana. [23][24] Colorado and California have classified neural data as sensitive; the other forty-eight states have not. [25]

Start with the money, which is not hidden: lobbying disclosures report each company’s total spending and the bills it touched, not how the dollars divide among issues, and a firm like Oracle lobbies on everything from cloud contracts to antitrust. With that caveat: when The Markup totaled the disclosures in 2021, twenty-five registered data brokers had reported $29 million in federal lobbying in a single year — Oracle’s $9.6 million total exceeding Google’s. By 2023, RELX — the parent of LexisNexis — was reporting $3.1 million a year and naming privacy bills in its filings, its largest Washington outlay since 2008, with Experian at $1.4 million and Equifax above $1.5 million: an industry most Americans cannot name, outspending household names. And the federal ledger is the smaller example: when The Markup followed the lobbyists into the statehouses, it counted 445 lobbyists and lobbying firms representing Amazon, Apple, Google, Meta, Microsoft, and the industry’s coalitions working privacy bills in all thirty-one states that considered them — and Virginia’s law, the template most states have since copied, was originally authored by Amazon. [26]

In 2022, Congress came closer to a baseline privacy law than it had in fifty years. The American Data Privacy and Protection Act passed the House Energy and Commerce Committee 53 to 2 — a bipartisan margin that does not occur by accident. It died without a floor vote. Speaker Nancy Pelosi declined to bring it up, siding with California officials who objected that the bill’s preemption clause would override California’s stronger law; Senator Maria Cantwell, who chaired the Senate committee of jurisdiction, refused to advance it for related reasons. [27]

Note what the obstacle was. Pelosi’s refusal was not capture; it was made to protect the stronger law California already had. That is the elegance of preemption as the industry’s one demand: it makes any federal bill a trap that springs in both directions. Pass it, and fifty states are capped at a ceiling negotiated with industry input; kill it to save California, and there is no federal law at all. The advocates’ strongest jurisdiction becomes the hostage that dies in every negotiation. The industry does not need to win the vote. It needs only to keep preemption in the text, and the bill detonates itself. The 2022 bill did. Its 2024 successor, the American Privacy Rights Act, died differently (gutted of its civil-rights and algorithmic-accountability provisions, its markup canceled with minutes of notice, its own chief sponsor cut out of the final negotiations), but it died all the same. [28]

The result is a system in which state laws remain fragmented, federal protections never arrive, and the gaps between them have become a business model.

The verdict

So the situation, stated without decoration: an industry of hundreds of registered data brokers maintains files on every American adult; identity graphs bind the files to people with near-total coverage; a marketplace layer makes the result queryable by any company with a cloud account; models convert the files into predictions that increasingly price everything you buy; and the citizen at the end of the chain has no general legal right to see, dispute, or appeal any of it. Every other major economy decided this required a law. The United States decided that it did not.

What regulation would mean is not mysterious: rights that attach to the inference and the decision, not only to the data. The right to know a score was used. The right to see it, dispute its inputs, and appeal its conclusion to a human being with the authority to overrule it, through an appeal designed to be used. None of this is conceptually hard. All of it is currently moving in reverse. And the appetite for it is not hypothetical: when California opened a single-submission deletion platform covering every registered broker this January, nearly a quarter of a million people signed up in the first eight weeks. [7] The deletion requests will not touch the derived scores, and the opt-out is the remedy the industry prefers you to have: individual and exhausting. Yet nearly a quarter of a million people performing a largely symbolic act is not a statement about the act. It is a statement about the demand for a law.


Every prediction model rests on the assumption that the future can be inferred from the past — that who a person (or people similar to them) has been is who they will be. But anyone whose life has defied the pattern knows how hollow that assumption is. A person whose file is filled with signals pointing toward one trajectory, and who chose a different one — that person is not an outlier. They are proof that the model is measuring the wrong thing. The danger of this industry is not only that the predictions are often wrong. It is that they encourage institutions to see human beings as the sum of their correlations, when the most important parts of a person’s life are often the moments they defy them.


Update (June 17, 2026). Five days after this piece was published, 404 Media reported that ICE appears to be buying records tied to immigrants’ tax identifiers from a data broker through a nearly $10 million contract, routed through reseller Thundercat. This came after a court had already struck down an IRS-DHS agreement to share the same ITIN data directly. [29] It is the argument of this piece made literal: when the law closes a door, the broker marketplace is the open window. The buyer here is the government itself, sourcing on the commercial market the very data a court forbade it to obtain. Senator Ron Wyden’s response points straight back at the gap documented above, noting that “the Consumer Financial Protection Bureau was on the verge of closing this loophole before Trump killed the agency and blocked that new rule from going into effect.” [29]

The regulatory gap is not an abstraction or a future risk. It is the mechanism, already operating, by which legal limits on data become optional for anyone willing to pay a broker. That is why it matters that people be outraged and demand regulation.

References

1. Acxiom, “InfoBase,” product marketing, accessed June 2026. 1,500+ core attributes per household across 162 million U.S. households and roughly 260 million individuals; full catalog exceeding 10,000 attributes including derived data. Acxiom has been a subsidiary of Omnicom Group since Omnicom’s November 2025 acquisition of IPG.

2. Acxiom, “Consumer Data Products Catalog,” Internet Archive, accessed June 2026.

3. U.S. Senate Committee on Commerce, Science, and Transportation, A Review of the Data Broker Industry: Collection, Use, and Sale of Consumer Data for Marketing Purposes, December 18, 2013.

4. Robert Bartlett, Adair Morse, Richard Stanton, and Nancy Wallace, “Consumer-Lending Discrimination in the FinTech Era,” NBER Working Paper 25943; published in Journal of Financial Economics, 2022. Face-to-face lenders charged Latinx/African-American borrowers 7.9 basis points more on purchase mortgages; fintech algorithms showed no approval discrimination and roughly 40 percent less price discrimination.

5. Consumer Financial Protection Bureau, “An Update on Credit Access and the Bureau’s First No-Action Letter,” blog post, August 2019 (archived at the Internet Archive). Upstart’s model approved 27 percent more applicants than a traditional model at 16 percent lower average APRs, with expanded access across all tested race, ethnicity, and sex segments.

6. Student Borrower Protection Center, Educational Redlining, February 2020. Hypothetical applicants identical except for institution attended: the Howard University graduate was charged $3,499 more over the life of a five-year Upstart loan than the NYU graduate. The report prompted a Senate inquiry and an independent fair-lending monitorship (Relman Colfax, for the NAACP Legal Defense Fund and SBPC, 2020–2024), whose final report identified approval disparities for Black applicants and a viable alternative model that would have caused fewer of them.

7. California Privacy Protection Agency, “Data Broker Registry,” accessed June 2026. 575+ registered brokers as of February 2026, up from 459 in spring 2025; DROP, the state’s single-submission deletion platform, recorded 242,000 sign-ups in its first eight weeks after launching January 1, 2026.

8. Privacy Rights Clearinghouse and Electronic Frontier Foundation, “Why Are Hundreds of Data Brokers Not Registering with States?,” June 2025. Merging the five state data broker registries yielded approximately 750 unique broker groups; hundreds were registered in one state but absent from others, including 291 not registered in California.

9. Mozilla Foundation, “Privacy Not Included: Cars,” 2023. All 25 brands failed; 84 percent share data with service providers or brokers; 76 percent say they can sell personal data.

10. LiveRamp, “RampID Identity Resolution,” product documentation, accessed June 2026; LiveRamp corporate filings and press materials, 2026. LiveRamp is being acquired by Publicis Groupe for $2.2 billion; Omnicom (Acxiom’s owner) is building a competing identity graph and plans to exit LiveRamp by 2028.

11. TransUnion, “TransUnion Announces Enhanced Identity Graph for Marketing Solutions,” press release, January 4, 2024. Roughly 98 percent U.S. population coverage and 99.5 percent identifier persistence, following TransUnion’s 2021 acquisition of Neustar.

12. Snowflake, “Snowflake Marketplace - Acxiom InfoBase,” 3,400+ listings from 820+ providers, verified June 2026; AWS Data Exchange, 1,000+ datasets. Acxiom is a Snowflake strategic partner; its household profiles are queryable via clean room.

13. Zest AI technical and marketing materials (thousands of features per applicant from raw bureau data); Upstart, “Upstart by the Numbers,” accessed June 2026; Consumer Financial Protection Bureau, “CFPB Announces First No-Action Letter to Upstart Network,” 2017, reviewed 2020. 2,500 variables including education; roughly 90 percent of loans decided without human intervention. Gradient-boosted decision-tree ensembles dominate regulated underwriting for tabular-data and explainability reasons.

14. Martha Poon, “What Lenders See: A History of the Fair Isaac Scorecard” (PhD diss., University of California San Diego, 2012). Fair, Isaac & Co., founded 1956, sold its first statistical credit scorecards in 1958; the FICO bureau score followed in 1989. Logistic-regression scorecards dominated regulated underwriting for decades in part because of their legibility to regulators and adverse-action requirements.

15. Consumer Financial Protection Bureau, “Consumer Financial Protection Circular 2023-03: Adverse Action Notification Requirements and the Proper Use of the CFPB’s Sample Forms Provided in Regulation B,” September 19, 2023. Creditors using complex algorithms must give specific, accurate reasons for adverse action; sample-form reason codes are insufficient.

16. Marshall Allen, “Health Insurers Are Vacuuming Up Details About You — And It Could Raise Your Rates,” ProPublica/NPR, July 17, 2018. Insurers buying race, education, marital status, net worth, and bill-payment data from brokers to predict individual health costs; LexisNexis Risk Solutions marketed 442 non-medical attributes for the purpose.

17. LexisNexis Risk Solutions, “Mortality Risk Assessment Using Non-Medical Data,” press release, August 15, 2024 (study of 50 million Americans), and product terms marketing the score as a “non-FCRA, non-medical underwriting accelerator.” See also LexisNexis Risk Solutions, “LexisNexis Risk Classifier,” press release, July 21, 2021: public records, credit attributes, and driving-behavior data combined into a relative-mortality score for accelerated (no-exam) life-insurance underwriting.

18. Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan, “Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations,” Science 366, no. 6464 (2019): 447–453.

19. LexisNexis Risk Solutions, “Master Terms and Conditions,” accessed June 2026: “non-FCRA” positioning and use restrictions.

20. Mobley v. Workday, Inc., No. 3:23-cv-00770 (N.D. Cal.). Order of July 12, 2024: Workday ruled potentially liable as an “agent” under Title VII, the ADA, and the ADEA; ADEA collective conditionally certified May 16, 2025.

21. Texas v. Allstate/Arity, filed January 13, 2025, the first enforcement action under any state comprehensive privacy law (the Texas Data Privacy and Security Act); dismissed for lack of personal jurisdiction, April 10, 2025. The consolidated private class action over the Arity SDK (N.D. Ill.) survived dismissal in March 2026, with wiretap, FCRA, and privacy-tort claims proceeding.

22. Consumer Financial Protection Bureau, “Protecting Americans From Harmful Data Broker Practices (Regulation V),” proposed rule, December 3, 2024; withdrawn by acting director Russell Vought, 90 Fed. Reg. 20568, May 15, 2025.

23. Artificial Intelligence Act (Regulation (EU) 2024/1689), Article 5(1)(f): emotion-recognition AI prohibited in workplace and educational contexts; prohibited practices effective February 2, 2025, with penalties up to 7 percent of global annual turnover. GDPR Articles 15–22 establish access, erasure, objection-to-profiling, and automated-decision rights.

24. “This AI Software Is ‘Coaching’ Customer Service Workers,” TIME, 2019. Cogito’s real-time paralinguistic analysis of call-center agents; documented customers include MetLife and Humana.

25. Colorado House Bill 24-1058 (effective August 6, 2024) and California Senate Bill 1223 (effective January 1, 2025): neural data classified as sensitive personal information.

26. Alfred Ng and Maddy Varner, “The Little-Known Data Broker Industry Is Spending Big Bucks Lobbying Congress,” The Markup, April 1, 2021. 25 registered brokers spent a combined $29 million on federal lobbying in 2020, rivaling individual Big Tech firms; Oracle’s $9.57 million exceeded Google’s $8.85 million; RELX spent $2.4 million. In 2023, RELX spent $3.1 million lobbying on privacy bills; Experian $1.4 million; Equifax over $1.5 million (CyberScoop, April 2024). During ADPPA drafting, broker lobbying rose 11 percent in a quarter — RELX’s by 26 percent — and the brokers’ unified request was federal preemption of state privacy laws (Politico Pro, August 2022). At the state level: Todd Feathers and Alfred Ng, “Tech Industry Groups Are Watering Down Attempts at Privacy Regulation, One State at a Time,” The Markup, May 26, 2022. 445 lobbyists and lobbying firms actively represented Amazon, Apple, Google, Meta, Microsoft, TechNet, and the State Privacy and Security Coalition across the 31 states then considering privacy legislation; TechNet testified or supplied written comments in at least 10 states. Virginia’s 2021 law, the template for most subsequent state laws, was originally authored by Amazon with input from Microsoft (per Protocol, cited by The Markup).

27. American Data Privacy and Protection Act, H.R. 8152, 117th Cong. (2022): reported out of House Energy & Commerce 53–2. Speaker Pelosi declined to bring it to the floor over preemption of California law; Sen. Cantwell declined to advance it in the Senate.

28. American Privacy Rights Act (2024): the June 27, 2024 House Energy & Commerce markup was canceled with little notice after civil-rights and algorithmic-accountability provisions were stripped; Speaker Johnson and Majority Leader Scalise convened an eve-of-markup meeting of committee Republicans that excluded Chair Rodgers.

29. Joseph Cox, “ICE Appears to Be Buying Immigrants’ Tax Identifiers from a Data Broker,” 404 Media, June 17, 2026. A procurement worth $9,968,353.56, routed through reseller Thundercat, indicates ICE is purchasing records related to immigrants’ ITINs after a court struck down an IRS–DHS data-sharing agreement covering the same information.