North America

California forces AI training data disclosures

Judge rejects xAI's bid to pause AB 2013; compliance costs land first on new entrants

Ashley Belanger, arstechnica.com

A US federal judge has refused Elon Musk’s request to temporarily block a California law that forces AI companies to disclose key details about their training data. The ruling means xAI must comply with Assembly Bill 2013 while its lawsuit continues, according to Ars Technica.

The law, which took effect in January, requires AI developers whose models are accessible in California to publish a summary of their training data sources: where the data came from, when it was collected, whether collection is ongoing, and whether the datasets include copyrighted, trademarked, or patented material. Companies must also indicate whether they licensed or purchased data, whether personal information was included, and how much synthetic data was used.

xAI argued the disclosures would expose trade secrets and erase the value of its dataset strategy. The company’s complaint, as described by Ars Technica, warned that revealing sources, dataset sizes, and cleaning methods would let rivals infer what xAI has that they do not—and then replicate it. In xAI’s own hypothetical, if OpenAI learned xAI used an “important dataset” that OpenAI lacked, OpenAI would acquire it, and vice versa. The pitch was straightforward: a mandate sold as transparency would function as a competitor’s shopping list.

US District Judge Jesus Bernal was unconvinced. In his order, he said xAI did not show the law required disclosure of actual trade secrets, and criticised the company for leaning on "abstractions and hypotheticals" rather than identifying specific information whose publication would cause irreparable harm. The judge also emphasised the state's interest in helping the public understand how models are trained.

The practical effect is asymmetric pressure. Large incumbents can treat compliance as a fixed cost and amortise it across products, lawyers, and lobbying budgets. Newer challengers face a choice between revealing their differentiators or spending scarce capital on compliance strategy and litigation. Even if the disclosures stop short of revealing model weights or proprietary code, they can still narrow the gap between firms by standardising what must be made public.

The case also shows why US AI regulation is fragmenting into state-level regimes. Federal rules remain politically gridlocked, but states can legislate disclosure, safety, and consumer protection—and by doing so, set de facto national standards for any company that wants access to California’s market.

For xAI, the immediate outcome is procedural but concrete: the lawsuit continues, yet the disclosures begin now, not after the courts finish arguing about what counts as a trade secret.