Media

People sell private calls and daily life to train AI

Data marketplaces pay per minute as web scraping dries up; contributors keep the deepfake and extortion risk

Images

Silicon Valley’s hunger for human-grade data has given rise to a thriving industry of data marketplaces. Illustration: Guardian Design/Getty Images
Gig AI trainers, who upload everything from scenes around them to photos, videos, and audio of themselves, are at the frontlines of a new global data gold rush. Photograph: Arun Sankar/AFP via Getty Images
Cape Town, South Africa. Photograph: Peter Titmuss/Universal Images Group/Getty Images

A Chicago welding apprentice says he made a few hundred dollars by letting an AI training app record his private phone conversations at 50 cents a minute. In Cape Town, a 27-year-old earned $14 filming his daily walk for an “urban navigation” dataset. According to The Guardian, platforms such as Neon Mobile, Kled AI and Silencio are paying thousands of people to sell fragments of their lives—texts, calls, ambient audio, photos and video—to feed the next generation of AI models.

The pitch is simple: the internet’s free training data is getting harder to scrape, either because publishers are restricting access or because the best public datasets have already been exhausted. The Guardian cites estimates that high-quality text suitable for training could run thin as soon as 2026, pushing companies toward alternatives: synthetic data loops that degrade model quality, or paid “human-grade” data that captures real-world context. Data marketplaces step into that gap by turning everyday life into a supply chain—micro-assignments, per-minute rates, and bonuses for novelty (a hotel lobby, a specific street corner, a particular accent).

The economic bargain is asymmetric. The buyer gets durable training value—voice, mannerisms, location cues, social relationships—while the seller gets a one-off payment and keeps the long tail of risk. Once a voice sample or chat history has been ingested, it can be replicated, recombined, and used in ways the contributor cannot audit. The Guardian notes concerns ranging from deepfakes and identity theft to doxxing and blackmail, but the platforms’ core advantage is that these downstream costs sit with individuals, not the marketplace or the AI lab. Even the decision to participate is often framed as resignation: some contributors tell the paper they assume tech companies already collect their data, so they may as well be paid.

The model echoes an earlier internet pattern—users as unpaid producers—except the commodity has shifted from posts and photos to biometric identity and private interaction. Social platforms monetised attention while pushing moderation, harassment and reputational fallout onto users; these marketplaces monetise intimacy while pushing privacy and security exposure onto the people supplying the raw material. The “gig” framing also matters: contributors are not employees with bargaining power, insurance, or enforceable limits on reuse. They are vendors in a market where the buyer sets the terms and the product can be copied indefinitely.

The most concrete detail in The Guardian’s reporting is also the most revealing: a teenager selling family and friend conversations by the minute. The transaction ends when the payout hits, but the data does not.