
Physical Intelligence claims robots learn unseen tasks

π0.7 model uses language coaching and sparse demonstrations; prompt wording becomes a new bottleneck


Image Credits: Physical Intelligence
Connie Loizos, techcrunch.com

Physical Intelligence claims its new robot model generalises to unseen tasks: π0.7 combines web pretraining with robot demonstrations and responds to spoken coaching, though success depends on how humans phrase instructions.

A two-year-old San Francisco robotics startup, Physical Intelligence, says a new model can direct robots to perform tasks it was never explicitly trained on, according to TechCrunch. The system, called π0.7, is presented as an early “general-purpose robot brain” that can be guided through unfamiliar work using natural-language instructions rather than being retrained for each new task.

The company’s evidence centres on what it calls compositional generalisation: recombining skills learned in separate contexts to solve a novel problem. In one demonstration, researchers had a robot operate an air fryer—an appliance that appeared only twice in the training data, the company says. One episode showed a different robot pushing the air fryer closed; another, drawn from an open-source dataset, showed a robot placing a plastic bottle inside on command. π0.7 reportedly used those fragments plus broader pretraining to attempt cooking a sweet potato, improving when a human provided step-by-step verbal guidance.

The work sits in a crowded race to move beyond industrial robots that succeed mainly through repetition and tightly controlled environments. Robotics labs and companies have long depended on task-specific datasets: collect demonstrations for “open this drawer,” train a policy, then repeat for “pick up this mug.” Physical Intelligence argues that once a model can remix learned components, performance scales more favourably with additional data—an echo of the jump in capability seen when language models began to generalise across prompts.

But the same setup that enables rapid iteration also shifts responsibility onto the operator. TechCrunch reports that the team saw a low initial success rate on the air-fryer task, then raised it dramatically after spending time rewriting the instructions. The failure mode, one researcher said, was often the humans’ phrasing rather than the robot’s mechanics. That suggests a near-term deployment model where robots become more useful not by being fully autonomous, but by being “coached” by staff who learn how to talk to them—adding a new layer of labour and training that is easy to overlook when demos are edited for smoothness.

The lack of standardised benchmarks remains a constraint. Robotics results are notoriously difficult to compare across labs because hardware differs, tasks are underspecified, and small changes in environment can flip outcomes. Without common evaluation suites and independent replication, claims of generality are hard to evaluate: a robot that can be walked through an air-fryer sequence may still fail when given a single high-level command like “make toast,” as the company itself acknowledges.

Physical Intelligence’s pitch is that coaching could reduce the cost of deploying robots in new settings by avoiding repeated data collection and retraining cycles. The immediate question is whether the gains come from a step-change in robot reasoning, or from a more practical shift in who does the adaptation work—engineers in the lab, or employees on site.

In the air-fryer test, the difference between failure and success reportedly took about half an hour of rewriting the instructions.