North America

Anthropic weakens its safety pledge

Company drops commitment to pause model scaling as AI race accelerates; voluntary restraint becomes a cost line item


Anthropic CEO Dario Amodei has repeatedly emphasized his company's commitment to safety.
Bhawika Chhabra/Reuters

Anthropic is dropping the most distinctive part of its early “safety pledge” as it recalibrates for a faster-moving AI market. According to Business Insider, the company said it will no longer commit to pausing the scaling or delaying the deployment of new models when capability gains outpace its own safety measures — a principle that had set it apart from rivals since its 2023 Responsible Scaling Policy.

The company justifies the change in practical terms. Anthropic’s chief science officer Jared Kaplan told Time that unilateral commitments no longer make sense “if competitors are blazing ahead,” and the company’s own blog post cites intensified competition and an “anti-regulatory political climate” as reasons for rewriting the policy. The revised framework retains a narrower promise to delay the development or release of a “highly capable” model, but only under more limited conditions.

The shift highlights a predictable feature of voluntary restraint in markets where the payoff is uncertain and the downside is immediate. A pledge to slow down is easiest to maintain when it functions as branding — reassuring customers, partners, and policymakers — and when the opportunity cost is tolerable. As the product race tightens, that same pledge becomes a self-imposed handicap that competitors can exploit. In that environment, “safety” stops being a differentiator and starts looking like foregone revenue.

Anthropic’s explanation also points to a second dynamic: the company argues that the highest risk tiers in its framework cannot be contained by any single firm. That is a candid admission about the limits of corporate self-governance in frontier-model development: even if one company halts, the overall system does not. The result is a familiar pattern in which firms keep asking for government rules while simultaneously adapting to the reality that rules may not arrive in time, or may arrive in forms that reward incumbents with compliance departments.

The practical question is not whether companies prefer safety in the abstract, but who pays for it. Slower deployment imposes costs on the developer first, while many of the potential harms — misuse, downstream failures, or systemic effects — are distributed across users, victims, and public institutions. In the absence of enforceable standards shared across the sector, the firm that sticks to the strictest internal brake risks being the only one to lose market share.

Anthropic built its public identity on being the cautious alternative to OpenAI, including its early decision to delay releasing Claude in 2022. It is now signaling that, in the current race, it will not be the one to stop training first.