General Compute launches inference cloud
Startup orders SambaNova chips to chase cheaper AI tokens; its neocloud model leans on repurposed crypto infrastructure
Tim Fernholz
techcrunch.com
General Compute launched its AI inference cloud last week with a small seed round and a large hardware wager. According to TechCrunch, the startup raised $15 million at a $60 million post-money valuation and says it has roughly $300 million worth of SambaNova chips on order to power its service.
The pitch rests on a quiet shift inside the AI boom: training grabs headlines, but inference is where products live and where cloud bills recur. GPUs remain scarce and expensive, and TechCrunch notes that many in the industry increasingly treat them as a compromise for inference workloads rather than the ideal tool. That has opened space for “neoclouds” that rent out compute optimized specifically for model responses—speed, latency, and cost per token—rather than for months-long training runs.
General Compute is betting that specialized inference silicon will matter enough to justify building a cloud around a single supplier. It is deploying chips built by SambaNova, an Intel-backed company that has been less visible in recent Silicon Valley conversations but is positioning new hardware as an alternative to both Nvidia GPUs and purpose-built rivals. SambaNova’s sales claims—hundreds of tokens per second versus lower throughput on GPUs, and air-cooled systems that can fit into existing data centers—are designed to translate directly into unit economics: more tokens per rack, less retrofitting, faster time to revenue.
That “drop-in” promise is also why the company is pursuing colocation deals with data-center operators and crypto miners, TechCrunch reports. Miners already own power contracts, buildings, and cooling capacity; when bitcoin economics tighten, their infrastructure becomes a stranded asset looking for a new workload. An inference provider can show up with hardware and a revenue-sharing arrangement, turning a sunk-cost warehouse into a billable service without waiting for new grid connections or permits.
The competitive map is unstable. TechCrunch describes strained capacity at established players such as Groq and Cerebras, while investors hunt for the next breakout platform. At the same time, the article points to a market logic that cuts against any single winner: customers increasingly want to route requests across multiple models and providers to manage cost and performance. If that becomes the norm, inference clouds become interchangeable pipes, and the prize shifts to whoever can deliver the cheapest reliable tokens at scale.
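The routing logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the provider names, prices, and throughput figures are invented placeholders, not numbers from TechCrunch or any vendor, and `pick_provider` is a hypothetical helper rather than any real router's API. It picks the cheapest provider that still meets a minimum speed requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str                       # hypothetical label, not a real vendor
    usd_per_million_tokens: float   # invented price, for illustration only
    tokens_per_second: float        # invented throughput, for illustration only


def pick_provider(providers, min_tokens_per_second):
    """Return the cheapest provider meeting the throughput floor, or None."""
    eligible = [
        p for p in providers
        if p.tokens_per_second >= min_tokens_per_second
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda p: p.usd_per_million_tokens)


# Placeholder catalog; every value here is made up.
PROVIDERS = [
    Provider("provider-a", 0.60, 120.0),
    Provider("provider-b", 0.90, 450.0),
    Provider("provider-c", 0.75, 300.0),
]

if __name__ == "__main__":
    choice = pick_provider(PROVIDERS, min_tokens_per_second=250.0)
    print(choice.name if choice else "no provider meets the requirement")
```

If every provider sits behind a selector like this, the underlying hardware matters to the customer only through price and speed, which is the commoditization risk the article points to.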
For now, General Compute has a seed check, an order book full of chips, and a product it says is already running an open-source model faster than rivals. The rest depends on whether those promised racks arrive before the market standardizes around someone else’s hardware.