The Aggregator Graveyard: Five Unified APIs That Didn't Last

In 2023-2024, the “unified LLM API” looked like the obvious layer. Every new model had a different endpoint, key format, and error shape. A company that normalized all of them behind one OpenAI-compatible surface could charge for the convenience — or so the pitch went.

Three years later, the pure aggregator business is mostly gone. The survivors either became observability platforms, enterprise governance tools, or marketplaces with billing moats. The ones that stayed “just the API” either shut down, pivoted, or entered maintenance mode.

This post covers five concrete cases with public timelines and stated reasons. The pattern is consistent: aggregation itself is a thin margin layer that gets squeezed between providers and customers, and the companies that survived moved to adjacent surfaces with more durable economics.

OpenAI Assistants API — the official deprecation

On August 26, 2025, OpenAI announced the deprecation of the Assistants API, with a hard shutdown date of August 26, 2026 — exactly one year later. The official deprecation notice lists the replacement as the Responses API and Conversations API.

The Assistants API was OpenAI’s own attempt at a unified agent surface: persistent threads, tool definitions, code interpreter, file search, all behind one set of endpoints. It launched in beta in late 2023, went through a v2 revision in April 2024 that added streaming and improved tool calling, and then got retired once Responses reached feature parity. The migration guide is public; the cutoff is not negotiable. After August 26, 2026, every call to /v1/assistants, /v1/threads, and /v1/threads/runs returns an error — no degraded mode, no grace period.

What killed it was not competition from a startup — it was the provider itself deciding the abstraction didn’t need to be a separate product. Once the core capabilities lived in the simpler Responses surface, the dedicated Assistants endpoints became redundant overhead. The beta label never came off; the product was structurally interim from the start.

Helicone — from gateway to maintenance mode

Helicone launched as an open-source LLM observability platform with a clever integration model: change one URL and get logging, caching, and rate limiting for free. It later added an AI Gateway mode — a unified endpoint routing to 100+ models through a single key.

In March 2026, Helicone was acquired by Mintlify. The proxy still works, the GitHub repo still accepts contributions, and the Docker image is current. But the company is now in maintenance mode. No new features are planned. The gateway exists, but building new production dependencies on it requires a deliberate self-hosting commitment.

The pivot away from pure aggregation was already visible before the acquisition. The durable surface was observability — the data about the requests — not the routing layer itself. Once that became the product, the gateway was a feature, not a business.

Adept — the acqui-hire that ended the foundation-model bet

Adept was founded in 2022 to build general intelligence that could use any software tool. It raised over $415 million at a $1B+ valuation, with backers including Nvidia, Microsoft, and Greylock. The vision required training frontier multimodal models plus an actuation layer for desktop and web workflows.

On June 28, 2024, Amazon hired Adept’s CEO David Luan and several co-founders, along with a majority of the research and engineering team. Amazon also licensed Adept’s models, datasets, and agent technology. Adept did not shut down — it restructured under new CEO Zach Brock and shifted focus to enterprise agent workflows on top of existing models.

The company’s own statement was blunt: continuing to build both foundation models and an enterprise agent product would have required spending “significant attention on fundraising for our foundation models, rather than bringing to life our agent vision.” The capital cost of training frontier models killed the original aggregator-plus-models strategy. What remains is a narrower enterprise workflow company.

kluster.ai — inference platform to full sunset

kluster.ai offered inference, fine-tuning, and dedicated GPU deployments. It announced a partnership with Aethir in June 2025 for access to enterprise-grade infrastructure across 20+ regions. On July 17, 2025, the company issued an end-of-life notice for its inference, fine-tuning, and deployment products.

The pivot was to “Verify by kluster.ai” — an IDE-integrated code verification tool for VS Code, Cursor, and Claude Code. A companion hallucination detection leaderboard appeared on Hugging Face. Neither generated enough momentum. On June 9, 2026, all kluster.ai services were fully sunset. The team subsequently joined MITO, an AI video-creation startup.

The inference aggregation layer never achieved the scale or margin to sustain the business once the initial GPU partnership economics shifted. The company had positioned itself as an alternative to OpenRouter for open-weight models, but the Aethir relationship did not translate into durable unit economics. The tooling pivot was an attempt to move to higher-margin developer infrastructure; it did not succeed.

Yupp AI — crowdsourced evaluation marketplace

Yupp AI launched in June 2025 as a free platform giving users access to responses from over 500 generative models, with credits earned for feedback. The thesis was a two-sided marketplace: users got model choice and credits; AI labs got real-world preference data to evaluate their models. The company raised $33 million from a16z crypto, Google Chief Scientist Jeff Dean, Biz Stone, Evan Sharp, and others.

On April 1, 2026, founder Pankaj Gupta announced the wind-down. The platform remained accessible until April 15 for data export. The stated reason was insufficient product-market fit in a changing market: “The future is not just models but agentic systems… This makes survival and monetary viability for crowdsourced model evaluation tools like Yupp difficult.”

The company had onboarded 1.3 million users and secured some AI labs as paying customers, but the core offering did not generate durable revenue. Remaining funds were returned to investors. Some team members transitioned to roles at another AI company.

What survives

The aggregators that remain are not selling “one key, every model” as a standalone paid product. OpenRouter survives as a marketplace with billing aggregation and catalog breadth. LiteLLM survives as the default compatibility shim in the open-source ecosystem. Martian survives by selling enterprise governance and compliance workflows, with routing as the wedge.

Pure aggregation — normalizing endpoints without owning billing relationships, observability data, or enterprise procurement — has not proven to be a durable standalone business. The companies that tried it either moved to adjacent surfaces with stronger economics or exited.

The pattern is not mysterious. A hosted router sits between two parties who both have pricing power. It adds latency, inherits every provider’s reliability problems, and captures almost none of the value of the tokens flowing through it. The surfaces that survived — observability, marketplace billing, enterprise governance — sit on top of that layer or beside it, not inside it.

For the concrete mechanics of routing policy that age gracefully, see the models and providers documentation. RoutePlane’s design choices are shaped by the same graveyard this post documents: policy over learned classifiers, local execution over hosted dependency, and no rent on your own tokens.

Sources: OpenAI deprecation notice (developers.openai.com, Aug 2025) · Helicone Mintlify acquisition coverage (ChatForest review, May 2026) · Adept blog post and TechCrunch/Reuters reporting (June 2024) · kluster.ai EOL notice and Benched.ai coverage (July 2025–June 2026) · Yupp AI shutdown announcement (Economic Times, April 2026).