Cursor's Composer 2 was secretly built on a Chinese AI model — and it exposes a deeper problem with Western open-source AI
The $29.3 billion AI coding tool just got caught with its provenance showing. When Cursor launched Composer 2 last week — calling it "frontier-level coding intelligence" — it presented the model as evidence that the company is a serious AI research lab, not just a forked integrated development environment (IDE) wrapping someone else's foundation model. What the announcement omitted was that Composer 2 was built on top of Kimi K2.5, an open-source model from Moonshot AI, a Chinese startup backed by Alibaba, Tencent and HongShan (the firm formerly known as Sequoia China).
A developer named Fynn (@fynnso) on X figured it out within hours. By setting up a local debug proxy server and routing Cursor's API traffic through it, Fynn intercepted the outbound request and found the model ID in plain sight: accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast.
"So composer 2 is just Kimi K2.5 with RL," Fynn wrote. "At least rename the model ID." The post racked up 2.6 million views.
In a follow-up, Fynn noted that Cursor's previous model, Composer 1.5, blocked this kind of request interception, while Composer 2 did not, calling the gap "probably an oversight." Cursor quickly patched it, but by then the information was already public.
Cursor's VP of Developer Education, Lee Robinson, confirmed the Kimi connection within hours, and co-founder Aman Sanger acknowledged it was a mistake not to disclose the base model from the start.
But the story that matters here is not about one company's disclosure failure. It is about why Cursor — and likely many other AI product companies — turned to a Chinese open model in the first place.
The open-model vacuum: Why Western companies keep reaching for Chinese foundations
Cursor's decision to build on Kimi K2.5 was not random. The model is a 1-trillion-parameter mixture-of-experts architecture with 32 billion active parameters, a 256,000-token context window, native image and video support, and an Agent Swarm capability that runs up to 100 sub-agents in parallel.
Released under a modified MIT license that permits commercial use, Kimi K2.5 is competitive with the best models in the world on agentic benchmarks and scored first among all models on MathVista at release.
When an AI product company needs a strong open model for continued pretraining and reinforcement learning — the kind of deep customization that turns a foundation into a differentiated product — the options from Western labs have been surprisingly thin.
Meta's Llama 4 Scout and Maverick shipped in April 2025, but both fell well short of frontier quality, and the much-anticipated Llama 4 Behemoth has been indefinitely delayed. As of March 2026, Behemoth still has no public release date, with reports suggesting Meta's internal teams are not convinced the 2-trillion-parameter model delivers enough of a performance leap to justify shipping it.
Google's Gemma 3 family topped out at 27 billion parameters — excellent for edge and single-accelerator deployment, but not a frontier-class foundation for building production coding agents. Gemma 4 has yet to be announced, though speculation is mounting that a release may be imminent.
And then there's OpenAI, which released arguably the most conspicuous American open-source contender, the gpt-oss family (in 20-billion and 120-billion parameter variants), in August 2025. Why wouldn't Cursor build atop this model if it needed a base model to fine-tune?
The answer lies in the "intelligence density" required for frontier-class coding. While gpt-oss-120b is a monumental achievement for Western open source—offering reasoning capabilities that rival proprietary models like o4-mini—it is fundamentally a sparse Mixture-of-Experts (MoE) model that activates only 5.1 billion parameters per token. For a general-purpose reasoning assistant, that is an efficiency masterstroke; for a tool like Composer 2, which must maintain structural coherence across a 256,000-token context window, it is arguably too "thin." By contrast, Kimi K2.5 is a 1-trillion-parameter titan that keeps 32 billion parameters active at any given moment. In the high-stakes world of agentic coding, sheer cognitive mass still dictates performance, and Cursor clearly calculated that Kimi’s 6x advantage in active parameter count was essential for synthesizing the "context explosion" that occurs during complex, multi-step autonomous programming tasks.
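The "6x advantage" is straightforward arithmetic on the figures above. Per-token compute and, roughly, per-token capacity scale with active parameters, so the gap can be checked on the back of an envelope:

```python
# Back-of-the-envelope check of the active-parameter gap, using the
# figures cited above (per-token compute scales roughly with active params).
gpt_oss_120b_active = 5.1e9   # gpt-oss-120b: ~5.1B active parameters per token
kimi_k2p5_active = 32e9       # Kimi K2.5: 32B active parameters per token

ratio = kimi_k2p5_active / gpt_oss_120b_active
print(f"Kimi K2.5 activates ~{ratio:.1f}x more parameters per token")
# → Kimi K2.5 activates ~6.3x more parameters per token
```

Total parameter counts (120B vs. 1T) tell a similar story, but it is the active count that governs how much capacity is brought to bear on each generated token.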
Beyond raw scale, there is the matter of structural resilience. OpenAI’s open-weight models have gained a quiet reputation among elite developer circles for being "post-training brittle"—models that are brilliant out of the box but prone to catastrophic forgetting when subjected to the kind of aggressive, high-compute reinforcement learning Cursor required.
Cursor didn't just apply a light fine-tune; they executed a "4x scale-up" in training compute to bake in their proprietary self-summarization logic. Kimi K2.5, built specifically for agentic stability and long-horizon tasks, provided a more durable "chassis" for these deep architectural renovations. It allowed Cursor to build a specialized agent that could solve competition-level problems, like compiling the original Doom for a MIPS architecture, without the model's core logic collapsing under the weight of its own specialized training.
That leaves a gap. And Chinese labs — Moonshot, DeepSeek, Qwen, and others — have filled it aggressively. DeepSeek's V3 and R1 models caused a panic in Silicon Valley in early 2025 by matching frontier performance at a fraction of the cost. Alibaba's Qwen3.5 family has shipped models at nearly every scale, from 600 million parameters up to 397 billion active parameters. Kimi K2.5 sits squarely in the sweet spot for companies that want a powerful, open, customizable base.
Cursor is not the only product company in this position. Any enterprise building specialized AI applications on top of open models today confronts the same calculus: the most capable, most permissively licensed open foundations disproportionately come from Chinese labs.
What Cursor actually built — and why the base model matters less than you think
To its credit, Cursor did not just slap a UI on Kimi. Lee Robinson stated that roughly a quarter of the total compute used to build Composer 2 came from the Kimi base, with the remaining three quarters from Cursor's own continued training. The company's technical blog post describes a technique called self-summarization that addresses one of the hardest problems in agentic coding: context overflow during long-running tasks.
When an AI coding agent works on complex, multi-step problems, it generates far more context than any model can hold in memory at once. The typical workaround — truncating old context or using a separate model to summarize it — causes the agent to lose critical information and make cascading errors. Cursor's approach trains the model itself to compress its own working memory in the middle of a task, as part of the reinforcement learning process. When Composer 2 nears its context limit, it pauses, compresses everything down to roughly 1,000 tokens, and continues. Those summaries are rewarded or penalized based on whether they helped complete the overall task, so the model learns what to retain and what to discard over thousands of training runs.
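The control flow Cursor describes can be sketched in a few lines. In the sketch below, `count_tokens`, `summarize`, and `generate_step` are illustrative placeholders rather than Cursor's API, and the reinforcement learning that trains the summaries is not modeled; the point is only to show where compaction slots into the agent loop.

```python
# Sketch of the compaction loop described above. count_tokens, summarize, and
# generate_step are illustrative placeholders, not Cursor's API, and the RL
# reward that shapes the summaries is not modeled here.
CONTEXT_LIMIT = 256_000   # tokens the model can hold (Kimi K2.5's window)
SUMMARY_BUDGET = 1_000    # target size of the compressed working memory

def count_tokens(messages: list[str]) -> int:
    # Crude stand-in for a real tokenizer: whitespace word count.
    return sum(len(m.split()) for m in messages)

def summarize(messages: list[str], budget: int) -> str:
    # Stand-in for the model compressing its own working memory; in
    # Composer 2 this step is performed by the model itself, trained via RL.
    words = " ".join(messages).split()
    return " ".join(words[:budget])

def run_agent(task: str, generate_step, max_turns: int = 200) -> list[str]:
    context = [task]
    for _ in range(max_turns):
        if count_tokens(context) >= CONTEXT_LIMIT:
            # Near the limit: pause, compress everything down to ~1,000
            # tokens, and carry on with the summary as the new context.
            context = [summarize(context, SUMMARY_BUDGET)]
        step = generate_step(context)   # next tool call, edit, or thought
        if step is None:                # agent signals the task is done
            break
        context.append(step)
    return context
```

The crucial difference from the sketch is that in Cursor's version the summaries themselves are a trained behavior: they are rewarded or penalized by whether the overall task succeeds, rather than produced by a fixed heuristic.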
The results are meaningful. Cursor reports that self-summarization cuts compaction errors by 50 percent compared to heavily engineered prompt-based baselines, using one-fifth the tokens. As a demonstration, Composer 2 solved a Terminal-Bench problem — compiling the original Doom game for a MIPS processor architecture — in 170 turns, self-summarizing over 100,000 tokens repeatedly across the task. Several frontier models cannot complete it. On CursorBench, Composer 2 scores 61.3 compared to 44.2 for Composer 1.5, and reaches 61.7 on Terminal-Bench 2.0 and 73.7 on SWE-bench Multilingual.
Moonshot AI itself responded supportively after the story broke, posting on X that it was proud to see Kimi provide the foundation and confirming that Cursor accessed the model through an authorized commercial partnership with Fireworks AI, a model hosting company. Nothing was stolen. The use was commercially licensed.
Beyond attribution: The silence raises licensing and governance questions
Cursor co-founder Aman Sanger acknowledged the omission, saying it was a miss not to mention the Kimi base in the original blog post. The reasons for that silence are not hard to infer. Cursor is valued at nearly $30 billion on the premise that it is an AI research company, not an integration layer. And Kimi K2.5 was built by a Chinese company backed by Alibaba — a sensitive provenance at a moment when the US-China AI relationship is strained and government and enterprise customers increasingly care about supply chain origins.
The real lesson is broader. The whole industry builds on other people's foundations. OpenAI's models are trained on decades of academic research and internet-scale data. Meta's Llama is trained on data it does not always fully disclose. Every model sits atop layers of prior work. The question is what companies say about it — and right now, the incentive structure rewards obscuring the connection, especially when the foundation comes from China.
For IT decision-makers evaluating AI coding tools and agent platforms, this episode surfaces practical questions: do you know what's under the hood of your AI vendor's product? Does it matter for your compliance, security, and supply chain requirements? And is your vendor meeting the license obligations of its own foundation model?
The Western open-model gap is starting to close — but slowly
The good news for enterprises concerned about model provenance is that Western open models appear poised to become significantly more competitive. NVIDIA has been on an aggressive release cadence. Nemotron 3 Super, released on March 11, is a 120-billion-parameter hybrid Mamba-Transformer model with 12 billion active parameters, a 1-million-token context window, and up to 5x higher throughput than its predecessor. It uses a novel latent mixture-of-experts architecture and was pre-trained in NVIDIA's NVFP4 format on the Blackwell architecture. Companies including Perplexity, CodeRabbit, Factory, and Greptile are already integrating it into their AI agents.
Days later, NVIDIA followed with Nemotron-Cascade 2, a 30-billion-parameter MoE model with just 3 billion active parameters that outperforms both Qwen 3.5-35B and the larger Nemotron 3 Super across mathematics, code reasoning, alignment, and instruction-following benchmarks. Cascade 2 achieved gold-medal-level performance on the 2025 International Mathematical Olympiad, the International Olympiad in Informatics, and the ICPC World Finals — making it only the second open-weight model after DeepSeek-V3.2-Speciale to accomplish that. Both models ship with fully open weights, training datasets, and reinforcement learning recipes under permissive licenses — exactly the kind of transparency that Cursor's Kimi episode highlighted as missing.
What IT leaders should watch: The provenance question is not going away
The Cursor-Kimi episode is a preview of a recurring pattern. As AI product companies increasingly build differentiated applications through continued pretraining, reinforcement learning, and novel techniques like self-summarization on top of open foundation models, the question of which foundation sits at the bottom of the stack becomes a matter of enterprise governance — not just technical preference.
NVIDIA's Nemotron family and the anticipated Gemma 4 represent the strongest near-term candidates for closing the Western open-model gap. Nemotron 3 Super's hybrid architecture and million-token context window make it directly relevant for the same agentic coding use cases that Cursor addressed with Kimi. Cascade 2's extraordinary intelligence density — gold-medal competition performance at just 3 billion active parameters — suggests that smaller, highly optimized models trained with advanced RL techniques can increasingly substitute for the massive Chinese foundations that have dominated the open-model landscape.
But for now, the line between American AI products and Chinese model foundations is not as clean as the geopolitical narrative suggests. One of the most-used coding tools in the world runs on a model backed by Alibaba — and may not, at launch, have met the attribution requirements of the license that enabled it. Cursor says it will disclose the base model next time. The more interesting question is whether, next time, it will have a credible Western alternative to disclose.
