Intercom's new post-trained Fin Apex 1.0 beats GPT-5.4 and Claude Sonnet 4.6 at customer service resolutions
Intercom is taking an unusual gamble for a legacy software company: building its own AI model.
The 15-year-old customer service platform, headquartered in Dublin, Ireland, announced Fin Apex 1.0 on Thursday: a small, purpose-built AI model that the company claims outperforms leading frontier models from OpenAI and Anthropic on the metrics that matter most for customer support.
The model powers Intercom's existing Fin AI agent, which already handles over one million customer conversations weekly.
According to benchmarks shared with VentureBeat, Fin Apex 1.0 achieves a 73.1% resolution rate—the percentage of customer issues fully resolved without human intervention—compared to 71.1% for both GPT-5.4 and Claude Opus 4.5, and 69.6% for Claude Sonnet 4.6. That roughly 2 percentage point margin may sound modest, but it's wider than the typical gap between successive generations of frontier models.
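The resolution-rate metric the benchmarks rely on is simple to state: the share of conversations closed without a human handoff. A minimal illustrative sketch (the record format and the `resolved_without_human` flag are hypothetical, not Intercom's actual schema):

```python
# Illustrative sketch: computing a resolution rate as defined in the article.
# The conversation record format here is an assumption for demonstration.
def resolution_rate(conversations):
    """Fraction of conversations fully resolved with no human intervention."""
    if not conversations:
        return 0.0
    resolved = sum(1 for c in conversations if c["resolved_without_human"])
    return resolved / len(conversations)

sample = [
    {"resolved_without_human": True},
    {"resolved_without_human": True},
    {"resolved_without_human": False},
    {"resolved_without_human": True},
]
print(resolution_rate(sample))  # 0.75
```

At Fin's reported scale of millions of conversations per week, a two-point difference in this ratio translates into tens of thousands of additional tickets closed without an agent.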
"If you're running large service operations at scale and you've got 10 million customers or a billion dollars in revenue, a delta of 2% or 3% is a really large amount of customers and interactions and revenue," Intercom CEO Eoghan McCabe told VentureBeat in a video call interview earlier this week.
The model also shows significant improvements in speed and accuracy. Fin Apex delivers responses in 3.7 seconds—0.6 seconds faster than the next-fastest competitor—and demonstrates a 65% reduction in hallucinations compared to Claude Sonnet 4.6.
Perhaps most striking for enterprise buyers: it runs at roughly one-fifth the cost of using frontier models directly, and it is included in the per-outcome pricing structure of Intercom's existing customer plans.
What's the base model? Does it even matter?
But there's a catch. When asked to specify which base model Apex was built on—and its parameter size—Intercom declined.
"We're not sharing the base model we used for Apex 1.0—for competitive reasons and also because we plan to switch base models over time," a company spokesperson told VentureBeat. The company would only confirm that the model is "in the size of hundreds of millions of parameters."
That's a notably small model. For comparison, Meta's Llama 3.1 ranges from 8 billion to 405 billion parameters; even efficient open-weights models like Mistral 7B dwarf the sub-billion scale Intercom describes.
Whether Apex's performance claims hold up against that context—or whether the benchmarks reflect optimizations possible only in narrow, domain-specific applications—remains an open question.
Intercom says it learned from the backlash AI coding startup Cursor faced when critics accused the coding assistant of burying the fact that its Composer 2 model was built on fine-tuned open-weights models rather than proprietary technology. But the lesson Intercom drew may not satisfy skeptics: the company is transparent that it used an open-weights base, just not which one.
"We are very transparent that we have" used an open-weights model, the spokesperson said. Yet declining to name the model while claiming transparency is a contradiction that will likely draw scrutiny—particularly as more companies tout "proprietary" AI that amounts to post-trained open-source foundations.
Post-training as the new frontier
Intercom's argument is that the base model simply doesn't matter much anymore.
"Pre-training is kind of a commodity now," McCabe said. "The frontier, if you will, is actually in post-training. Post-training is the hard part. You need proprietary data. You need proprietary sources of truth."
The company post-trained its chosen foundation using years of proprietary customer service data accumulated through Fin, which now resolves 2 million customer queries per week. That process involved more than just feeding transcripts into a model. Intercom built reinforcement learning systems grounded in real resolution outcomes, teaching the model what successful customer service actually looks like—the appropriate tone, judgment calls, conversational structure, and critically, how to recognize when an issue is truly resolved versus when a customer is still frustrated.
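The outcome-grounded reinforcement learning the company describes can be pictured as scoring each conversation against real resolution signals. The sketch below is a loose illustration of that idea, assuming hypothetical signal names and weights; it is not Intercom's actual reward design:

```python
# Hypothetical sketch of outcome-grounded reward shaping for post-training.
# All field names and weights are illustrative assumptions.
def outcome_reward(conversation):
    """Score one support conversation for RL fine-tuning."""
    reward = 0.0
    if conversation["resolved"]:             # issue confirmed resolved
        reward += 1.0
    if conversation["customer_frustrated"]:  # e.g. negative sentiment/CSAT
        reward -= 0.5
    if conversation["hallucinated_fact"]:    # reply contradicted the knowledge base
        reward -= 1.0
    return reward

print(outcome_reward({
    "resolved": True,
    "customer_frustrated": False,
    "hallucinated_fact": False,
}))  # 1.0
```

The key point the article makes is that these signals come from real resolution outcomes rather than generic internet text, which is what a platform with years of support data can supply and a general-purpose lab cannot.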
"The generic models are trained on generic data on the internet. The specific models are trained on hyper-specific domain data," McCabe explained. "It stands to reason therefore that the intelligence of the generic models is generic, and the intelligence of the specific models is domain-specific and therefore operates in a far superior way for that use case."
If McCabe is right that the magic is entirely in post-training, the reluctance to name the base becomes harder to justify. If the foundation is truly interchangeable, what competitive advantage does secrecy protect?
A $100 million bet paying off
The announcement comes as Intercom's AI-first pivot appears to be working. Fin is approaching $100 million in annual recurring revenue and growing at 3.5x, making it the fastest-growing segment of the company's $400 million ARR business. Fin is projected to represent half of Intercom's total revenue early next year.
That trajectory represents a remarkable turnaround. When Fin launched, its resolution rate was just 23%. Today it averages 67% across customers, with some large enterprise deployments seeing rates as high as 75%.
To make this happen, Intercom grew its AI team from roughly 6 researchers to 60 over the past three years—a significant investment for a company that McCabe admits was "in a really bad place" before its AI pivot. The average growth rate for public software companies sits around 11%; Intercom expects to hit 37% growth this year.
"We're by far the first in the category to train our own model," McCabe said. "There's no one else that's going to have this for a year or more."
The speciation and specialization of AI
McCabe's thesis aligns with a broader trend that Andrej Karpathy, former AI leader at Tesla and OpenAI, recently described as the "speciation" of AI models—a proliferation of specialized systems optimized for narrow tasks rather than general intelligence.
Customer service, McCabe argues, is uniquely suited for this approach. It's one of only two or three enterprise AI use cases that have found genuine economic traction so far, alongside coding assistants and potentially legal AI. That's attracted over a billion dollars in venture funding to competitors like Decagon and Sierra—and made the space, in McCabe's words, "ruthlessly competitive."
The question is whether domain-specific models represent a durable advantage or a temporary arbitrage that frontier labs will eventually close. McCabe believes the labs face structural limitations.
"Maybe the future is that Anthropic has a big offering of many different specialized models. Maybe that's what it looks like," he said. "But the reality is that I don't think the generic models are going to be able to keep up with the domain-specific models right now."
Beyond efficiency to experience
Early enterprise AI adoption focused heavily on cost reduction—replacing expensive human agents with cheaper automated ones. But McCabe sees the conversation shifting toward experience quality.
"Originally it was like, 'Holy shit, we can actually do this for so much cheaper.' And now they're thinking, 'Wait, no, we can give customers a far better experience,'" he said.
The vision extends beyond simple query resolution. McCabe imagines AI agents that function as consultants—a shoe retailer's bot that doesn't just answer shipping questions but offers styling advice and shows customers how different options might look on them.
"Customer service has always been pretty shit," McCabe said bluntly. "Even the very best brands, you're left waiting on a call, you're bounced around different departments. There's an opportunity now to provide truly perfect customer experience."
Pricing and availability
For existing Fin customers, the upgrade to Apex comes at no additional cost. Intercom confirmed that customer pricing remains unchanged—users continue to pay per outcome as before, at $0.99 per resolved interaction, and automatically benefit from the new model.
Apex is not available as a standalone model or through an external API. It is accessible only through Fin, meaning businesses cannot license the model independently or integrate it into their own products. That constraint may limit Intercom's ability to monetize the model beyond its existing customer base—but it also keeps the technology proprietary in a practical sense, regardless of what the underlying base model turns out to be.
What's next
Intercom plans to expand Fin beyond customer service into sales and marketing—positioning it as a direct competitor to Salesforce's Agentforce vision, which aims to provide AI agents across the customer lifecycle.
For the broader SaaS industry, Intercom's move raises uncomfortable questions. If a 15-year-old customer service company can build a model that outperforms OpenAI and Anthropic in its domain, what does that mean for vendors still relying on generic API calls? And if "post-training is the new frontier," as McCabe insists, will companies claiming breakthroughs face pressure to show their work—or continue hiding behind competitive secrecy while touting transparency?
McCabe's answer to the first question, laid out in a recent LinkedIn post, is stark: "If you can't become an agent company, your CRUD app business has a diminishing future."
The answer to the second remains to be seen.