When deep research isn't enough for your business: Sakana AI launches 'ultra deep research' agent for 100+ page reports in 8 hours


Jun 15, 2026


Tokyo-based AI startup Sakana AI has officially launched its first commercial product, Sakana Marlin.

Billed as a "Virtual CSO" (Chief Strategy Officer), Marlin is an autonomous, B2B research agent that deliberately abandons the instantaneous text generation of modern chatbots in favor of deep, long-horizon reasoning.

What sets Marlin apart from the current ecosystem of AI tools is its temporal scale: instead of returning an answer in seconds, it runs continuous, self-governing reasoning loops for up to eight hours at a time to deliver deeply researched, well-cited, 100-page strategy reports and executive slides. The company has posted sample reports generated by Marlin on its product website.

Available immediately via the company’s website with pricing starting at a pay-as-you-go tier, the platform is designed strictly for enterprise use—specifically targeting corporations, financial institutions, and think tanks.

The generative AI hype cycle has largely been defined by speed. For the past two years, the industry standard has been the ability to generate a poem, a line of code, or a surface-level summary in mere milliseconds. But the enterprise frontier is rapidly shifting from shallow, rapid generation to deep, methodical reasoning.

With Marlin, major businesses are no longer asking how fast an AI can answer, but how deeply it can think.

The Product: A Virtual CSO

What exactly is a business getting when it deploys Sakana Marlin? The workflow is fundamentally different from typical large language model (LLM) interactions. Rather than engaging in a tedious back-and-forth prompt engineering session, the user simply provides a core research topic. Following a brief initial exchange to sharpen the scope and direction of the investigation, the human steps away entirely.

For the next several hours, Marlin operates as a self-contained digital strategy team. It formulates its own initial hypotheses, navigates the web to gather data, cross-references sources to verify findings, and maps the causal dynamics within complex business environments. It is effectively searching for the "winning formula" within a sea of noise.

Think of it less like a search engine and more like a junior strategy consultant locked in a room with a whiteboard and an internet connection. You provide the strategic prompt in the morning, and by the end of the workday, the system delivers a comprehensive, professional-grade portfolio.

In Marlin's case, the final output is not a generic text blob; it is a structured set of strategic options, complete with executive summary slides, appendices, references, and a deeply researched report.

The company highlighted several real-world use cases to demonstrate Marlin's capacity for complex synthesis, including generating detailed resolution scenarios for a theoretical blockade of the Strait of Hormuz, mapping out the fragmented global AI regulation patchwork, and analyzing macroeconomic trends like the return of "bond vigilantes".

Sakana says Marlin relies on multiple AI models but did not provide specific model names or providers. I've reached out on X to find out more and will update when I receive a response.

The Engine of Long-Horizon Reasoning

Under the hood, Marlin is the commercial culmination of Sakana AI’s extensive laboratory breakthroughs over the past two years.

The product is powered by an exploration engine relying on Sakana's own prior research breakthrough, Adaptive Branching Monte Carlo Tree Search (AB-MCTS), and leverages frameworks derived from "The AI Scientist," an earlier Sakana AI research project featured in the journal Nature that successfully automated the scientific discovery process from ideation to peer review.

To understand how this works in practice, consider a real-world analogy: modern chess engines. When a computer plays chess, it doesn't just look at the board and guess; it plays out thousands of potential future moves, evaluating the strength of each resulting position before committing to an action.

Marlin’s AB-MCTS engine does something similar for research.

Inside the Engine: The Mechanics of AB-MCTS

The chronology of this technology traces back to June 2025, when Sakana AI first introduced the framework to the public alongside the research paper “Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search”.

At that time, to encourage developer experimentation with collective AI intelligence, the company released the underlying algorithm as an open-source software library called TreeQuest, distributed under the permissive Apache 2.0 license. This open-source milestone laid the technical foundation for what would eventually evolve into the proprietary, enterprise-grade Marlin product a year later.

Traditionally, when developers attempt to extract higher-quality reasoning from large language models, they rely on a brute-force method called "repeated sampling"—essentially running the model dozens of times in parallel and hoping one of the answers is correct. However, repeated sampling operates blindly; it cannot evaluate its own intermediate steps or pivot based on external feedback.

AB-MCTS replaces this paradigm with a principled, multi-turn approach driven by a Bayesian decision framework. As the AI constructs a strategy report, the system treats the research process as a branching tree of possibilities. At each node of the tree, the algorithm dynamically balances two distinct behaviors based on external feedback signals:

Going Wider (Exploration): Spawning entirely new, alternative hypotheses or candidate responses when the current path yields diminishing returns or unresolved contradictions.
Going Deeper (Exploitation): Methodically refining, auditing, and building upon an existing candidate solution that shows high strategic promise.
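
The wider-versus-deeper trade-off described above can be illustrated with a toy search loop. This is a minimal sketch, not Sakana's implementation and not the TreeQuest API. It assumes each node keeps a Beta belief per action and chooses between them by Thompson sampling, and it replaces the LLM call and the external feedback signal with a hypothetical `toy_generate` function that simply returns a score in [0, 1]. The names `Node`, `toy_generate`, `search`, and `size` are invented for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    score: Optional[float]  # feedback score in [0, 1]; None for the empty root
    children: List["Node"] = field(default_factory=list)
    # [successes + 1, failures + 1]: Beta beliefs that an action improves the best score.
    wider: List[float] = field(default_factory=lambda: [1.0, 1.0])
    deeper: List[float] = field(default_factory=lambda: [1.0, 1.0])


def toy_generate(parent_score: Optional[float], rng: random.Random) -> float:
    """Hypothetical stand-in for an LLM call plus an external evaluator."""
    if parent_score is None:
        return rng.uniform(0.0, 0.6)  # a brand-new hypothesis
    # A refinement of the parent: usually a bit better, sometimes worse.
    return min(1.0, max(0.0, parent_score + rng.uniform(-0.1, 0.2)))


def search(budget: int, seed: int = 0):
    rng = random.Random(seed)
    root = Node(score=None)
    best = 0.0
    for _ in range(budget):
        node, used = root, []
        # Going deeper: descend into the best existing child while the sampled
        # belief in "deeper" beats the sampled belief in "wider".
        while node.children and rng.betavariate(*node.deeper) > rng.betavariate(*node.wider):
            used.append(node.deeper)
            node = max(node.children, key=lambda c: c.score)
        # Going wider: add a new child where the descent stopped. At the root
        # this is a fresh hypothesis; elsewhere it refines the candidate at `node`.
        used.append(node.wider)
        child = Node(score=toy_generate(node.score, rng))
        node.children.append(child)
        improved = child.score > best
        best = max(best, child.score)
        # Reward or penalize every decision taken on this path.
        for belief in used:
            belief[0 if improved else 1] += 1.0
    return root, best


def size(node: Node) -> int:
    """Number of nodes in the tree, including the root."""
    return 1 + sum(size(c) for c in node.children)
```

Calling `search(50)` spends 50 generation steps and returns the tree plus the best score found. Unlike repeated sampling, each step's outcome changes where the next step is spent.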

What transforms this from a laboratory experiment into a commercial engine is its extension into Multi-LLM AB-MCTS.

Sakana AI’s architecture introduces a critical third dimension to the search tree: the ability to dynamically choose which model to invoke for a specific sub-task, treating the industry’s leading frontier models as a plug-and-play collective intelligence network.

According to technical documentation published by the company, the engine can coordinate highly heterogeneous models—allowing an orchestration model to delegate initial ideation to one LLM, while utilizing a reasoning-heavy model to audit, verify, and correct intermediate errors generated earlier in the search tree.
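
One way to picture the model-selection dimension is as a bandit problem: the system keeps a running belief about how often each model's output passes a check and routes more calls to the model that is doing better. The sketch below is an assumption-laden illustration, not Marlin's actual routing logic, which the company has not detailed. The model names, the success rates in `TRUE_SUCCESS_RATE`, and the functions `pick_model` and `run` are all hypothetical, and the feedback is simulated with random draws rather than real model calls.

```python
import random

# Hypothetical model names and success rates, invented for this illustration.
TRUE_SUCCESS_RATE = {"model_a": 0.2, "model_b": 0.7}


def pick_model(beliefs, rng):
    """Thompson sampling: draw from each model's Beta belief, take the highest draw."""
    draws = {name: rng.betavariate(a, b) for name, (a, b) in beliefs.items()}
    return max(draws, key=draws.get)


def run(rounds, seed=0):
    rng = random.Random(seed)
    beliefs = {name: [1.0, 1.0] for name in TRUE_SUCCESS_RATE}  # uniform priors
    calls = {name: 0 for name in TRUE_SUCCESS_RATE}
    for _ in range(rounds):
        name = pick_model(beliefs, rng)
        calls[name] += 1
        # Simulated feedback: did this model's output pass the check?
        success = rng.random() < TRUE_SUCCESS_RATE[name]
        beliefs[name][0 if success else 1] += 1.0
    return calls
```

Over a few hundred rounds, `run` sends most calls to the model with the higher simulated success rate while still occasionally trying the other one.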

By scaling up compute at inference time—leveraging the distinct "personalities" and strengths of multiple foundation models over thousands of automated cycles—AB-MCTS provides the mathematical guardrails Marlin requires. It ensures that the resulting 100-page strategy reports are not merely long-winded AI generations, but the highly vetted product of systemic, automated trial-and-error.

Licensing, Data, and Enterprise Implications

It is crucial to note that Sakana Marlin is distinctly not a general consumer tool; it is a commercial software-as-a-service (SaaS) offering restricted to corporate entities, organizations, and sole proprietors.

For enterprises, licensing and data handling terms are often the determining factors in software adoption. Unlike many consumer-grade AI tools that silently harvest user inputs and proprietary data to train future foundational models, Sakana Marlin operates under a strict, enterprise-grade data policy.

Neither Sakana AI nor its external AI service providers will use customer data or inputs for model training or fine-tuning unless the client provides explicit opt-in consent.

Even with consent, data is heavily processed to remove personally identifiable information. This closed-loop security is absolutely vital for companies handling sensitive M&A research, unreleased product strategies, or proprietary market analyses.

The commercial licensing is structured into tiered pricing models that reflect its enterprise nature:

Pay-as-you-go: Users can purchase credits on demand, with a single run costing 100 credits, and add-on credits priced at ¥98 ($0.61 USD) each.
Pro Plan: At ¥150,000 ($935.68 USD) per month, businesses receive 2,000 credits, bringing down the cost of add-on credits to ¥90 ($0.56 USD).
Team Plan: Geared toward larger departments, this ¥400,000 ($2,495.14 USD) per month tier includes 6,000 credits, lowering add-on costs to ¥85 ($0.53 USD) per credit.
Enterprise: Fully custom quotes with dedicated support and customized credit allocations.
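
To compare the tiers on a per-run basis, the listed prices can be turned into a small cost calculator. This is a rough sketch under two assumptions the pricing description does not confirm: that a run costs 100 credits on every tier, and that usage beyond the included credits is billed at the tier's add-on rate. The `plans` table and `monthly_cost_yen` function are illustrative names only.

```python
CREDITS_PER_RUN = 100  # stated for a single run; assumed to apply to every tier

plans = {
    "Pay-as-you-go": {"monthly_yen": 0, "included": 0, "addon_yen": 98},
    "Pro": {"monthly_yen": 150_000, "included": 2_000, "addon_yen": 90},
    "Team": {"monthly_yen": 400_000, "included": 6_000, "addon_yen": 85},
}


def monthly_cost_yen(plan: str, runs: int) -> int:
    """Monthly bill in yen for a given number of runs on a given plan."""
    p = plans[plan]
    credits_needed = runs * CREDITS_PER_RUN
    extra_credits = max(0, credits_needed - p["included"])
    return p["monthly_yen"] + extra_credits * p["addon_yen"]
```

Under these assumptions, a pay-as-you-go run costs ¥9,800, the Pro plan covers 20 runs a month and becomes cheaper than pay-as-you-go from the 16th run, and the Team plan covers 60 runs and becomes cheaper than Pro from the 48th.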

Why Sakana Is Worth Watching

Sakana AI’s transition into a commercial enterprise powerhouse is rooted in the pedigree of its founders, who famously helped spark the current generative AI boom.

Formed in Tokyo in 2023, the startup was co-founded by Llion Jones, a co-author of Google’s seminal 2017 “Attention Is All You Need” paper who is credited with coming up with its title, and David Ha, a former Google Brain researcher and former head of research at Stability AI.

The decision to build a new laboratory outside the Silicon Valley bubble was a deliberate rejection of the current AI ecosystem. At a TED AI conference in late 2025, Jones candidly expressed that he was "absolutely sick" of transformers, warning that the intense pressure from investors and the hyper-fixation on scaling single, monolithic models had calcified the industry's creativity and blinded researchers to the next major breakthrough.

To break free from this "big company-itis," Jones and Ha structured Sakana AI around principles of biomimicry and evolutionary computing.

The company's name, derived from the Japanese word for fish, reflects its core technical philosophy: leveraging collective intelligence similar to schools of fish, ant colonies, or insect swarms. Rather than attempting to build one massive, do-it-all foundation model, Sakana’s research has consistently focused on deploying networks of smaller, specialized models that collaborate dynamically to adapt to complex environments.

This philosophy posits that by treating individual AI models as members of a "dream team" with complementary strengths, systems can achieve more robust and cost-effective reasoning than relying on sheer scale alone.

This nature-inspired approach quickly yielded dividends in rigorous, competitive testing. Sakana AI has made significant strides in "inference-time scaling"—allocating computational resources during the problem-solving phase to allow models to think, iterate, and refine their own answers over extended periods.

In early 2026, the company’s ALE-Agent took first place in the highly complex AtCoder Heuristic Contest (AHC058), a combinatorial optimization challenge, outperforming over 800 top-tier human programmers by autonomously rebuilding and testing hundreds of solutions over a four-hour window.

Similarly, Sakana introduced "RL Conductor," a small 7-billion-parameter model trained via reinforcement learning specifically to orchestrate and delegate tasks among a diverse pool of worker models—ranging from GPT-5 to Claude Sonnet 4—achieving state-of-the-art results on reasoning benchmarks at a fraction of traditional computing costs.

Sakana's rapid evolution from a disruptive research lab to a commercial software provider has attracted intense attention from global financial heavyweights.

By late 2025, the Tokyo-based startup secured a massive Series B funding round that pushed its post-money valuation past $2.6 billion, cementing its status as one of Japan’s most highly valued private tech companies. The firm boasts a sprawling roster of strategic investors, including early venture backers Khosla Ventures, Lux Capital, and New Enterprise Associates (NEA), alongside industry titans like Nvidia and Google.

As Sakana has expanded its focus toward mission-critical sectors like defense and finance, it has also drawn investments from major global banking institutions like Mitsubishi UFJ Financial Group (MUFG) and Citi, as well as enterprise tech giant Salesforce, positioning the startup to actively reshape corporate AI infrastructure from the ground up.

Community Reactions and Field Testing

Sakana AI’s shift toward commercial, long-horizon agents did not happen in a vacuum. The company ran a rigorous closed beta test beginning in April 2026, putting the tool in the hands of approximately 300 professionals across financial institutions, consulting firms, and think tanks. The feedback underscores a stark qualitative difference between standard generative chatbots and Marlin’s autonomous, fact-driven approach.

A senior consultant at a major Tokyo consulting firm noted that the tool "exceeded expectations by discovering angles we hadn't even imagined," praising its ability to match human comprehensiveness while stripping away human bias. Meanwhile, a cybersecurity division at a major Japanese IT system integrator lauded the system for providing "a highly convincing report driven by high-quality, primary research," rather than relying on recycled secondary sources.

On social media, the company’s announcement resonated with the broader tech community's growing appetite for autonomous agents.

As the AI industry matures, the value proposition is clearly shifting. Tools that act as fast, conversational encyclopedias are becoming commoditized. With Sakana Marlin, the focus moves entirely to separating the heavy lifting of thinking from the final act of deciding. By delegating the exhaustive mapping of causal dynamics to an agent capable of sustained reasoning, human executives are free to do what they do best: take action.