Opinion: AI is not a product — it’s an environment

Bill Hilf, a veteran systems architect and conservationist, argues that we must prioritize system diversity and "recovery over performance" to prevent a chain reaction of failures that could collapse our critical infrastructure.

Editor’s note: Bill Hilf is the former CEO of Vulcan/Vale Group, current board chair of Ai2 and American Prairie, and the author of the new sci-fi novel, “The Disruption,” which explores the topics of AI and natural ecosystems. He spoke about the book on the GeekWire Podcast and elaborates on the themes in this companion essay.

We are building AI at civilizational scale while still talking about it as if it were a software release.

Which model tops which benchmark. Which chatbot sounds most human. Those questions matter, but they’re the wrong altitude. AI systems no longer just answer questions. They mediate hiring, diagnostics, logistics, finance, and growing pieces of public decision-making. We are not shipping products anymore. We are reshaping environments.

At this scale, AI is heavily interconnected. It has linked failure modes. Emergent behavior. Invasive species. Tipping points.

Treating an environment like a product is a category error, and it’s already compounding.

I spent three decades building the systems now at the center of this conversation, from scientific computing at IBM to early Azure and large-scale enterprise systems at HP. The working model was deterministic: specify the system, build it, tune it, control it. If something breaks, diagnose and patch. That model works right up until it doesn’t.

At sufficient scale, distributed systems stop behaving like machines and start behaving more like ecosystems. They adapt. They route around failure. They develop dependencies no one designed and interactions no one completely understands. You can still architect and engineer them. But once they are embedded everywhere, connected to everything, and optimized across too many layers for any one person to hold in mind, they are no longer just tools.

And the curve is steepening. McKinsey’s latest State of AI says 88% of surveyed organizations now use AI in at least one business function, up from 55% two years earlier. Gartner forecasts worldwide software spending above $1.4 trillion in 2026. In investor commentary circulated this year, Thoma Bravo argues that agentic AI could create a roughly $3 trillion incremental application revenue opportunity by converting labor spend into software spend. That is not a feature upgrade. It is the system rewiring itself mid-flight, faster than most firms can govern, audit, or even classify what they have already built.

That realization didn’t come only from technology. It also came from conservation.

Ecology has a name for what happens when you pull out a load-bearing layer too fast: trophic cascade. The Aleutian fur trade nearly wiped out sea otters in the 18th century. Otters eat urchins. Urchins eat kelp. Remove the otters, and you don’t get an otter-shaped hole. You get an urchin explosion, collapsed kelp forests, and the loss of every fish nursery the kelp was quietly holding up. 

That is the pattern we should be watching in AI-dependent infrastructure. The AI will probably be better than your people at screening, scoring, and forecasting. The real problem is the speed. We are replacing the people who were providing judgment, correction, and restraint, the connective tissue that never showed up on a workflow diagram: the voice in the gray areas, the non-computable decisions. Remove that layer faster than the organization can discover what it was holding up, and you get the same cascade.

If we’re serious about building durable AI infrastructure, those patterns are worth studying, and some of the lessons are uncomfortable.

Efficiency is overrated. In technology, as in ecology, a system optimized too tightly becomes brittle. Slack and redundancy matter. So do firebreaks, and so does local autonomy.

In July 2024, a single CrowdStrike configuration update crashed 8.5 million machines worldwide. Airlines, hospitals, 911 centers, banks. $5.4 billion in losses. They reverted the bad update in 78 minutes. The recovery took days. Southwest Airlines was largely unaffected. It simply wasn’t running CrowdStrike’s software. Sometimes the absence of a dependency is its own firebreak. If every important function in your stack depends on one model, one provider, or one training pipeline, you haven’t built an intelligent marvel. You’ve built a future outage.

Ecosystems don’t only fail by cascade. They also fail by accretion. AI is entering workflows the way invasive species enter ecosystems: through low-visibility vectors, one deployment at a time. A copilot here, a summarization layer there, an autonomous scheduler somewhere no one is tracking. Each deployment is defensible on its own. The cumulative effect is something no one chose. The review and friction that kept earlier processes honest were built for human speed. Nothing has replaced them at machine speed.

A model does not remain what it was in the lab once it begins shaping the environment that later shapes it, and that is exactly what happens when AI systems are deployed into markets, media, institutions, and human behavior. You do not regulate an ecosystem by inspecting individual organisms. You regulate the conditions that determine whether the whole system recovers or collapses. Those conditions include observability.

Systems that cannot be inspected, studied, or independently evaluated are systems no one can truly understand or govern well. Openness matters here, not as a slogan, but as a requirement for analysis and earned trust. The same logic applies to fault tolerance. Before a model is allowed inside critical systems, its operator should have to prove the full environment can still function without it. That means mandatory degradation testing, the way we stress-test banks and bridges.
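
To make that concrete, here is a minimal sketch of what a degradation test might look like in practice. It is an illustration, not a reference implementation: the names (`triage`, `OfflineModelStub`, `ModelUnavailable`) are hypothetical, and the only assumption is that the surrounding workflow must still complete when the model dependency is switched off.

```python
# Hypothetical degradation test: the model dependency is deliberately disabled
# and the surrounding workflow must still produce a usable, if slower, outcome.

class ModelUnavailable(Exception):
    """Raised wherever the real model would have answered."""

class OfflineModelStub:
    """Stands in for the model during the test; it never answers."""
    def score(self, case):
        raise ModelUnavailable("model intentionally disabled for this test")

def triage(case, model, fallback_priority="manual-review"):
    """Route a case, degrading to a human review queue if the model cannot answer."""
    try:
        return {"case": case, "priority": model.score(case)}
    except ModelUnavailable:
        return {"case": case, "priority": fallback_priority}

def test_workflow_survives_model_outage():
    # The environment should keep functioning without the model, just more manually.
    result = triage({"id": 42}, model=OfflineModelStub())
    assert result["priority"] == "manual-review"
```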

Builders don’t have to wait for regulators. If an AI layer is entering a production workflow, builders need to know what happens when the model is wrong, the vendor is down, or the behavior changes after deployment. If the honest answer is “we don’t know,” the layer is not ready to be load-bearing. That’s true for a hospital triage system and for a customer support bot. It is especially true for agents with open-ended scope: software that can plan, call tools, and act inside environments no one fully controls. For those systems, model quality is the easy question. The hard one is who is accountable when it fails.
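
One hedged way to express that discipline in code is a circuit breaker around the AI layer, so that when the vendor is down or repeatedly wrong, requests fall back to a deterministic path. This is a sketch under assumed names (`call_model` and `rule_based_reply` are placeholders for whatever the real system uses), not a prescription.

```python
# Illustrative guard for an AI layer that is not yet allowed to be load-bearing:
# after repeated failures the breaker opens and requests take the non-AI path.

import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=60.0):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at > self.reset_after_s:
            self.failures, self.opened_at = 0, None  # half-open: try the model again
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()

def answer(ticket, call_model, rule_based_reply, breaker):
    """Prefer the model, but never depend on it being up or right."""
    if breaker.allow():
        try:
            return call_model(ticket)
        except Exception:
            breaker.record_failure()
    return rule_based_reply(ticket)
```

The point is not this particular pattern. It is that the non-AI path exists, is exercised, and is known to work before the AI layer is treated as load-bearing.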

Multi-agent architectures and ensemble approaches can improve resilience, but only when the diversity is real. Three agents routing to the same foundation model may improve reasoning, but they are not three independent safeguards. They are one dependency wearing three hats.
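
A small, hypothetical sanity check makes the distinction visible: count the distinct upstream providers behind an ensemble rather than the number of agents. The agent and provider names below are invented for illustration.

```python
# Illustrative diversity check: three agents on one foundation model are one
# dependency wearing three hats, not three independent safeguards.

from collections import Counter

def independent_safeguards(agents):
    """Count distinct upstream model providers behind a set of agents."""
    providers = Counter(agent["provider"] for agent in agents)
    return len(providers), providers

agents = [
    {"name": "planner",  "provider": "foundation-model-A"},
    {"name": "reviewer", "provider": "foundation-model-A"},
    {"name": "executor", "provider": "foundation-model-A"},
]

distinct, breakdown = independent_safeguards(agents)
print(distinct, dict(breakdown))  # 1 {'foundation-model-A': 3}
```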

There’s a broader strategic consequence here. In stable ecosystems, dominant species compound their advantage slowly. Shorten the disturbance cycle and many of those advantages erode before they mature. That is happening to business moats now. When disruption gets radically cheaper, the winning question stops being what you’re building and becomes what still compounds when nothing around you lasts. In real-world deployments, the ‘best’ model loses to the most adaptive system.

Recovery matters as much as prevention. In the conservation work I do, the question is never how to stop change. Disturbance is inevitable. The question is what survives, how quickly a system recovers, and what hidden capacities remain after the shock. We should ask the same of AI-dependent infrastructure. Not just “Is it safe?” but “How does it fail? Who can override it? How far does the failure spread? What grows back after the mistake?”

The thing that breaks, in my experience, is the assumption of control. Real systems do not collapse cleanly and they do not recover cleanly. Some parts fail. Some adapt. Some mutate into things no one intended.

Nature has been running distributed sensing, local response, and recovery for hundreds of millions of years. It has been operating the kind of network we keep trying to invent. Not because forests are conscious or because the planet is an AI, but because the engineering problems are structurally similar: how does a system without central control maintain coherence, adapt to damage, and persist across time?

The question is no longer just what AI systems can do. It is what kind of world they create around themselves, what kind of world they inherit from us, and whether we are wise enough to build systems that we can still steer.

If we take this seriously, a few principles follow. Design for diversity before efficiency. Build for recovery before performance. Keep humans in the loop, not as a compliance measure but as the system’s stewards, its source of judgment, and its memory of why it exists. Insist on openness, at all levels, as the precondition for trust at scale. None of this slows AI down. It’s what keeps AI working the day something fails.

You can switch off a machine.

You have to live within an ecosystem.
