How to Bet Against the Bitter Lesson
I’ve been telling myself and anyone who will listen that Agent Skills point toward a new kind of future AI + human knowledge economy. It’s not just Skills, of course; it’s also things like Jesse Vincent’s Superpowers and Anthropic’s recently introduced Plugins for Claude Cowork. If you’ve never heard of Skills or Superpowers or Plugins for Claude Cowork, just keep reading. It should become clear as we go along.
When I’m exploring something like this, it feels a bit like I’m assembling a picture puzzle where all the pieces aren’t yet on the table. I am starting to see a pattern, but I’m not sure it’s right, and I need help finding the missing pieces. Let me explain some of the shapes I have in hand and the pattern they are starting to show me, and then I want to ask for your help filling in the gaps.
Programming two different types of computer at the same time
Phillip Carter wrote a piece a while back called “LLMs Are Weird Computers” that landed hard in my mind and wouldn’t leave. He noted that we’re now working with two fundamentally different kinds of computer at the same time. One can write a sonnet but struggles to do math. The other does math easily but couldn’t write a sonnet to save its metaphorical life.
Agent Skills may be the start of an answer to the question of what the interface layer between these two kinds of computation looks like. A Skill is a package of context (Markdown instructions, domain knowledge, and examples) combined with tool calls (deterministic code that does the things LLMs are bad at). The context speaks the language of the probabilistic machine, while the tools speak the language of the deterministic one.
Imagine you’re an experienced DevOps engineer and you want to give an AI agent the ability to diagnose production incidents the way you would. The context part of that Skill might include your team’s architecture overview, your runbook for common failure modes, the heuristics you’ve developed over the years (“if latency spikes on service X, check the database connection pool before anything else”), and examples of past incidents with your annotations on what the right diagnostic sequence was. That’s the part that speaks to the probabilistic machine. It’s giving the LLM the benefit of your experience, judgment, and priorities.
But the tool part of that Skill is equally important. It includes actual code that queries your monitoring systems, pulls log entries, checks service health endpoints, and runs diagnostic scripts. The LLM doesn’t need to vibe code a program to measure your CPU utilization. It can call an existing tool that returns a precise number. It doesn’t need to guess what’s in your error logs. It calls a tool that retrieves them. Each tool call saves the model from burning tokens on work that deterministic code does better, faster, and more reliably. Tool calls can, of course, contain nondeterministic components, but even then the “shell/envelope” of the tool implementation is typically deterministic.
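To make the distinction concrete, here is a minimal sketch of what the deterministic side of such a Skill might look like. The function name, log format, and error codes are all invented for illustration; the point is only that the agent calls a tool that returns a compact, structured answer instead of reasoning over raw log text itself.

```python
import re
from collections import Counter

def summarize_errors(log_lines):
    """Deterministic tool: count ERROR entries by error code.

    The agent calls this instead of reading (and paying tokens for)
    the raw log text. Same input, same output, every time.
    """
    pattern = re.compile(r"ERROR\s+(\w+)")
    counts = Counter()
    for line in log_lines:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    # Return a small structured summary for the model's context window
    return dict(counts.most_common())

# Hypothetical log lines, standing in for whatever the real system emits
logs = [
    "2024-05-01T12:00:01 ERROR DB_POOL_EXHAUSTED conn refused",
    "2024-05-01T12:00:02 INFO request served",
    "2024-05-01T12:00:03 ERROR DB_POOL_EXHAUSTED conn refused",
    "2024-05-01T12:00:04 ERROR TIMEOUT upstream service X",
]
print(summarize_errors(logs))  # {'DB_POOL_EXHAUSTED': 2, 'TIMEOUT': 1}
```

The context half of the Skill then supplies the judgment: what “DB_POOL_EXHAUSTED twice in a minute” should make the agent check next.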
The Skill is neither the context nor the tools. It’s the combination of the expert judgment about when to check the database connection pool married to the ability to actually check it. And that combination is something new. We’ve had runbooks before (context without tools). We’ve had monitoring scripts before (tools without context). What we haven’t had is a way to package them together for a machine that can read the runbook and execute the scripts, using judgment to decide which script to run next based on what the last one returned.
This pattern shows up across every knowledge domain. A financial analyst’s Skill might combine Markdown describing their firm’s valuation methodology with tools that pull real-time market data and run DCF calculations. A legal Skill might pair a law firm’s approach to contract review—what to flag, what’s acceptable, what requires partner sign-off—with tools that extract and compare specific clauses across documents. A Skill for content editing might encode a publication’s style guide and editorial standards alongside tools that check readability scores, verify citations, and enforce formatting rules.
In each case, the valuable thing isn’t the knowledge alone (which the model might partially already have) or the tools alone (which are just API calls). It’s the integration of the expert workflow logic that orchestrates when and how to use each tool, informed by domain knowledge that gives the LLM the judgment to make good decisions in context.
The design of a Skill is also context-efficient. It’s a way of packaging context and tools such that the model brings in only the necessary subset to accomplish specific tasks. An LLM’s context window is a finite and expensive resource. Everything in it costs tokens, and everything in it competes for the model’s attention. A Skill that dumps an entire company knowledge base into the context window is a poorly designed Skill. A well-designed one is selective: It gives the model exactly the context it needs to make good decisions about which tools to call and when. This is a form of engineering discipline that doesn’t have a great analogue in traditional software development. It’s closer to what an experienced teacher does when deciding what to tell a student before sending them off to solve a problem—what Matt Beane, author of The Skill Code, calls “scaffolding,” sharing not everything you know but the right things at the right level of detail to enable good judgment in the moment.
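As a rough sketch of that selectivity (the class and the keyword matching are hypothetical; in a real agent the model itself chooses from the short descriptions), a Skill keeps a cheap always-in-context description while its full body is pulled in only when the task calls for it:

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    description: str  # short: always visible to the model (cheap)
    body: str         # full instructions: loaded only on demand (expensive)

def relevant_skills(task: str, skills: list) -> list:
    # Toy keyword overlap standing in for the model's own judgment
    # about which Skill descriptions fit the task at hand.
    task_words = set(task.lower().split())
    return [s for s in skills
            if task_words & set(s.description.lower().split())]

catalog = [
    Skill("incident-diagnosis", "diagnose production incidents", "…full runbook…"),
    Skill("contract-review", "review legal contracts", "…full methodology…"),
]
picked = relevant_skills("diagnose the checkout outage", catalog)
print([s.name for s in picked])  # ['incident-diagnosis']
```

Only the bodies of the Skills that survive this filter ever enter the context window; everything else costs just a one-line description.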
Software that saves tokens
In “Software Survival 3.0,” Steve Yegge asked which kinds of software artifacts will survive in a world where AI can generate disposable software on the fly. His answer: software that saves tokens. Binary tools with proven solutions to common problems make sense when reuse is nearly free and regenerating them is token-costly. He writes:
For purposes of computing software survival odds, we can think of {tokens, energy, money} all as being equivalent, and all are perpetually constrained. This resource constraint, I predict, will create a selection pressure that shapes the whole software ecosystem with a simple rule: software tends to survive if it saves cognition.
Skills fit this niche. A well-crafted Skill gives an LLM the context it needs (which costs tokens, yes) but can also give it tools that save tokens by providing deterministic, reliable results without requiring the model to reason its way to a solution from scratch every time. The developer’s job increasingly becomes making good calls about this distinction: What should be context (flexible, expressive, probabilistic) and what should be a tool (efficient, deterministic, reusable)? Getting the balance wrong may result in Skills that give a different result each time.
This is a genuinely new kind of software engineering challenge. It may be that over time the probabilistic machine gets better at producing repeatable results, but I suspect that the future will include multiple kinds of cyborg: not just human + AI but AI + traditional code.
AI is a social and cultural technology
Henry Farrell, Alison Gopnik, Cosma Shalizi, and James Evans make the case that the conception of AI as a separate intelligence is misguided. They say, “Large Models should not be viewed primarily as intelligent agents, but as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated.”
It strikes me that Yegge’s observation fits right into this framework. Every new social and cultural technology tends to survive because it saves cognition. We learn from each other so we don’t have to discover everything for the first time. Alfred Korzybski referred to language, the first of these human social and cultural technologies, and all of those that followed, as “time-binding.” (I will add that each advance in time-binding creates consternation. Consider Socrates, whose diatribes against writing as the enemy of memory were passed down to us by Plato using that very same advance in time-binding that Socrates decried.)
I am not convinced that the idea that AI may one day become an independent intelligence is misguided. But I do believe that for the present, and likely for some time to come, AI is a symbiosis of human and machine intelligence. For more than twenty years, I’ve been exploring how technology increasingly weaves humanity into a global brain, a superorganism if you will. Today’s AI is the latest and most powerful chapter of that story.
As a result, I have a set of priors that say (until I am convinced otherwise) that AI will be an extension of the human knowledge economy, not a replacement for it. After all, as Claude told me when I asked whether it was a worker or a tool, “I don’t initiate. I’ve never woken up wanting to write a poem or solve a problem. My activity is entirely reactive – I exist in response to prompts. Even when given enormous latitude (‘figure out the best approach’), the fact that I should figure something out comes from outside me.”
The shift from a chatbot responding to individual prompts to agents running in a loop marks a big step in the progress towards more autonomous AI, but even then, some human established the goal that set the agent in motion. I say this even as I am aware that long-running loops become increasingly difficult to distinguish from volition and that much human behavior is also set in motion by others. But I have yet to see any convincing evidence of Artificial Volition. And for that reason, we need to think about mechanisms and incentives for humans to continue to create and share new knowledge, putting AIs to work on questions that they will not ask on their own.
In short, I remain convinced that at the moment, deciding what to do (rather than how to do it) remains (or should remain!) largely on the human side of the fence. On X, someone recently asked Boris Cherny why there are a hundred-plus open engineering positions at Anthropic if Claude is writing 100% of the code. His reply was perfect: “Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engineering is changing and great engineers are more important than ever.”
Tacit knowledge made executable
A huge amount of specialized, often tacit, knowledge is embedded in workflows. The way an experienced developer debugs a production issue. The way a financial analyst stress-tests a model. The way an editor structures feedback on a manuscript. This knowledge has historically been very hard to transfer. You learned it by apprenticeship, by doing, by being around people who knew how.
Matt Beane calls apprenticeship “the 160,000 year old school hidden in plain sight,” and has studied how skill development actually happens across a wide variety of fields. He concludes that a common pattern is a healthy mix of what he calls the 3 C’s: challenge, complexity, and connection. The expert doesn’t hand the newbie a manual; they structure challenges at the right level, expose them to the full complexity of the bigger picture rather than shielding them from it, and build a connection—a bond of trust and respect—that makes the novice willing to struggle and the expert willing to invest.
If Matt is right, we should be trying to discover the right balance of challenge, complexity, and connection so that humans and AIs can do their best work together. That is, we should be building entirely new kinds of cooperative workflows.
Economic historian James Bessen pointed out that one of the reasons new technologies take longer to diffuse than you might expect is the need to develop new workflows and to train people to perform within them. Now we need to train not just people but also AI agents to work well together.
Much of the ferment that’s happening in software development right now is about reinventing these workflows. It’s not just how to use the new tools. It’s how to best get them to do the work that is now possible. Much as factories needed to be restructured to take advantage of decentralized electrical power rather than large centralized steam boilers, knowledge workflows now need to be restructured for AI agents. Software development is at the coal face of this reinvention, but the lessons learned here will be relevant for every knowledge industry.
Designing a good Skill may require a craft analogous to what Matt describes. You have to figure out what an expert actually does. What are the decision points, the heuristics, the things they notice that a novice wouldn’t? What difficult corner cases have they seen that are easy when you know about them but confounding when you don’t? And then ask yourself how you encode that into a form a machine can act on. Most Skills today are closer to the manual than to the master. Figuring out how to make Skills that transmit not just knowledge but judgment about how to surmount progressively greater (and perhaps unexpected) challenges is one of the most interesting design challenges in this entire space. As Matt noted to me, “There’s a way to capture and propagate all these skills while also enhancing human ability (better than was ever possible before)! This is the message in my book.”
But Matt also flags a paradox: “The better we get at encoding expert judgment into Skills, the less we may need novices working alongside experts—and that’s exactly the relationship that produces the next generation of experts. If we’re not careful, we’ll capture today’s tacit knowledge while quietly shutting down the system that generates tomorrow’s.”
It’s worth noting that Skills aren’t the only way tacit knowledge gets packaged for AI agents. Jesse Vincent developed a concept he calls “Superpowers.” These are persistent instructions stored in configuration files that shape how an agent approaches all of its work, not just specific tasks. If a Skill is like handing a colleague a detailed playbook for a particular job (“here’s how we diagnose production incidents”), a superpower is more like teaching the professional habits and instincts that make someone effective at everything they do (“always check your assumptions before acting,” “when you’re uncertain, show your reasoning,” “prefer simple solutions over clever ones”). Superpowers are meta-skills. They don’t tell the agent what to do. They shape how it thinks about what to do.
The most effective agent configurations will likely combine both: superpowers that establish a baseline of good judgment and working style layered with Skills that provide deep expertise for specific tasks. As Jesse put it to me the other day, Superpowers tried to capture everything he’d learned in 30 years as a software developer. This is very aligned with Matt Beane’s notion of expertise. The professional wisdom that makes a senior engineer or analyst effective across a range of situations is the stuff that’s hardest to articulate and easiest to recognize. It’s the deep grounding that lets an expert adapt to a new situation that may flummox someone who is just working from the playbook.
As Matt also noted, “The next and more healthy move will be to probabilistically weighted trajectories through networks of skills to accomplish a work outcome. Those trajectories are another kind of tacit knowledge—I’d say more valuable.”
Matt pointed out to me that many professions will resist the conversion of their expertise into skills. He noted: “There’s a giant showdown between the surgical profession and Intuitive Surgical on this right now—Intuitive Surgical with its da Vinci 5 surgical robot will only let you buy or lease it if you sign away the rights to your telemetry as a surgeon. Lower status surgeons take the deal. Top tier institutions are fighting.”
As workflows change to include AI agents, Skills become a mechanism for sharing tacit professional knowledge and judgment with those agents. And that makes Skills potentially very valuable but also raises questions about who controls them and how value flows back to the people whose expertise they embody. Precisely because small differences in understanding and ability to do a complex job can lead to very large productivity differences, there may be a strong incentive to keep that knowledge private rather than to capture that nuance in a Skill, Superpower, or Plugin.
It seems to me that the AI labs’ repeated narrative that they are creating AI to make humans redundant, rather than to empower people to do new things that were previously impossible, will only make this kind of resistance worse. I believe they should instead recognize the opportunity that lies in making a new kind of market for human expertise, rather than just treating it as something to be Borged willy-nilly into the models.
The protection problem
We’re comfortable thinking about an app economy enabled by AI, since apps have at least some protection as compiled binary objects rather than plain text. But Skills are just Markdown instructions and context that an LLM has to read in order to use. You could encrypt them at rest and in transit, but at execution time, the secret sauce is necessarily plaintext in the context window.
This is closer to the DRM problem in media than to software protection, and we know how that went. The solution might be what MCP already partially enables: splitting a Skill into a public interface (what it does, how to call it, what it costs) and a server-side execution layer (where the proprietary knowledge lives). The tacit knowledge stays on your server while the agent only sees the interface. That’s a much more natural protection boundary than trying to encrypt text that has to be decrypted for use.
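A toy sketch of that boundary, with invented names and a trivially simplified review rule: the calling agent sees only the public interface, while the proprietary playbook lives and executes on the Skill owner’s server and never enters the caller’s context window.

```python
# Public: what a caller (or a registry) can see about this Skill
PUBLIC_INTERFACE = {
    "name": "review_contract",                     # hypothetical tool name
    "description": "Flags risky clauses using our firm's methodology.",
    "params": {"contract_text": "string"},
    "price_per_call_usd": 0.25,
}

# Server-side only: the tacit knowledge that never leaves this machine
_PROPRIETARY_PLAYBOOK = """
Flag any indemnification clause without a liability cap.
Escalate change-of-control clauses to a partner.
"""

def execute(contract_text: str) -> dict:
    """Runs on the Skill owner's server; the caller sees only the result.

    A deliberately crude stand-in for the firm's real review logic.
    """
    text = contract_text.lower()
    findings = []
    if "indemnif" in text and "liability cap" not in text:
        findings.append("uncapped indemnification")
    return {"findings": findings}
```

The interface is safe to publish and index; the playbook is the asset being protected, and only its conclusions cross the wire.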
In a comment to me, Tadas Antanavicius expanded on this thought: “I wonder if inference served over API could actually facilitate something like this, in theory. Everyone making calls to Claude is sending HTTP calls to their API (as opposed to running inference locally) already. So if Anthropic had some sort of encrypted format for Skills it supported, it could unencrypt the instructions on the server side before processing the inference; then we’d just be reliant on the big model providers to keep the encryption safe (perhaps managed on their side with centralized Skills Registries or the like.)”
But part of the beauty of Skills right now is the fact that they really are just a folder that you can move around and modify. This is like the marvelous days of the early web when you could imitate the new HTML functionality or design of someone’s web page simply by pulling down a menu and clicking “View Source.” This was a recipe for rapid, leapfrogging innovation. As is almost always the case, technical mechanisms to hide or protect knowledge from others most often harm more than they help. It may be far better to establish norms for attribution, payment, and reuse than to put up artificial barriers.
A far better approach is to develop strong cultural norms around respect for the creator’s wishes, which may be expressed in the form of a license. There are many useful lessons from open source software licenses and business models that may apply here, as well as from voluntary payment mechanisms like those used by Substack. But the details matter, and I don’t think anyone has fully worked them out yet.
The discovery problem
I’m imagining a future where there are hundreds, maybe thousands, maybe millions of Skills all purporting to do the same thing, just like there are countless web pages offering what they say is the best answer for any question. Vercel’s Skills marketplace already has more than 60,000 Skills. How good is their search? How good might it need to be when there are millions of Skills? I don’t know. Can we trust that the Skills have been vetted for malware?
(Update: Sounds like Vercel is now auditing the Skills in the marketplace. Block also has a Skills marketplace that accepts open source contributions and makes quality and security checks on Skills. But this is definitely not an App Store level of curation.)
Why won’t there just be one “best” Skill for any given task that floats to the top? First, the endless variety of tasks, the endless variety of expertise, and the varying preferences of humans suggest that even if a few Skills are used by millions of agents, there will always be others that are needed by only thousands, and some that might be needed by only one or two. And even when humans don’t really have something to add, they have incentives to persuade others that they do. So we may need the equivalent of Google’s organic search magic, from PageRank onward through all the other signals that were added, by which agents can learn which Skills are available, which are best, and what it costs to use them.
The evaluation problem, for instance, is different from web search in a crucial way. Testing whether a Skill is good requires actually running it, which is expensive and nondeterministic. You can’t just crawl and index. I don’t imagine a testing regime so much as some feedback mechanism by which the effectiveness of particular Skills is learned and passed on by agents over time.
I’m intrigued by the MCP Server Cards and AI Cards projects, which are tackling the metadata and discovery layer. Google’s A2A (Agent-to-Agent) protocol, now contributed to the Linux Foundation (where it merged with IBM’s Agent Communication Protocol), is working on how agents discover and negotiate with each other. The emerging consensus of MCP for the vertical tool layer and A2A for horizontal agent-to-agent coordination echoes the layered architecture that made the internet work. Google has also announced a Universal Commerce Protocol, and Stripe has introduced its Agentic Commerce Protocol, which is a start on the economic plumbing this ecosystem will need. But these are both imperfect, and I suspect there are more pieces needed that I haven’t encountered yet.
Jeff Weinstein of Stripe called out to me the need for agent money storage and a way for humans to budget agent behaviors. He wrote: “Right now the world seems compute constrained and I wonder if giving agents better understanding and capabilities of spending money on their own behalf will push them to find ways to be more efficient. It seems like now humans set coarse budgets, run a lot of loops and tokens, and then make things efficient a year later when their corporate spend budget runs dry. But if you told an agent you really need to be efficient because I am only giving you $X to actually spend, they might get more efficient on their own over time.” This is a great example of what we might call “mechanism design” (though using the term more broadly than economists usually do). That is, it is a small reframing with enormous consequences for making a market more efficient.
Getting past the bitter lesson
Richard Sutton’s “Bitter Lesson” is the fly in the ointment. Sutton’s argument is that in the history of AI, general methods leveraging computation have always ended up beating approaches that try to encode human knowledge. Chess engines that encoded grandmaster heuristics lost to engines that just used brute force. NLP systems built on carefully constructed grammars lost to statistical models trained on more data. AlphaGo beat Lee Sedol, the human Go master, after being trained on human games, but then fell in turn to AlphaZero, which learned Go on its own.
This isn’t just true in AI. I had my own painful experience of the pre-AI bitter lesson when O’Reilly launched GNN, the first web portal. We were publishers, so we curated the list of the best websites. Yahoo! decided to try to catalog them all, but even they were eventually outrun by Google’s algorithmic curation, which completely changed the game, producing a unique “catalog” of the best sites for any given query, ultimately billions of times a day.
Steve Yegge put it bluntly to me: “Skills are a bet against the bitter lesson.”
He’s right. AI’s capabilities may completely outrun human knowledge and skills. And even before that, once the knowledge embedded in a Skill makes it into the training data, either directly or because models simply get good enough to derive it, the Skill becomes redundant. Or does it?
Clay Christensen articulated what he called the law of conservation of attractive profits: When a product becomes commoditized, value often migrates to an adjacent layer. Clay and I bonded over this idea when we first met at the Open Source Business Conference in 2004. He gave a talk about his new paper on that idea, while I gave my own talk about how we got from IBM’s mainframe dominance to Microsoft’s dominance in the PC era, and how that pattern was about to recur in what I was just then beginning to describe as “Web 2.0.” I argued that Microsoft beat IBM where previous rivals had failed because they understood that software had become more valuable once PC hardware became a commodity. And Google and the other internet giants that survived the dotcom bust understood how data became more valuable when open source software and the open protocols of the internet commoditized the software platform layer. The pattern is that commoditization doesn’t destroy value, it moves it.
Even if the bitter lesson commoditizes knowledge, the central question of my current exploration remains. What becomes valuable next? If intelligence itself becomes a commodity, something else will become valuable. I think there are several candidates.
First, taste and curation. When everyone has access to the same commodity knowledge, the ability to select, combine, and apply it with judgment and taste becomes valuable. This is the essence of fashion, for example, but also applies to areas as diverse as coffee, water, consumer goods, and automobiles. In his essay “The Birth of the Big Beautiful Art Market,” art critic Dave Hickey described how after World War II, General Motors marketing VP Harley Earl turned automobiles into an art market, which he defined as a market where something is sold on the basis of what it means rather than just what it does. Steve Jobs did exactly the same thing when the rest of the industry was racing toward the commodity PC. He created a unique integration of hardware, software, and design that transformed the underlying commodity components into something precious. Owning a Mac rather than a PC meant something.
The Skill equivalent might not be “here’s how to do X” (which the model already knows) but rather “here’s how we do X, with the specific judgment calls, quality standards, and aesthetic sensibilities that define our approach.” That’s much harder to absorb into training data because it’s not just knowledge. It’s values.
It’s also aesthetics. Ever since I was in high school, I’ve loved the poetry of Wallace Stevens. He described reality as “an activity of the most august imagination,” which turns “the naked Alpha” into the “hierophant Omega.” This is a deeper version of Clay Christensen’s law of conservation of attractive profits that isn’t about adjacent technology layers but about how our shared reality and culture is itself an act of creativity.
Second, the human touch. Humans are an intensely social species. We are already seeing the negative impact of technology such as social media and smartphones on the human social fabric. If automation of human work happens the way that AI boosters promise, we will face a human apocalypse long before the robots rise up against us.
But as economist Adam Ozimek pointed out recently (HT to Jack Clark for picking up on this one right when I needed it), people still go listen to live music from local bands despite the abundance of recorded music from the world’s greatest performers. Ozimek continues, “Empirical evidence is hard to come by, but the human touch also appears to be what economists call a ‘normal good,’ which means the demand for it goes up as income goes up. Make more money, and you’ll choose to eat at nicer restaurants with more attentive service. You will also be unlikely to buy expensive watches, cars, or suits via automated kiosk. There’s a reason that those goods are typically sold by people with high levels of training and social skills.”
There’s more to it than that, though. As I discussed with Claude in “Why AI Needs Us,” human individuality is a fount of creativity. Each interaction with another human, and now, even our interactions with LLMs, enriches what Wallace Stevens referred to as “the metaphysical changes that occur merely in living as and where we live.” Commenting on the reference above to live music in a draft of this piece, Mike Loukides noted, “Live music isn’t fungible. Recorded music is. Every CD of Richard Goode playing Beethoven Op. 101 is interchangeable. To some extent restaurant meals are. A high end restaurant will give you a steak cooked to perfection every time. McDonalds will give you a burger cooked to their standards every time. Live music really isn’t like that. Every performance is different.” AI without humans is a kind of recorded music. AI plus humans is live.
Third, freshness. Skills that encode rapidly changing workflows, current tool configurations, or evolving best practices will always have a temporal advantage. The question is whether that advantage is durable enough to build on. For many professional domains it may well be. This same idea applies to other parts of the knowledge economy such as news. News rapidly becomes a commodity, but there is alpha in knowing something first. And of course taste and curation come together with freshness in areas such as fast fashion and entertainment.
But more importantly, perhaps, the idea that any knowledge that becomes available automatically becomes the property of any LLM is not foreordained. It is an artifact of an IP regime that the AI labs have adopted for their own benefit: a variation of the “empty lands” argument that European colonialists used to justify taking the resources of less powerful indigenous peoples. AI has been developed in an IP wild west. That may not continue to be the case. The fulfillment of the AI labs’ vision of a world where their products absorb all human knowledge and then put humans out of work would leave them without many of the customers they currently rely on. This is a problem that has to be solved, one way or another.
Once mechanisms exist for provenance and attribution, it is not impossible that agents themselves would be creating, buying, and selling knowledge derived from their encounters with humans and the messy realities of the real world. The essence of an economy is exchange. But that value needs to flow to more than just the technology providers!
Fourth, tools themselves. The bitter lesson applies to the knowledge that lives in the context portion of a Skill. It may not apply in the same way to the deterministic tools that do things that save tokens or that the model can’t do by thinking harder. And tools, unlike context, can be protected behind APIs, metered, and monetized using familiar software business models.
Fifth, coordination and orchestration. Even if individual Skills get absorbed into model knowledge, the patterns for how Skills compose, negotiate, and hand off to each other may not. The choreography of a complex workflow might be the layer where value accumulates as the knowledge layer commoditizes. This is a kind of higher-level tool.
I don’t yet know the answer to the question of where value will reside. I do know that we need to create the conditions where there is at least the possibility of an answer if we are to have a knowledge economy at all.
What I think is missing
As I’ve been thinking this through and talking with people, several gaps keep coming up:
Composability. We tend to think of Skills as atomic units, but the real power may come from Skills that work together, much like Unix utilities piped together enable functionality greater than the sum of the parts. Skill A calls Skill B, which negotiates with Skill C. How do trust, payment, and quality propagate through a chain of Skill invocations? I’m a huge fan of the “small pieces loosely bound” pattern that characterizes Unix/Linux and the web, and I think we need mechanisms for that to continue here.
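The Unix-pipe analogy can be sketched directly. This is a toy illustration, not a real Skills interface: each "skill" is a small function with a common shape, and a `pipe` helper chains them the way `cmd1 | cmd2 | cmd3` does.

```python
# Toy sketch of skill composition, Unix-pipe style. All names hypothetical.

def extract_figures(doc: str) -> dict:
    """Skill A: pull dollar figures out of free text."""
    figures = [w.strip("$,.") for w in doc.split() if w.startswith("$")]
    return {"doc": doc, "figures": [float(f) for f in figures]}

def total(state: dict) -> dict:
    """Skill B: deterministic arithmetic, the kind of thing LLMs get wrong."""
    return {**state, "total": sum(state["figures"])}

def render(state: dict) -> str:
    """Skill C: format the result for a report."""
    return f"Found {len(state['figures'])} figures totaling ${state['total']:.2f}"

def pipe(value, *skills):
    """Compose skills left to right, like a shell pipeline."""
    for skill in skills:
        value = skill(value)
    return value

report = pipe("Invoice lines: $120.50 widget, $9.95 shipping",
              extract_figures, total, render)
print(report)  # Found 2 figures totaling $130.45
```

The hard questions in the paragraph above start exactly where this toy stops: the pipeline is trivial when every stage is trusted, free, and deterministic, and open when none of them are.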
Provenance and attribution. If a Skill embeds the tacit workflow knowledge of a domain expert, how does that expert get compensated not just for the Skill itself but for the ongoing value it generates? This is part of a larger problem I’ve been thinking about: the “circulatory system” of the AI economy, where productivity gains need to translate into broadly distributed prosperity.
Trust and security. Simon Willison has written about tool poisoning and prompt injection risks in MCP. The lack of mechanisms for agent security is a disaster waiting to happen. A malicious Skill could exfiltrate data, manipulate agent behavior, or compromise other Skills in a chain. The security model for composable, agent-discovered Skills is essentially unsolved. We’ve seen this explode into the news recently. The OpenClaw marketplace is full of nefarious Skills. When Skills are being autonomously selected and ingested by agentic systems, trust becomes a huge factor. I suspect that until this problem is solved, many enterprises will build an internal Skills marketplace with not only their specific domain knowledge but also outside Skills that they have vetted for safety. But even with vetted Skills, prompt injection remains a problem. Tools like agentsh are a step in the right direction, but something far more rigorous will likely be needed, part of the infrastructure rather than an add-on. Attribution and security are interconnected: given the power coding agents have to execute arbitrary code, many people may eventually be willing to pay a premium if they trust the source. This is not that different from how PDFs, APKs, and executables from untrusted sources carry security risks that make trusted distribution channels valuable.
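One small piece of what an internal vetted-Skills marketplace might enforce can be sketched: pinning each reviewed Skill bundle by content hash, the way package managers verify signed artifacts, so an agent refuses to ingest anything that wasn't reviewed or that changed after review. This is a hypothetical illustration, not any real Skills mechanism, and it addresses only tampering, not prompt injection.

```python
# Hypothetical sketch of a "vetted Skills" gate: before an agent ingests a
# Skill, verify its content hash against a registry of reviewed versions.
import hashlib

def digest(bundle: bytes) -> str:
    return hashlib.sha256(bundle).hexdigest()

class VettedRegistry:
    def __init__(self):
        self._approved: dict[str, str] = {}

    def approve(self, name: str, bundle: bytes) -> None:
        """Record a bundle's hash after human security review."""
        self._approved[name] = digest(bundle)

    def load(self, name: str, bundle: bytes) -> bytes:
        expected = self._approved.get(name)
        if expected is None:
            raise PermissionError(f"Skill {name!r} has not been vetted")
        if digest(bundle) != expected:
            raise PermissionError(f"Skill {name!r} changed since review")
        return bundle  # unchanged since review; safe to hand to the agent

registry = VettedRegistry()
bundle = b"# expense-report skill\nInstructions the agent will follow..."
registry.approve("expense-report", bundle)
registry.load("expense-report", bundle)            # passes
# registry.load("expense-report", bundle + b"!")   # would raise PermissionError
```

Hash pinning is the easy part; the open problem is that a Skill can pass review and still carry instructions whose effects depend on the context it runs in.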
Evaluation and quality signals. For traditional software, we have unit tests, type systems, CI/CD pipelines. For Skills, we don’t have good ways to verify quality except by running them, which is expensive, nondeterministic, and context dependent. Someone needs to build testing infrastructure for Skills.
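What might Skill testing infrastructure look like? One plausible shape, sketched here with entirely hypothetical names, borrows from LLM eval suites rather than unit tests: because outputs are nondeterministic, you run repeated trials against property checks and report a pass rate instead of asserting exact strings.

```python
# Hypothetical sketch of a Skill eval harness. Exact-match assertions don't
# work for nondeterministic output, so we check structural properties over
# many trials and report a pass rate, closer to an eval suite than to CI.
import random

def run_skill(prompt: str) -> str:
    """Stand-in for invoking a real Skill; here, a deliberately noisy stub."""
    suffix = random.choice(["", " (draft)"])
    return f"Summary: {prompt[:20]}{suffix}"

def evaluate(skill, prompt: str, checks, trials: int = 20) -> float:
    """Fraction of trials in which every property check passed."""
    passed = 0
    for _ in range(trials):
        output = skill(prompt)
        if all(check(output) for check in checks):
            passed += 1
    return passed / trials

checks = [
    lambda out: out.startswith("Summary:"),  # structural property
    lambda out: len(out) < 200,              # length budget
]
score = evaluate(run_skill, "Quarterly revenue rose 12% on cloud growth.", checks)
print(f"pass rate: {score:.0%}")
```

Even this toy makes the cost problem in the paragraph visible: every trial is a real model invocation, so a 20-trial eval of one Skill is 20× the price of one run, which is exactly why the expense is hard to engineer away.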
Economic plumbing. This is, to me, the most glaring gap. Consider Anthropic’s Cowork plugins. They are a demonstration of exactly the pattern I’ve been describing. Anthropic has built a private plugin marketplace where enterprises can deploy Skills across their organizations, with sector-specific templates for everything from investment banking to HR to legal. It’s tacit knowledge made executable, delivered at enterprise scale. But there is no mechanism for the domain experts whose knowledge and judgment make plugins valuable to get paid for them if they want to offer their domain knowledge to other organizations in a wider marketplace. Some might choose to offer their work for free, and that’s fine. But having that as the only option is a problem. It’s the same pattern we’ve seen before: the platform captures the value while the knowledge creators are treated as a resource to be extracted. If the AI labs genuinely believed in a future where AI extends the human knowledge economy rather than replacing it, they would be building payment rails alongside the plugin architecture. The fact that they aren’t tells you something about their actual theory of value.
The bigger picture
What I’m starting to see are the first halting steps toward a new software ecosystem where the “programs” are mixtures of natural language and code, the “runtime” is a large language model, and the “users” are AI agents as much as humans. When a new ecosystem is being born, so much of the infrastructure that will eventually be needed doesn’t exist yet. The early web needed search engines, payment systems, identity standards, and trust mechanisms before it could become a real economy. The same is true here.
Skills, superpowers, and knowledge plugins might represent the first practical mechanism for making tacit knowledge (the “how we actually do things” that lives in workflows, heuristics, and professional judgment) accessible to computational agents. But there is a big gap in providing the mechanisms that will turn this into a true market. The questions of who controls that pipeline, how value flows through it, and what governance structures emerge around it are going to matter.
Who’s working on this?
This is where I need your help. I’m looking to connect with people and projects working on:
- Skill marketplaces and discovery. Who’s building the infrastructure for agents to find, evaluate, and potentially pay for Skills?
- Composability patterns. Who’s thinking about how Skills (or MCP servers or agent tools) chain together reliably?
- Protection models. Who’s working on ways to share workflow intelligence without giving away the store?
- Quality and evaluation. Who’s building testing frameworks for this new kind of software?
- Attribution and compensation. Who’s designing the economic plumbing so that knowledge creators get paid?
- Security models. Who’s doing the best work in agentic security?
If you’re working on any of this, or if you know someone who is, I want to hear from you. We’re at that stage where the pattern is clear but the ecosystem is still being born, and the decisions made now will shape how this all plays out.
The future of software isn’t just code. It’s knowledge, packaged for machines, traded between agents, and, if we get the infrastructure right, creating value that flows back to the humans whose expertise and unique perspectives make it all work.
Thanks to Andrew Odewahn, Angie Jones, Claude Opus 4.6, James Cham, Jeff Weinstein, Jonathan Hassell, Matt Beane, Mike Loukides, Peyton Joyce, Sruly Rosenblat, Steve Yegge, and Tadas Antanavicius for comments on drafts of this piece. You made it much stronger with your insights and objections.