The Self-Improving Company: AI Loops, Not AI Copilots

YC Root Access — Tom Blomfield — Video ID: t-G67yKAHBQ — May 19, 2026


The most important AI-native company will not be a normal hierarchy with copilots sprinkled across every function. It will be a set of recursive learning loops: sensing reality, deciding what to do, taking action through tools, checking quality, and folding the results back into the system so the company improves while people sleep.

That is the shift Tom Blomfield is pushing founders to see. The old company was built like a Roman legion: information flows up through named humans, decisions flow down through spans of control, and management exists because coordination is expensive. AI breaks that assumption. Once a company’s domain knowledge is legible to models, the hierarchy stops being the only way to move intelligence through the organization.

Who Is Tom Blomfield?

Tom Blomfield is a YC General Partner and the co-founder of Monzo, one of the best-known consumer fintech companies in the UK. At YC, he works closely with early-stage founders and sees, in real time, how AI-native companies are changing the relationship between headcount, software, operations, and growth.

His credibility here is practical rather than theoretical. The examples come from live YC internal systems: agents querying YC’s database, monitoring failed queries, opening merge requests, regenerating the YC user manual from recorded office hours, and turning institutional knowledge into an always-updating company brain.

The Roman Legion Is the Wrong Shape for an AI Company

Blomfield starts with an organizational metaphor. Roman legions were designed to project power from Rome across vast territory. They relied on nested hierarchies, consistent spans of control, and named individuals responsible for passing orders down and sending information back up.

Most companies still look like that. Human beings are the conduit. A customer signal enters support, gets summarized by a manager, escalated to product, discussed in planning, translated into engineering work, reviewed, shipped, and then measured. Every layer exists because information needs a carrier.

AI changes the carrier. The company no longer needs to treat humans as the only medium for business knowledge. If the important context is captured, structured, synthesized, and made available to agents, the company can begin acting through software loops rather than managerial chains.

“AI isn’t something you bolt onto the side of a company. It’s not a tool you give to your engineers to make them more productive.”

The copilot frame is too small. Making engineers 20% more productive is useful, but it is just the old operating model with a stronger engine. The deeper opportunity is to redesign the company itself.

The Recursive Self-Improving AI Loop

The core framework is a recursive AI loop. Blomfield breaks it into five parts:

Layer | What it does | Example
--- | --- | ---
Sensor layer | Captures information from the outside world. | Customer emails, support tickets, code changes, cancellations, product telemetry.
Policy / decision layer | Defines what the system may do, what needs permission, and what must be logged. | Rules for safe automation, escalation, customer communication, or deployment.
Tool layer | Gives the AI deterministic ways to act. | Database queries, calendar access, APIs, code tools, internal workflows.
Quality gate | Checks whether outputs are safe and good enough. | Evals, safety filters, tests, human review for high-risk actions.
Learning mechanism | Feeds failures and outcomes back into the system. | Updating skills, adding tools, changing indexes, creating new database views.

When every step can run with minimal human intervention, the system gets better each time it touches reality. This is not an agent as a sidekick. It is an organization that learns from every interaction.
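To make the five parts concrete, here is a minimal Python sketch of one pass through such a loop. It is an illustration under stated assumptions, not YC's implementation: the `Loop` class, the signal format, and the action names are all hypothetical, and the quality gate is reduced to a trivial non-empty check where a real system would run evals or tests.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Loop:
    """Hypothetical five-part loop: sensor, policy, tool, quality gate, learning."""
    allowed_actions: set[str]                          # policy layer: what may run unattended
    tools: dict[str, Callable[[str], str]]             # tool layer: deterministic actions
    lessons: list[str] = field(default_factory=list)   # learning mechanism: recorded gaps

    def handle(self, signal: dict) -> str:
        action = signal["action"]                      # sensor layer hands over a structured signal
        if action not in self.allowed_actions:         # policy: anything not pre-approved escalates
            return "escalated to human"
        tool = self.tools.get(action)
        if tool is None:                               # permitted but no tool exists yet
            self.lessons.append(f"missing tool: {action}")
            return "failed"
        output = tool(signal["payload"])
        if not output.strip():                         # quality gate (assumed: non-empty output)
            self.lessons.append(f"empty output: {action}")
            return "failed"
        return output


loop = Loop(
    allowed_actions={"lookup", "summarize"},
    tools={"lookup": lambda query: f"result for {query}"},
)
print(loop.handle({"action": "lookup", "payload": "last office hours"}))
print(loop.handle({"action": "summarize", "payload": "ticket 1"}))   # failed
print(loop.handle({"action": "refund", "payload": "order 9"}))       # escalated to human
print(loop.lessons)                                                  # ['missing tool: summarize']
```

The `lessons` list is the part that matters. Each failure leaves a record of what was missing, which is the raw material for the learning mechanism in the last row of the table.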

The Holy-Shit Moment at YC

YC began with a straightforward internal agent. A partner could ask when they last had office hours with a company. Then it became more useful: if a company needed introductions to people in petrochemicals, the agent could query YC’s database, use retrieval, and suggest relevant founders.

That still fit the old model: AI as a productivity assistant, making a group partner 20% or 30% more effective.

The breakthrough came when YC put a monitoring agent on top of the system. The monitor watched every query YC employees made. It identified when the agent succeeded and when it failed. When it failed, the monitor asked why: Did the system need a new deterministic tool? A better skills file? A new database view? A new index?

Then the system could write code, open a merge request against the YC codebase, have an agent review it, merge it, and deploy it. The next morning, when a human asked the same kind of query, it worked.
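The monitor's diagnostic step can be sketched as a simple aggregation over query logs. This is an assumption-heavy illustration rather than YC's system: the `QueryLog` record, the failure-reason labels, and the `REMEDIATIONS` mapping are hypothetical, and the sketch assumes each failure has already been labeled upstream (for example by a model reading the trace).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from a failure label to the four remediation types in the text.
REMEDIATIONS = {
    "no_tool": "add a deterministic tool",
    "bad_instructions": "update the skills file",
    "missing_join": "create a database view",
    "slow_query": "add an index",
}


@dataclass
class QueryLog:
    question: str
    succeeded: bool
    failure_reason: Optional[str] = None  # assumed to be labeled upstream


def propose_fixes(logs: list[QueryLog]) -> list[tuple[str, int]]:
    """Count failed queries by remediation type, most frequent first."""
    counts = Counter(
        REMEDIATIONS.get(log.failure_reason, "needs human triage")
        for log in logs
        if not log.succeeded
    )
    return counts.most_common()


logs = [
    QueryLog("When did we last meet company X?", True),
    QueryLog("Founders in petrochemicals?", False, "slow_query"),
    QueryLog("Intros for batch Y?", False, "slow_query"),
    QueryLog("Calendar for next week?", False, "no_tool"),
]
print(propose_fixes(logs))
```

Ranking by frequency gives the code-writing agent an ordered backlog. Unrecognized failure reasons fall through to human triage instead of being guessed at.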

“That’s not just AI making you 20 or 30% more valuable. It is the AI going through this loop to figure out how to self-improve.”

That is the difference between adding AI to a company and building an AI-native company. The first improves the worker. The second improves the system that all workers use.

Self-Optimizing Product and Support

The same loop can apply outside internal knowledge tools. A product analytics agent can identify the point of highest friction in a sales funnel, research best practices, create an A/B test, run it for a week, pick the winner, deploy it, and repeat. The product becomes a self-optimizing loop.
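The "pick the winner" step is where an unattended loop can go wrong, so it helps to see one way to gate it. The sketch below uses a standard two-proportion z-test. The talk does not say how winners are chosen, so this test, the function name `pick_winner`, and the 0.05 threshold are assumptions made for illustration.

```python
import math


def pick_winner(conv_a: int, n_a: int, conv_b: int, n_b: int, alpha: float = 0.05) -> str:
    """Return 'A', 'B', or 'no decision' using a two-sided two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:                                   # no variation at all: nothing to decide
        return "no decision"
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))    # two-sided p-value under a normal approximation
    if p_value >= alpha:
        return "no decision"                      # keep the test running, or drop the variant
    return "B" if z > 0 else "A"


print(pick_winner(100, 1000, 150, 1000))  # B
print(pick_winner(100, 1000, 105, 1000))  # no decision
```

Returning "no decision" instead of forcing a choice keeps the loop from deploying noise, which matters more when nobody is watching each experiment.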

Customer service can work similarly. Suggestions come in continuously, and an agent acting as a product and technology judgment layer triages them: it ignores the ideas that do not fit the roadmap, identifies those that do, writes the code, deploys it, and notifies the customer. The human stays in a supervisory or policy-setting role rather than manually coordinating every step.

The important question becomes: which parts of the company can be turned into loops where tokens replace coordination labor?

Burn Tokens, Not Headcount

Blomfield’s operating implication is blunt: burn tokens, not headcount. YC is seeing companies reach Demo Day with roughly five times more revenue per employee than companies did 18 months earlier. He expects that pattern to continue into Series A and Series B.

The constraint shifts. Instead of asking how many people a company needs, founders will ask how much model usage, context, automation, and supervision they can afford. Measuring individual token usage is a crude and gameable proxy, but Blomfield thinks it points in the right direction. In the current phase, founders should be looking for the people who are “token maxing”—pushing hard on the frontier of what the new intelligence can do.

That does not mean promoting or firing people mechanically based on usage dashboards. It means recognizing that experimentation intensity is now a strategic signal. The people exploring the boundary will discover the new operating model first.

Middle Management Is Over

The Roman-legion company needs middle management because information and coordination are scarce. The AI-native company should not. Blomfield argues that two roles matter most: individual contributors who build or operate, and directly responsible individuals who own outcomes.

Anything important needs a named human, not a committee. But that human does not need a stack of managers whose primary job is to pass status around. AI should increasingly handle the coordination problem.

“I think everyone just has to be an IC now, a builder, an operator.”

This is not a call for chaos. It is a call for accountability plus leverage. The human owns judgment, context, and responsibility. The AI handles more of the summarization, routing, generation, checking, and repeated operational work.

Make the Entire Organization Legible to AI

The prerequisite for all of this is legibility. Blomfield is emphatic: record everything. At YC, partner emails go into the YC database. Slack messages, DMs, and office hours are increasingly recorded. His rule is simple: if it was recorded, then as far as the AI is concerned it happened. If it was not recorded, then for the company’s intelligence it never happened.

That is a radical change in how founders should think about operational memory. A great customer conversation that lives only in someone’s head is almost wasted. A founder asking for an introduction in a hallway is not part of the system unless it is captured. The company brain can only learn from the reality it can see.

Raw recording is not enough. The system also needs diarization, aggregation, synthesis, categorization, and breadcrumbs. You cannot throw 100,000 hours of audio into a context window. The recordings have to be compressed into useful knowledge that agents can retrieve and act on.

The YC User Manual as a Living Brain

Blomfield’s clearest concrete example is the YC user manual. Much of it was written five to ten years ago and had grown stale. After YC accumulated roughly 2,000 hours of recorded office hours over three months, Haj used that material to regenerate the manual.

The process was straightforward: categorize the material into areas like fundraising, hiring, and co-founder disputes; synthesize the advice; and write a new manual. Within a single weekend, YC had a 150-page user manual that Blomfield says was dramatically better than the old one.

The deeper point is not the one-time rewrite. The manual can now update every month. Every new piece of advice can be compared against the existing manual and either incorporated or discarded. It becomes a living brain of YC’s founder advice. Pump that into an AI agent, and a founder can query the combined wisdom of 16 YC partners—if the underlying knowledge was made legible first.
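The incorporate-or-discard step can be sketched in a few lines of Python. This is a deliberately crude stand-in: a real system would presumably ask a model whether new advice adds anything, whereas this sketch uses string similarity from the standard library's `difflib`. The `merge_advice` function, the manual's dictionary shape, and the 0.8 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def merge_advice(manual: dict[str, list[str]], topic: str, advice: str,
                 threshold: float = 0.8) -> bool:
    """Add advice under a topic unless a near-duplicate is already there.

    Returns True if the advice was incorporated, False if it was discarded.
    """
    entries = manual.setdefault(topic, [])
    for existing in entries:
        similarity = SequenceMatcher(None, existing.lower(), advice.lower()).ratio()
        if similarity >= threshold:       # close enough to existing advice: discard
            return False
    entries.append(advice)                # new information: incorporate
    return True


manual = {"fundraising": ["Raise only what you need for 18 months."]}
print(merge_advice(manual, "fundraising", "raise only what you need for 18 months."))  # False
print(merge_advice(manual, "hiring", "Hire slowly before product-market fit."))        # True
```

Because the function mutates the manual in place and reports what it did, it can run on every new recording, and the log of True and False results shows how much each month's office hours changed the manual.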

Data Is Precious; Software Is Ephemeral

Blomfield’s view of internal software is refreshingly unsentimental. Codeex 55 is now good enough, he says, to one-shot many simple internal dashboards and workflows to a high level of quality. Internal operations teams should sit on a layer of intelligence and generate the dashboards or tools they need on demand.

The data, emails, DMs, skills, and know-how are precious. The software on top is disposable. Save the context carefully, but do not over-preserve the interface. Models will improve in a month or two. Throw the software away, give the model the original instructions and business context, and regenerate it.

This is one of the most founder-relevant implications. Historically, companies accumulated internal tools as sediment. In an AI-native company, the durable asset is not the dashboard. It is the context that lets you recreate a better dashboard whenever you need it.

What Humans Are For

The picture is not human-free. Blomfield describes the company brain in the middle: all the data, emails, DMs, skills, and operational know-how. Humans sit around the edge, where the intelligence makes contact with reality.

Humans are still needed for places models cannot fully go: novel situations, ethical considerations, high-stakes moments, emotionally complex founder conversations, and sales conversations where trust and context matter. The human role becomes less about moving information through the hierarchy and more about touching the world, exercising judgment, and deciding what loops should exist.

Key Lessons

Why This Matters for Diffie

For Anand and Diffie, this is close to the center of the product opportunity. Diffie already sits inside a natural recursive loop: observe frontend changes, run browser-level checks, detect broken user journeys, explain the failure, suggest or generate a fix, verify the fix, and learn from every false positive, false negative, and accepted suggestion.

The strategic move is to present Diffie not merely as “AI browser testing,” but as a self-improving quality loop for frontend teams. The current market mental model is still copilot-shaped: help an engineer write tests faster, or help QA automate a workflow. Blomfield’s framing suggests a stronger category: Diffie becomes the sensor, policy, tool, quality gate, and learning mechanism for UI reliability.

A useful product frame: every PR teaches Diffie. Every failed journey, approved visual diff, ignored alert, flaky check, customer-reported bug, and production regression should improve the next run. The value is not just catching today’s issue; it is building the frontend team’s living quality brain.
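One way to picture "every PR teaches Diffie" is a per-check score that moves with each piece of feedback. The sketch below is purely illustrative and does not describe Diffie's product or API: the event names, weights, and `CheckScores` class are hypothetical.

```python
from collections import defaultdict

# Hypothetical feedback weights; event names are illustrative only.
FEEDBACK_WEIGHTS = {
    "confirmed_regression": 1.0,   # the check caught a real bug
    "accepted_fix": 0.5,           # a suggested fix was merged
    "ignored_alert": -0.5,         # the team dismissed the alert
    "flaky": -1.0,                 # the check failed without a real cause
}


class CheckScores:
    """Running usefulness score for each browser check."""

    def __init__(self) -> None:
        self.scores: dict[str, float] = defaultdict(float)

    def record(self, check_id: str, event: str) -> None:
        self.scores[check_id] += FEEDBACK_WEIGHTS[event]

    def ranked(self) -> list[str]:
        """Checks from most to least useful, ties broken by name."""
        return sorted(self.scores, key=lambda c: (-self.scores[c], c))

    def needs_review(self, floor: float = -1.0) -> list[str]:
        """Checks whose score has dropped to the floor or below."""
        return sorted(c for c, s in self.scores.items() if s <= floor)


scores = CheckScores()
scores.record("checkout", "confirmed_regression")
scores.record("checkout", "accepted_fix")
scores.record("banner", "ignored_alert")
scores.record("banner", "flaky")
print(scores.ranked())        # ['checkout', 'banner']
print(scores.needs_review())  # ['banner']
```

Even this simple version shows the shape of the loop: the sensor input is the team's reaction to each alert, and the learning mechanism is whatever changes the next run because of it.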

That has direct GTM implications. ICP discovery should look for companies where frontend knowledge is trapped in humans and rituals: senior engineers who “just know” what to test, QA leads who carry release risk in their heads, support tickets that never become regression checks, and design-system changes that break downstream flows. Those companies are still operating like Roman legions. Diffie can give them a loop.

Outbound should therefore be concrete and loop-oriented. Instead of pitching generic AI testing, name the missing loop: “Your support tickets mention checkout regressions, but they do not appear to become browser checks,” or “Your design-system velocity is high, but your PR process does not seem to preserve journey-level UI knowledge.” The sharper the observed sensor input, the more credible the automation story.

Internally, Diffie should dogfood the same principle. Record customer calls, support interactions, failed demos, onboarding friction, and product usage. Synthesize them into a living ICP and objections manual. Regenerate sales collateral, onboarding flows, and test heuristics from that knowledge. Preserve the data and context; let the internal tools be disposable.

The company that wins AI browser testing will likely be the one whose product improves while customers sleep. That is the bar Blomfield is setting: not a smarter test runner, but a self-improving company brain for frontend quality.