
The Physics of Flow

Published:  at  10:00 AM
The Firm Under AI

Rethinking corporations, platforms, and power when intelligence becomes infrastructure

13 of 13


In the previous post we introduced the protocol layer — the coordination mechanism that allows networked firms to govern interactions without hierarchy.

Protocols define the rules. APIs, workflows, event systems, validation logic.

But rules alone do not explain what happens when work actually moves through the network.

Work arrives. It waits. It gets processed. It moves on.

Sometimes it flows smoothly.

Often it does not.

The protocol layer defines the rules of interaction. Queuing theory describes what happens when those interactions execute under load.

To understand why organisations slow down as they scale — why adding people doesn’t proportionally increase output, why some firms feel fast while others feel stuck — we need to look at the physics underneath.

That physics is queuing theory.


Erlang’s Problem

In 1909, a Danish mathematician named Agner Krarup Erlang was asked a simple question by the Copenhagen Telephone Exchange:

How many operators do we need to handle calls without unacceptable delays?

The question seems straightforward. More calls, more operators.

But Erlang discovered something non-obvious.

The relationship between capacity, demand, and waiting time is not linear. It is governed by probability distributions and exponential dynamics that produce deeply counterintuitive behaviour.

His work founded an entire field — queuing theory — and the insights apply far beyond telephone exchanges.

Every organisation that routes work through people, teams, or systems faces Erlang’s problem.


Three Laws of Flow

Queuing theory gives us three foundational results. Each is simple to state. Together they explain most of what goes wrong in organisational coordination.

Little’s Law

Little’s Law states that L = λ × W, where:

  • L = average number of items in the system (work in progress)
  • λ = average arrival rate (equal to throughput in a stable system)
  • W = average time an item spends in the system (lead time)

Rearranged: W = L / λ

Lead time equals work in progress divided by throughput.

This means that if you want to reduce lead time, you have two options: increase throughput or reduce work in progress.

Most organisations try to increase throughput.

The more effective lever is usually reducing WIP.
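Little's Law is small enough to execute. A minimal sketch, with illustrative WIP and throughput numbers that are not from the post:

```python
def lead_time(wip, throughput):
    """Little's Law rearranged: W = L / lambda."""
    return wip / throughput

# Illustrative team: 30 items in flight, finishing 10 per week.
print(lead_time(30, 10))  # 3.0 weeks

# Halving WIP halves lead time without anyone working faster.
print(lead_time(15, 10))  # 1.5 weeks
```

Note that nothing in the second call changed how fast anyone works; only the amount of work in progress fell.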

Kingman’s Formula (The VUT Equation)

Kingman’s approximation says wait time ≈ V × U × T, where:

  • V = variability factor (how unpredictable arrivals and processing are)
  • U = utilisation factor = ρ / (1 - ρ), where ρ is utilisation
  • T = mean processing time

The critical term is U.

As utilisation (ρ) approaches 1, the denominator approaches zero.

Wait time does not grow linearly with utilisation.

It grows hyperbolically.

The Utilisation Curve

This is the single most important diagram in organisational design.

Utilisation   Wait Time Multiplier
50%           1×
70%           2.3×
80%           4×
85%           5.7×
90%           9×
95%           19×
99%           99×

At 50% utilisation, the system is responsive.

At 80%, wait times have quadrupled.

At 95%, they have increased nineteen-fold.

At 100%, the queue grows without bound.

The utilisation trap: the pursuit of full efficiency destroys responsiveness.
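The whole curve falls out of the U term, ρ / (1 - ρ). A short sketch that regenerates the table:

```python
def wait_multiplier(rho):
    """Kingman's U term: rho / (1 - rho), the number of multiples
    of processing time an item spends waiting."""
    if rho >= 1:
        raise ValueError("at 100% utilisation the queue grows without bound")
    return rho / (1 - rho)

for rho in (0.50, 0.70, 0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{rho:.0%} utilised -> waits {wait_multiplier(rho):.1f}x processing time")
```

The hyperbolic shape is visible in the output: each step toward full utilisation costs more than the last.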


The Invisible Queue

In a factory, queues are visible. You can see inventory piling up on the floor.

In knowledge work, queues are invisible.

They hide inside:

  • ticket backlogs
  • email inboxes
  • review queues
  • approval chains
  • dependency requests
  • meeting calendars

Donald Reinertsen, who systematically applied queuing theory to product development, found something striking:

“98 percent of product developers do not know the size of the queues in their development processes.”

Most organisations operate at 5-10% Process Cycle Efficiency — meaning 90-95% of elapsed time is spent waiting, not working.

A feature requiring 106 hours of actual work can take 38 weeks to deliver.

The work itself is not slow.

The waiting is.


Every Boundary Is a Queue

This is where queuing theory connects directly to organisational structure.

Every time work crosses a boundary — between teams, between systems, between approval layers — it enters a queue.

Each queue adds wait time.

Each wait time is governed by the utilisation curve.

If Team B is running at 90% utilisation, work from Team A waits, on average, nine times as long as the actual processing time.

Three hand-offs through teams each at 85% utilisation create a total wait of roughly 17× the processing time.

Organisational boundaries are not just coordination costs. They are queuing systems with exponential penalty functions.
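The hand-off arithmetic can be sketched directly, assuming for simplicity that each stage has the same processing time and that waits add in series:

```python
def series_wait_multiplier(utilisations):
    """Total wait across hand-offs in series, in multiples of one
    stage's processing time: each boundary adds rho / (1 - rho)."""
    return sum(rho / (1 - rho) for rho in utilisations)

# Three hand-offs through teams each running at 85% utilisation:
print(round(series_wait_multiplier([0.85, 0.85, 0.85])))  # -> 17
```

Every boundary added to the chain appends another ρ / (1 - ρ) term to the sum.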

This reframes Brooks’s Law. Adding people to a project doesn’t just add communication overhead. It adds hand-off queues. Each new dependency is a new queue. Each new queue multiplies wait time non-linearly.


The Hierarchy as a Queuing Network

In Post 9 we described the traditional firm as a tree. In Post 10 we showed how coordination density determines organisational performance.

Queuing theory reveals what that tree actually looks like from a flow perspective.

Every manager is a single-server queue.

Requests from multiple reports compete for the manager’s attention. The manager processes them one at a time. When the manager is busy, requests wait.

The deeper the hierarchy, the more queues in series.

Each queue multiplies latency.

This is why hierarchical organisations feel slow even when individual managers work hard. The managers are not the problem. The queuing network is.


The Constraint

Eliyahu Goldratt, in The Goal (1984), identified a complementary principle.

Every system has at least one constraint that limits its throughput.

His Theory of Constraints offers a powerful lens:

  1. Identify the constraint
  2. Exploit it — ensure the constraint is never idle
  3. Subordinate everything else — align all other work to the constraint’s rhythm
  4. Elevate the constraint — invest in increasing its capacity
  5. Repeat — find the new constraint

The critical insight: optimising anything other than the constraint produces no system-level improvement.

In queuing terms, adding capacity to a non-bottleneck queue does not reduce total cycle time. It just moves work faster to the next queue — where it waits.
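Goldratt's point can be made concrete with a toy pipeline; the stage names and capacities below are hypothetical:

```python
def system_throughput(capacities):
    """A pipeline's throughput is set by its slowest stage: the constraint."""
    return min(capacities)

# Hypothetical pipeline capacities, in items per week.
stages = {"design": 12, "build": 20, "review": 8, "deploy": 30}
print(system_throughput(stages.values()))  # 8: review is the constraint

# Doubling a non-constraint changes nothing at the system level.
stages["build"] = 40
print(system_throughput(stages.values()))  # still 8

# Elevating the constraint does.
stages["review"] = 16
print(system_throughput(stages.values()))  # 12: design is the new constraint
```

The min() is the whole theory in one function: investment anywhere other than the argmin is invisible at the system level.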

This explains something organisations experience constantly.

A team is restructured. New people are hired. Processes are optimised.

And nothing gets faster.

Because the constraint was elsewhere.


Cost of Delay

Reinertsen introduced another concept that bridges queuing theory to economics.

Cost of Delay — the economic impact of time spent waiting.

If a feature generates €100,000 per month in revenue, every month of delay costs €100,000.

Queues have economic weight.

The invisible queue is not just a coordination problem. It is an economic problem.

“Cost of Delay is the golden key that unlocks many doors.” — Reinertsen
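A rough sketch of that economic weight, using the post's €100,000/month figure and the 106-hour / 38-week example; the 40-hour week is an assumption:

```python
WEEKS_PER_MONTH = 52 / 12

def cost_of_delay(value_per_month, weeks_waiting):
    """Economic impact of the time an item spends queued rather than worked on."""
    return value_per_month * weeks_waiting / WEEKS_PER_MONTH

# The post's feature: EUR 100,000/month, 38 weeks elapsed, ~106 hours of work.
work_weeks = 106 / 40          # assuming a 40-hour week: ~2.7 weeks of effort
queue_weeks = 38 - work_weeks  # everything else is waiting
print(round(cost_of_delay(100_000, queue_weeks)))  # roughly EUR 816,000
```

Almost all of the delay cost is attributable to queue time, not work time.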

This connects directly to Coase’s transaction costs from Post 1. Transaction costs are not just the direct cost of negotiation and contracting. They include the queuing cost — the time spent waiting for transactions to be processed.

When Coase asked why firms exist, part of the answer is that firms can reduce queuing costs by internalising coordination. But as we have seen, firms create their own internal queues — approval chains, hand-offs, dependency waits.

The Coasean boundary is partly a queuing boundary.


Protocols as Queue Disciplines

In the previous post we described protocols as standardised patterns of interaction.

Queuing theory reveals what protocols actually do at a deeper level.

A queue discipline defines:

  • how arriving work is prioritised
  • what service guarantees exist
  • how capacity is allocated
  • how variability is managed

Protocols are queue disciplines.

An API contract defines arrival patterns and service expectations. A workflow engine defines routing and priority. An event system decouples producers and consumers, buffering variability.

Without protocols, all requests compete for a single queue.

With protocols, the interaction is structured. Priority is defined. Load is distributed. Variability is buffered.

In queuing terms, protocols reduce V (variability) and manage U (utilisation) — the two factors that drive wait times.

Well-designed protocols are well-designed queuing systems. Poorly designed protocols create invisible queues.


What AI Changes

AI agents change the queuing dynamics of the networked firm. But not in the way most people expect.

AI Reduces Service Time

The most obvious effect. AI processes certain work items orders of magnitude faster than humans.

Customer support first-response: from 6+ hours to under 4 minutes.

Code review: from days to minutes.

Document analysis: from hours to seconds.

In queuing terms, AI dramatically reduces T — the mean processing time.

This should reduce utilisation and therefore wait times.

AI Increases Arrival Rate

But AI also generates work.

AI agents create tickets, raise issues, produce code for review, generate documents that need approval.

In queuing terms, AI increases λ — the arrival rate.

If service time falls but arrival rate rises proportionally, utilisation remains unchanged.

The queue doesn’t shrink. It just processes faster — and fills faster.
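In symbols, ρ = λ × T, so a proportional shift in both leaves utilisation, and therefore the wait multiplier, untouched. A sketch with illustrative numbers:

```python
def utilisation(arrival_rate, mean_service_time):
    """Single-server utilisation: rho = lambda * T."""
    return arrival_rate * mean_service_time

# Hypothetical queue before AI: 8 items/day, each taking 0.11 days.
before = utilisation(8, 0.11)

# AI cuts service time 10x, but AI-generated work raises arrivals 10x.
after = utilisation(80, 0.011)

print(before, after)  # same utilisation, so the same wait multiplier
```

Faster service only shrinks the queue if arrivals do not grow to match it.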

AI Faces Brooks’s Law

Here is where the queuing lens becomes critical for the future of AI-augmented organisations.

Recent research from DeepMind reveals something that should give every AI strategist pause.

Adding more AI agents to a system often degrades overall performance.

Their analysis produced a formula:

Net Performance = (Individual Capability + Collaboration Benefits) − (Coordination Chaos + Communication Overhead + Tool Complexity)

The results are stark:

  • Claude’s performance dropped 35% in multi-agent setups: coordination overhead overwhelms capability gains.
  • Error amplification of 17.2× with independent agent voting: errors compound across agent boundaries.
  • Once a single agent reaches 45% accuracy, adding agents provides negative returns: more agents make the system worse, not better.
  • 80% of production systems use human-designed control flow: autonomous agent coordination remains unreliable.
  • 68% of deployments limit agents to 10 steps or fewer: longer agent chains collapse under coordination cost.

This is Brooks’s Law operating at machine speed.

Each additional agent creates new communication pathways. Each pathway is a potential queue. Each queue is subject to the utilisation curve.

The single-agent path has one queue. The multi-agent path has four — including the human review bottleneck that becomes the system’s constraint.

The Coordination Tax

The DeepMind findings reveal a pattern that maps precisely onto what we have seen with human organisations.

When agents operate independently on well-defined tasks with clear protocols, they perform well. The queue is simple: request in, result out.

When agents must coordinate — sharing context, resolving conflicts, building on each other’s outputs — performance degrades rapidly.

This is not a temporary limitation that better models will solve. It is a structural property of networked systems. The same queuing dynamics that make hierarchies slow and that cause Brooks’s Law in software teams apply to networks of AI agents.

The implication is profound.

The bottleneck in AI-augmented organisations is not intelligence. It is coordination.

The organisations currently deploying AI agents most successfully are not those with the most agents. They are those with the clearest protocols governing how agents interact — well-defined interfaces, structured hand-offs, explicit contracts for input and output.

In other words: they have designed good queue disciplines for their agent networks.

Digital Congestion

There is a second-order effect that few organisations anticipate.

AI agents are not just workers. They are also producers of work.

An AI code reviewer generates review comments that humans must process. An AI analyst generates reports that humans must evaluate. An AI agent that identifies issues creates tickets that humans must triage.

Each of these outputs enters a downstream queue — usually a human queue.

When AI accelerates production without proportionally accelerating consumption, the result is digital congestion: AI-generated outputs piling up in human review queues, overwhelming the very people the AI was meant to help.

Four agents feeding one human reviewer. The human becomes the constraint. Their utilisation spikes toward 100%. Wait times explode.
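A back-of-envelope sketch of that reviewer bottleneck; the arrival and review rates below are hypothetical:

```python
def reviewer_utilisation(n_agents, items_per_agent_per_hour, review_minutes):
    """Utilisation of one human reviewer fed by n agents:
    rho = arrival rate / service rate."""
    arrival_rate = n_agents * items_per_agent_per_hour  # items/hour in
    service_rate = 60 / review_minutes                  # items/hour out
    return arrival_rate / service_rate

# Hypothetical numbers: four agents, 2.8 items each per hour, 5-minute reviews.
rho = reviewer_utilisation(4, 2.8, 5)
print(f"utilisation {rho:.0%}, wait multiplier {rho / (1 - rho):.1f}x")

# Automated pre-filtering that screens out half the items restores slack.
rho = reviewer_utilisation(4, 1.4, 5)
print(f"utilisation {rho:.0%}, wait multiplier {rho / (1 - rho):.1f}x")
```

Halving the arrival rate drops the reviewer from the steep part of the utilisation curve back into the responsive zone.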

The solution is not to remove the human. It is to redesign the protocol — batching, prioritisation, automated pre-filtering, structured escalation — so that the human queue operates below the utilisation threshold.

This is a protocol design problem. And it points toward the emerging standards for agent-to-agent and agent-to-human coordination that will define the next phase of organisational design.


The Capacity Paradox

This creates a paradox for organisations adopting AI.

AI promises to increase capacity. And it does — for individual tasks.

But the networked firm is not a collection of independent tasks. It is a queuing network. And in a queuing network, local optimisation does not guarantee global improvement.

Adding AI capacity at a non-constraint creates no system-level benefit (Goldratt).

Adding AI that generates work faster than downstream queues can process it creates congestion.

Deploying multiple AI agents without coordination protocols recreates Brooks’s Law at machine speed.

The question is no longer “how many agents can we deploy?” It is “what coordination protocols govern how agents interact — with each other, with humans, and with systems?”

The organisations that benefit most from AI will not be those that deploy the most agents.

They will be those that design agent coordination as a queuing problem — with explicit protocols, managed utilisation, and the slack required to keep the network responsive.


Slack Is Not Waste

Traditional management treats idle capacity as waste.

Queuing theory shows the opposite.

Slack — spare capacity — is the operational margin that keeps the system responsive.

At 80% utilisation, the system still functions. At 95%, it collapses.

That 15% difference is not waste. It is the buffer that absorbs variability, handles unexpected demand, and prevents queue build-up from cascading through the network.

Slack enables:

  • absorbing unexpected demand
  • responding to incidents
  • learning and innovation
  • preventing cascading queue failures

Google’s 20% time was not a perk. It was, whether intentionally or not, a queuing strategy — keeping utilisation below the threshold where wait times explode.

The most responsive organisations are not the busiest. They are the ones with enough slack to absorb variability without collapsing.


The Three Laws, Revisited

In Post 10 we introduced three laws governing the networked firm.

Queuing theory provides the underlying mechanics.

Law        What It Describes          Queuing Mechanism
Metcalfe   Value of connections       More connections = more potential queues
Brooks     Cost of connections        More hand-offs = more queues in series
Conway     Structure of connections   Org structure determines where queues form

Coordination density — the ratio of productive interactions to coordination cost — is fundamentally a queuing metric.

High coordination density means interactions flow through low-utilisation, low-variability queues.

Low coordination density means interactions stack up in overloaded, high-variability queues.

The protocol layer governs these dynamics. The question is whether it governs them well.


From Physics to Mapping

Queuing theory gives us the physics.

But physics alone does not tell you where to look.

To apply these insights, you need to see the queues. You need to trace the flow of work from request to outcome, identify the boundaries where queues form, measure the waiting, and find the constraints.

That practice — mapping the actual flow of value through the organisation — is value stream mapping.

And beyond mapping lies an even deeper question. If agent coordination is a queuing problem, and if the protocol layer governs those queues, then the design of agent-to-agent protocols becomes the central challenge of the AI-augmented firm. How agents discover each other, negotiate capabilities, structure hand-offs, and manage shared state — these are not implementation details. They are the organisational architecture of the networked firm.

The physics of flow tells us what will happen if we get this wrong. The protocol layer tells us how to get it right.


References & Intellectual Lineage

  • Erlang, A.K. (1909). “The Theory of Probabilities and Telephone Conversations.” — founded queuing theory.
  • Little, J.D.C. (1961). “A Proof for the Queuing Formula: L = λW.”
  • Kingman, J.F.C. (1961). “The single server queue in heavy traffic.” — the VUT equation.
  • Goldratt, E.M. (1984). The Goal. — Theory of Constraints.
  • Brooks, F.P. (1975). The Mythical Man-Month.
  • Reinertsen, D.G. (2009). The Principles of Product Development Flow. — queuing theory applied to product development, Cost of Delay.
  • Forsgren, N., Humble, J., Kim, G. (2018). Accelerate. — DORA metrics and flow measurement.
  • Coase, R. (1937). “The Nature of the Firm.”
  • Post 10 in this series: Metcalfe’s Law, Conway’s Law, and the Networked Firm.
  • Post 12 in this series: The Protocol Layer.