• Metrics aren't lossy compression — they're irreversible projections. You choose which dimensions of reality to preserve, and you can't get the rest back
• The design question isn't 'how much to compress' but 'which subspace to project onto' — different projections make different decisions possible
• Agents optimise the projection, not the underlying reality — Goodhart's Law at machine scale
• As systems become more autonomous, the choice of projection matters more than model intelligence
We often talk about metrics as if they are measurements of reality.
Revenue. Conversion rate. Time-to-hire. Utilisation.
But when you look at metrics through the lens of agentic systems — especially AI-driven business agents — something more precise (and more dangerous) becomes obvious:
Metrics are projection.
Not compression — projection. They don’t just reduce reality — they choose which dimensions of reality to keep and which to throw away forever.
That distinction matters. And it shapes everything about how agents behave.
This post explores why, and how to design metrics that don’t quietly sabotage your business.
Why Agents Need Compression
An agent — human or machine — cannot reason over raw reality.
Reality is:
- high-dimensional
- noisy
- partially observable
- full of long causal chains
Logs, events, emails, meetings, and conversations are not decision-friendly. They are too big, too slow, and too ambiguous.
So we compress. A metric is a function:
High-dimensional business state → low-dimensional decision surface
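As a rough sketch (the types and field names here are invented for illustration), that function might look like this: a whole pipeline goes in, a single number comes out, and most of the fields never influence the result.

```python
from dataclasses import dataclass

@dataclass
class Application:
    # A slice of the high-dimensional state: each field is a dimension.
    role: str
    days_to_submit: float
    hired: bool

def time_to_submit(pipeline: list[Application]) -> float:
    """Project the whole pipeline onto one axis: mean days to submit.
    Note that role and hired never touch the output: those dimensions are dropped."""
    return sum(a.days_to_submit for a in pipeline) / len(pipeline)
```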
That looks like compression. But calling it compression is slightly misleading — and the gap matters.
Compression vs Projection
When you compress an image to JPEG, you lose detail. But you can still see the picture. You can decompress it. The original is degraded but recoverable in broad strokes.
Metrics don’t work like that.
When you reduce a recruitment pipeline to “average time-to-submit: 4.2 days”, you haven’t made a blurry version of the pipeline. You’ve projected a high-dimensional reality onto a single axis. Everything not on that axis is gone — not blurred, not degraded, gone.
You cannot recover which roles are slow and which are fast. You cannot see whether the average is driven by a few outliers or by systemic delay. You cannot tell whether 4.2 is improving or deteriorating.
That information isn’t hidden. It was never kept.
This is the difference between compression and projection:
- Compression reduces size while preserving structure. You can reconstruct (approximately).
- Projection selects dimensions. Everything else is discarded. There is no reconstruction.
A metric is a projection: a deliberate choice of which subspace of reality to make visible.
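A quick illustration with made-up numbers: two pipelines that call for completely different interventions land on exactly the same point, and nothing in the number 4.2 lets you tell them apart.

```python
# Days-to-submit per application, for two hypothetical pipelines.
systemic_delay = [4.1, 4.3, 4.2, 4.0, 4.4]    # everything is uniformly slow
outlier_driven = [1.0, 1.2, 0.9, 1.1, 16.8]   # mostly fast, one stuck role

mean = lambda xs: sum(xs) / len(xs)
print(mean(systemic_delay), mean(outlier_driven))   # 4.2 and 4.2: the difference is gone
```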
Why this distinction matters for agents
If metrics were compression, the design question would be: how much quality do we need? Turn up the bitrate, keep more detail, get better decisions.
But metrics are projection. So the design question is fundamentally different:
Which dimensions of reality should this agent be able to see?
That’s not a quality knob. It’s an architectural decision. Two projections of the same reality — same “bandwidth”, same number of metrics — can support completely different decisions depending on which axes they preserve.
A recruiter agent with time-to-submit, volume, and utilisation sees an efficiency problem. The same agent with candidate quality, offer-accept rate, and 90-day retention sees a quality problem.
Same reality. Same dimensionality. Completely different behaviour.
The projection is the design.
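To make that concrete, here is a sketch (the records and metric names are illustrative): one pipeline snapshot, two equally valid projections of it, each surfacing a different problem.

```python
# One hypothetical pipeline snapshot, two projections of it.
pipeline = [
    {"days_to_submit": 3.0, "quality": 0.62, "offer_accepted": True,  "retained_90d": False},
    {"days_to_submit": 6.5, "quality": 0.88, "offer_accepted": False, "retained_90d": True},
    {"days_to_submit": 2.5, "quality": 0.55, "offer_accepted": True,  "retained_90d": True},
]

def efficiency_view(rows):
    # The efficiency projection: sees speed and throughput, nothing else.
    return {
        "time_to_submit": sum(r["days_to_submit"] for r in rows) / len(rows),
        "volume": len(rows),
    }

def quality_view(rows):
    # The quality projection: sees outcomes, not speed.
    return {
        "candidate_quality": sum(r["quality"] for r in rows) / len(rows),
        "offer_accept_rate": sum(r["offer_accepted"] for r in rows) / len(rows),
        "retention_90d": sum(r["retained_90d"] for r in rows) / len(rows),
    }
```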
Projection Loses Twice
Projection discards dimensions. That’s the first loss.
But there’s a second, subtler loss: uncertainty disappears too.
Most metrics are presented as bare numbers:
- time-to-submit: 4.2 days
- candidate quality: 0.71
- churn risk: 0.23
An agent consuming these has no way to distinguish between a confident estimate and a wild guess. But the difference matters enormously:
- time-to-submit: 4.2 days (SE ± 0.3, n = 840) — stable, trustworthy
- candidate quality: 0.71 (SE ± 0.09, n = 47) — noisy, worth less
- churn risk: 0.23 (SE ± 0.18, n = 11) — barely a signal at all
The first number is actionable. The third is a guess wearing a number’s clothes. But to an agent that only sees the point estimate, they look equally real.
This is the same problem that statistics and hypothesis testing exist to solve: a mean without a standard error is not yet a decision-ready input. You need both the signal and the noise level to know whether a difference is real or a fluke. The z-score — the ratio of observed difference to its standard error — is precisely “how much signal relative to how much noise.”
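In code, preserving uncertainty just means carrying the standard error and sample size alongside the value. The comparison below uses the 4.2-day figure from above (SE 0.3, n = 840); the previous month's numbers are invented for the example.

```python
import math
from dataclasses import dataclass

@dataclass
class Metric:
    value: float   # point estimate
    se: float      # standard error
    n: int         # sample size

def z_score(current: Metric, baseline: Metric) -> float:
    """How much signal relative to how much noise: the observed difference
    divided by the combined standard error of the two estimates."""
    return (current.value - baseline.value) / math.sqrt(current.se ** 2 + baseline.se ** 2)

this_month = Metric(value=4.2, se=0.3, n=840)
last_month = Metric(value=4.8, se=0.3, n=820)   # illustrative baseline
print(z_score(this_month, last_month))          # about -1.4: suggestive, not conclusive
```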
When we design metrics for agents, we usually strip all of that away. We project reality onto a number, then project away the confidence around that number. The agent is left doubly impoverished: it can’t see the dimensions we discarded, and it can’t tell which of its remaining beliefs are solid and which are smoke.
A well-designed metric system preserves uncertainty, not just values. The agent should know not just what the number is, but how much to trust it.
Projection Is a Choice (And That’s the Danger)
Every projection throws information away. That’s the point — you want to discard the irrelevant dimensions so the agent can focus.
But projection doesn’t just remove noise. It removes structure. And it does so silently.
A good projection:
- preserves the dimensions that matter for the decision at hand
- discards variance that would confuse without informing
A bad projection:
- hides causal structure (you see the effect but not the lever)
- collapses distinct phenomena into one number (two different problems look the same)
- creates a smooth surface where the underlying reality is jagged (the agent thinks small moves are safe when they’re not)
This is why Goodhart’s Law hits differently in agentic systems.
When a measure becomes a target, it stops being a good measure.
Humans have been gaming metrics forever. A recruiter who’s measured on time-to-fill learns to push marginal candidates through faster. That’s Goodhart operating at human speed, with human constraints — the recruiter still knows they’re cutting corners, still feels the social pressure of a bad hire, still has to face the hiring manager next week.
Machine agents strip all of that away. Three things change:
Speed. An agent can discover and exploit a metric’s blind spot in milliseconds. A human takes weeks to learn the loophole. By the time anyone notices the metric is being gamed, the agent has already acted on it thousands of times.
Scale. A human games their own metrics locally. An agent optimising “average time-to-submit” across the entire pipeline will systematically deprioritise hard-to-fill roles, shift effort toward easy wins, and reshape the pipeline globally — all at once, all invisibly.
No implicit constraints. A human gaming time-to-fill still knows that hiring unqualified candidates is bad. That knowledge acts as an unwritten guardrail. An agent has no such constraint unless you encode it explicitly. It optimises exactly what you projected — nothing more, nothing less.
Goodhart with human agents is a slow leak. Goodhart with machine agents is a burst pipe.
And because the projection is irreversible, the agent has no way to notice what it can’t see. It will faithfully drive the business in the wrong direction — at scale, at speed, and with full confidence in its map of a territory it has never actually seen.
The Projection Stack
In modern organisations, metrics are not just internal signals. They are interfaces between agents — a recruiter agent produces performance metrics, a sales agent consumes pipeline metrics, a finance agent consumes margin and risk metrics, and an executive agent consumes heavily compressed roll-ups.
Each layer sees less detail, more abstraction, and more authority. It helps to define those layers by the transformation that produces each one:
- Events — ground truth, immutable history. “Customer C submitted application A at 14:32:07 on Tuesday.”
- Facts — produced by structuring: parsing, normalising, joining. Events become queryable state. “Application A is in status ‘submitted’, linked to role R, assigned to recruiter X.”
- Metrics — produced by aggregation: counting, averaging, computing ratios. Facts become decision surfaces. “Average time-to-submit for role category ‘engineering’ is 4.2 days.”
- KPIs — produced by normative framing: attaching a direction and a target. A metric becomes a KPI when someone says should. “Time-to-submit should be under 3 days. We are at 4.2. This is red.”
- Narratives — produced by causal storytelling: connecting KPIs into an explanation that humans can act on. “Engineering hiring is slow because two senior roles have been open for 6 weeks, which is pulling up the average and demoralising the team.”
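A minimal sketch of those layers as data shapes (the fields are invented; narratives stay as prose, so they are omitted):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Event:                   # ground truth, immutable history
    timestamp: datetime
    payload: dict

@dataclass
class Fact:                    # structuring: events become queryable state
    application_id: str
    status: str
    role_id: str
    recruiter_id: str

@dataclass
class Metric:                  # aggregation: facts become a decision surface
    name: str
    value: float

@dataclass
class KPI:                     # normative framing: a metric plus a 'should'
    metric: Metric
    target: float
    direction: str             # "below" or "above"

    def status(self) -> str:
        met = self.metric.value <= self.target if self.direction == "below" else self.metric.value >= self.target
        return "green" if met else "red"
```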
Each transformation is a projection. Each one discards something:
- Structuring discards temporal context and ambiguity
- Aggregation discards individual variation
- Normative framing discards the question of whether the target is right
- Narrative discards alternative explanations
Agentic systems mostly live in the metric layer. Human trust lives in the narrative layer.
The missing feedback channel
This stack looks like a one-way pipeline — events flow up, decisions flow down. But a one-way pipeline silently degrades. The executive agent consumes roll-ups but has no mechanism to say “this projection is wrong — I need to see different dimensions.” Nobody asks whether the projection itself needs changing.
A healthy metric system needs feedback:
- Agents that can signal when their projections feel insufficient — when decisions are ambiguous or when outcomes diverge from expectations
- A mechanism to recompute or extend the projection in response — adding dimensions, changing aggregation windows, surfacing uncertainty
Without feedback, the stack silently drifts away from the reality it’s supposed to represent. Both agents and humans need a way to say: the map is wrong, redraw it.
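What that feedback channel looks like is an open design question. One minimal sketch (the names are invented) is an explicit signal the agent emits when outcomes diverge from what its projection predicted:

```python
from dataclasses import dataclass

@dataclass
class ProjectionFeedback:
    metric_name: str
    reason: str
    requested_dimensions: list[str]   # e.g. a per-role breakdown, or uncertainty

def review_projection(metric_name: str, predicted: float, observed: float,
                      tolerance: float) -> ProjectionFeedback | None:
    """Flag the projection for redesign when reality diverges from its predictions."""
    if abs(predicted - observed) > tolerance:
        return ProjectionFeedback(
            metric_name=metric_name,
            reason=f"predicted {predicted}, observed {observed}",
            requested_dimensions=["per-role breakdown", "standard error"],
        )
    return None
```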
Designing Metrics for Agents
Never give an agent a single metric.
Single-metric optimisation leads to pathological behaviour. An agent told to minimise time-to-fill will fill roles fast — with the wrong people. An agent told to maximise candidate quality will be endlessly selective and never hire anyone.
Instead, design with tension: multiple metrics that pull in different directions, forcing the agent to make trade-offs rather than blindly optimise.
- Speed and quality
- Volume and conversion
- Revenue and retention
Tension prevents collapse. It forces reasoning.
But “just add more metrics” is not enough. The design is harder than it sounds.
Orthogonality is approximate
In theory, you want metrics that are orthogonal — measuring independent dimensions so that improving one doesn’t automatically move another. In practice, business metrics are correlated. Speed and quality share variance. Volume and conversion are coupled through the same pipeline.
Perfect orthogonality isn’t the goal. The goal is enough independence that the agent can’t satisfy all metrics with a single degenerate strategy. If improving metric A always improves metric B, then B isn’t adding tension; it’s just restating A.
A useful test: can you imagine an action that improves A while degrading B? If yes, there’s real tension. If no, one of them is redundant.
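The thought experiment doesn't translate directly into code, but a rough empirical proxy does, assuming you have per-period history for both metrics: if they have moved together almost perfectly, the second one probably isn't adding tension.

```python
from statistics import correlation   # Python 3.10+

def adds_tension(history_a: list[float], history_b: list[float],
                 threshold: float = 0.9) -> bool:
    """Rough proxy only: near-perfect historical correlation suggests B cannot
    pull the agent in a different direction from A."""
    return abs(correlation(history_a, history_b)) < threshold
```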
The Pareto frontier problem
Tension works well when the agent has room to improve all metrics simultaneously. But eventually it hits the Pareto frontier — the boundary where improving one metric must degrade another.
At that point, “reason about trade-offs” is no longer enough. The agent needs a policy: when speed and quality conflict, which wins? By how much? Under what conditions?
This is where metric design bleeds into governance. The trade-off weights are not engineering decisions — they’re business decisions. An agent that values speed at 60% and quality at 40% will behave very differently from one at 40/60. Someone has to choose. And that someone shouldn’t be the agent.
Too much tension paralyses
Three metrics in pairwise tension can create a situation where every possible action degrades at least one metric. The agent tries to move, sees that any direction makes something worse, and either oscillates or freezes.
This isn’t hypothetical — it’s the multi-objective optimisation equivalent of gridlock. The fix is to provide priority ordering or acceptable ranges, not just targets:
- “Time-to-fill should be under 5 days, then optimise quality”
- “Quality must stay above 0.7 — within that constraint, maximise volume”
Constraints first, optimisation second. This gives the agent a feasible region to work within, rather than an impossible surface to balance on.
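A sketch of that ordering, using the thresholds from the examples above (the candidate-action structure is invented): filter to the feasible region first, optimise inside it, and escalate when the region is empty.

```python
from dataclasses import dataclass

@dataclass
class CandidateAction:
    name: str
    time_to_fill: float   # predicted days
    quality: float        # predicted score, 0..1
    volume: int           # predicted hires

def choose(actions: list[CandidateAction]) -> CandidateAction | None:
    # Constraints first: quality must stay above 0.7, time-to-fill under 5 days.
    feasible = [a for a in actions if a.quality >= 0.7 and a.time_to_fill <= 5.0]
    if not feasible:
        return None   # nothing satisfies the constraints: escalate, don't improvise
    # Optimisation second: within that region, maximise volume.
    return max(feasible, key=lambda a: a.volume)
```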
Who sets the weights?
Metric design looks like an engineering problem but it’s actually a policy problem. The choice of which metrics to include, how to weight them, and where to set thresholds encodes business priorities — and whoever makes those choices is steering the agent, whether they realise it or not.
This means metric design should be a deliberate, reviewed, versioned decision — not something an engineer picks when wiring up a dashboard. When you change the metrics, you change the agent’s behaviour. That’s a deployment decision, not a configuration detail.
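One way to make that concrete is to treat the metric policy as a versioned artefact in its own right, something like the (entirely illustrative) configuration below, owned and reviewed like any other deployment.

```python
# Illustrative only: the projection as a reviewed, versioned policy, not dashboard wiring.
METRIC_POLICY = {
    "version": "2025-06-01",
    "owner": "talent-operations",                  # a business owner, not the engineer or the agent
    "constraints": {"quality_min": 0.7, "time_to_fill_max_days": 5},
    "objective": "maximise_volume",
    "weights": {"speed": 0.4, "quality": 0.6},     # trade-off weights are a business decision
    "review": "changes require sign-off and a changelog entry",
}
```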
The Real Takeaway
Designing metrics is designing the agent. The projection determines what it believes, what it can influence, and what it optimises. Everything else follows.
Metrics are how we turn a business into something an agent can think about.
They are abstractions, not reality. Beliefs, not truths. Maps, not territory.
As we build more autonomous business systems, the quality of our agents will depend less on model intelligence — and more on the choice of projection.
Design metrics carefully. They decide what your agents can see — and what they’ll never know to look for.