Normalizing story points across teams
A ritual that exists almost entirely so a portfolio dashboard can add up two teams' numbers, and the dashboard rarely justifies the cost. Normalization is mostly a SAFe construct; resist it unless someone above you genuinely needs that dashboard.
Story-point normalization is the practice of getting two or more teams to agree on a shared reference story — usually something like "one person-day of work for an average engineer on an average codebase" — so that a 5 on Team A means the same thing as a 5 on Team B. Once normalized, the teams' velocities can be summed; a program manager can roll a multi-team backlog into a single forecast number. The technique exists because SAFe and similar scaled-agile frameworks need a portfolio-level point count to make their planning math work.
The technique works the way it's supposed to. A normalized 5 really does mean the same thing across the teams that agreed to it. The problem isn't precision; it's that you've traded each team's relative-estimation property for an absolute one, and you've done it just so a number can be averaged across teams that don't actually share a codebase, a stack, or a problem domain.
What normalization breaks
Relative estimation works because the team picks a reference story they shipped, and sizes new work against it. Other teams' calibrations are irrelevant — the only thing that matters is whether this team can ship at a rate they can sustain. Normalization replaces the team's reference with a synthetic one ("one engineer-day"), and the moment you do that you're estimating duration in disguise. The team that thought it had abandoned hour-based estimation now has an hour-based scale with extra steps. Story points vs hours covers the longer version of that failure.
The second failure mode: team comparison. With normalized points, "Team A delivered 80 points; Team B delivered 50" looks like a productivity comparison, and it gets read as one. The number is corrupted within two sprints — Team B sizes everything 30% bigger to keep up, Team A learns to over-decompose to look busy — and the program-level forecast that was supposed to be the whole point of normalization is now wrong. See velocity for the related "treat the number as a KPI" failure mode at a single-team level.
The one situation where it's worth doing
Portfolio-level forecasting in an organization that runs SAFe or an equivalent — and where someone above the teams genuinely needs a multi-team point total to fund work, not just to compare teams. In that case the ritual produces the number the funding decision actually depends on, and the cost is the price of the framework you've already committed to. The team's own estimation conversations still happen at the team-relative level; the normalized number is a translation layer at the top.
The honest version: have the program-management layer maintain a points-multiplier per team — calibrated once, revisited rarely — and apply it at rollup time. Each team estimates against its own reference story the way the technique is supposed to work; the multiplier turns the team's velocity into the portfolio's currency. The teams never see normalization in their refinement meetings. The dashboard adds up.
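That rollup layer is small enough to sketch. The team names and calibration values below are hypothetical; the point is that each team reports velocity in its own points, and the multiplier is applied only at the program level.

```python
# Hypothetical calibration: portfolio points per one of each team's points.
# Set once against a shared reference story, revisited rarely.
MULTIPLIERS = {
    "payments": 1.0,   # reference team: its points are the portfolio currency
    "search": 0.6,     # search's points run large; scale them down
    "mobile": 1.4,     # mobile's points run small; scale them up
}

def portfolio_velocity(team_velocities: dict[str, float]) -> float:
    """Sum team velocities in the portfolio's currency at rollup time.

    Teams keep estimating against their own reference stories;
    normalization happens only here, never in refinement.
    """
    return sum(
        MULTIPLIERS[team] * velocity
        for team, velocity in team_velocities.items()
    )

# One sprint's rollup: each input number is in that team's own points.
total = portfolio_velocity({"payments": 40, "search": 50, "mobile": 20})
print(total)  # 40*1.0 + 50*0.6 + 20*1.4 = 98.0
```

The design choice worth noting is that the multiplier table lives with program management, not with the teams, so a team changing its own reference story is a calibration event at the rollup layer rather than a disruption to its refinement meetings.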
What to push back on
Normalization initiatives that originate inside a single team. If your scrum master proposes that the team normalize against the team next door, the answer is no: there's no portfolio-level math that depends on it, and you'll have given up the relative-estimation property for no reason. Normalization that comes down from a program-management layer with a real forecasting need is at least a coherent ask, even if the cost is real.
Also push back on normalization done to make team comparison easier. That isn't a use case; it's an anti-pattern with a polished name. Two teams' velocities aren't supposed to be comparable — they're properties of two different teams in two different contexts. Common mistakes covers cross-team velocity comparison as a recurring failure mode in its own right.
If a portfolio dashboard needs the number, normalize at the rollup layer. Don't make the teams do it.
Adjacent: velocity for the single-team version of the same failure modes; story points vs hours for the absolute-vs-relative trap normalization opens; other estimation techniques for the bucket and affinity alternatives if normalization keeps getting forced on the team.