
DORA Metrics Explained: What They Measure and What They Miss

Vaibhav Verma
8 min read
DORA metrics · engineering metrics · developer productivity · deployment frequency · engineering performance

DORA metrics have become the default language for engineering performance. Every VP of Engineering I talk to mentions them within the first five minutes. Tooling vendors plaster them across landing pages. Conference talks treat them as gospel.

And they're genuinely useful. But they're also incomplete in ways that can hurt you if you don't understand the gaps.

I've implemented DORA tracking at three different companies. Each time, I learned something new about where these metrics shine and where they quietly mislead.

The Four Metrics, Quick and Dirty

The DORA research program (now part of Google Cloud) identified four metrics that separate high-performing engineering teams from the rest:

1. Deployment Frequency: How often your team deploys to production. Elite teams deploy on demand, multiple times per day. Low performers deploy between once a month and once every six months.

2. Lead Time for Changes: The time from code committed to code running in production. Elite teams: under one hour. Low performers: between one month and six months.

3. Change Failure Rate: The percentage of deployments that cause a failure in production (requiring a hotfix, rollback, or patch). Elite teams: 0-15%. Low performers: 46-60%.

4. Mean Time to Recovery (MTTR): How long it takes to restore service after an incident. Elite teams: under one hour. Low performers: over six months (yes, really).
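
If you want to see how the four roll up from raw data, here's a minimal sketch in Python. It assumes you can export two lists of records, deploys and incidents, with the field names shown; your deploy tooling and incident tracker will call them something else.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical export: one record per production deploy, one per incident.
deploys = [
    {"committed_at": datetime(2024, 5, 1, 9, 0),
     "deployed_at": datetime(2024, 5, 1, 11, 30),
     "caused_failure": False},
    {"committed_at": datetime(2024, 5, 2, 14, 0),
     "deployed_at": datetime(2024, 5, 3, 10, 0),
     "caused_failure": True},
]
incidents = [
    {"started_at": datetime(2024, 5, 3, 10, 5),
     "resolved_at": datetime(2024, 5, 3, 12, 5)},
]
period_days = 30  # the reporting window

# 1. Deployment frequency: deploys per day over the window.
deployment_frequency = len(deploys) / period_days

# 2. Lead time for changes: commit -> running in production (median is less noisy than mean).
lead_time = median(d["deployed_at"] - d["committed_at"] for d in deploys)

# 3. Change failure rate: share of deploys that needed a hotfix, rollback, or patch.
change_failure_rate = sum(d["caused_failure"] for d in deploys) / len(deploys)

# 4. Time to restore: average time from incident start to resolution.
mttr = sum((i["resolved_at"] - i["started_at"] for i in incidents), timedelta()) / len(incidents)

print(deployment_frequency, lead_time, change_failure_rate, mttr)
```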

These four metrics correlate strongly with organizational performance. Companies with elite DORA metrics ship more, fail less, and recover faster. That's valuable.

What DORA Gets Right

DORA metrics work because of three design decisions:

They measure outcomes, not outputs. DORA doesn't count lines of code or story points. It measures how effectively the system delivers value. This is a critical distinction. A team can write thousands of lines of code and still have terrible DORA numbers if their pipeline is broken.

They measure the system, not individuals. DORA metrics are team-level and org-level metrics. You can't game them by working nights. You can only improve them by fixing structural problems: flaky tests, slow CI, manual deployment gates, inadequate monitoring.

They balance speed with stability. Deployment frequency and lead time measure speed. Change failure rate and MTTR measure stability. You can't optimize one at the expense of the other and still look good. A team that deploys 50 times a day but breaks production every third deployment has terrible DORA metrics. So does a team that ships once a quarter with zero failures. The framework rewards teams that are both fast and stable.

What DORA Misses

Here's where my experience diverges from the conference talks. DORA has real blind spots, and ignoring them will lead you astray.

1. Developer Experience Is Invisible

DORA metrics can look great while developers are miserable. I've seen this firsthand. A team had elite deployment frequency and solid lead times. Their DORA dashboard was green across the board. But developer satisfaction was in the gutter.

Why? The pipeline was fast because developers had automated everything. But the automation was fragile, poorly documented, and understood by exactly two people. Everyone else lived in fear of the deploy scripts. The "fast lead time" was achieved by skipping code review for "small" changes, which meant bugs crept in that didn't show up as incidents but degraded the user experience.

DORA told us everything was fine. Talking to the team told a different story.

2. Quality Is Underrepresented

Change failure rate is the only quality signal in DORA, and it only captures failures bad enough to require a rollback or hotfix. It misses:

  • Performance regressions that don't trigger alerts
  • Accessibility violations
  • UX degradation that users tolerate but hate
  • Technical debt accumulation
  • Security vulnerabilities that haven't been exploited yet

You can have a 0% change failure rate and still ship garbage. As long as it doesn't break, DORA won't flag it.

3. Value Delivery Is Absent

DORA tells you how fast you can deliver code. It says nothing about whether that code was worth building. A team can have elite DORA metrics while building features nobody uses. Speed of delivery is necessary but not sufficient. If you're efficiently building the wrong things, you're efficiently wasting money.

4. Cross-Team Dependencies Are Hidden

DORA metrics are typically measured per team. But in any organization of meaningful size, the biggest productivity killers are between teams, not within them. Waiting for another team's API. Coordinating a cross-service migration. Negotiating shared infrastructure changes.

I've watched individual team DORA metrics stay green while organization-wide delivery slowed to a crawl because cross-team coordination was a mess.

5. Cognitive Load Isn't Captured

A team can have fast deployment pipelines and still be slow because every change requires understanding six microservices, three databases, and a message queue. DORA measures the pipeline. It doesn't measure how hard it is to figure out what code to write in the first place.

The Stealable Framework: DORA+3

Here's what I recommend: use DORA as your foundation, then add three supplementary signals. I call this DORA+3.

DORA Metrics 1-4: Deployment Frequency, Lead Time, Change Failure Rate, MTTR. Track these. They're your delivery pipeline health indicators.

+1: Developer Experience Score. Run a quarterly survey. Ask developers to rate (1-5 scale) their satisfaction with: build tools, CI/CD, code review process, documentation, on-call experience, and overall ability to focus. Track the trend. If it's declining while DORA looks good, you have a hidden problem.
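
A minimal sketch of the trend tracking, assuming survey responses are exported as one dict of 1-5 ratings per respondent per quarter (the category names here just mirror the list above):

```python
from collections import defaultdict
from statistics import mean

# Hypothetical export: quarter -> list of responses, each a dict of 1-5 ratings.
responses = {
    "2025-Q1": [{"build_tools": 4, "ci_cd": 3, "code_review": 4, "docs": 2, "on_call": 3, "focus": 3}],
    "2025-Q2": [{"build_tools": 4, "ci_cd": 2, "code_review": 4, "docs": 2, "on_call": 3, "focus": 2}],
}

def quarterly_scores(responses):
    """Average score per category per quarter, so a declining trend is visible next to a green DORA dashboard."""
    scores = {}
    for quarter, rows in responses.items():
        by_category = defaultdict(list)
        for row in rows:
            for category, rating in row.items():
                by_category[category].append(rating)
        scores[quarter] = {c: round(mean(r), 2) for c, r in by_category.items()}
    return scores

print(quarterly_scores(responses))
```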

+2: Time-to-10th-PR for New Hires. How long does it take a new developer to submit their 10th pull request? This single metric captures onboarding quality, documentation, codebase complexity, and team support. I've seen this range from two weeks (great) to three months (terrible). It's a surprisingly sensitive indicator of overall developer experience.
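
A sketch of the computation, assuming you can pull each new hire's start date and the merge dates of their pull requests from your Git host; the names and data here are illustrative:

```python
from datetime import date, timedelta

def time_to_nth_pr(hire_date, merge_dates, n=10):
    """Days from hire date to the nth merged PR, or None if they haven't reached n yet."""
    merged = sorted(merge_dates)
    if len(merged) < n:
        return None
    return (merged[n - 1] - hire_date).days

# Hypothetical new hire who merges a PR roughly every two days starting in week one.
hired = date(2025, 3, 3)
merges = [hired + timedelta(days=3 + 2 * i) for i in range(12)]
print(time_to_nth_pr(hired, merges))  # 21 days -> toward the healthy end of the range
```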

+3: Feature Cycle Time. Track the time from "feature kickoff" (product and engineering agree to build something) to "feature in users' hands." This captures all the non-coding time that DORA ignores: design, planning, cross-team coordination, QA, and rollout. If your DORA lead time is one hour but your feature cycle time is three months, the bottleneck isn't your pipeline.
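
And a sketch of the comparison this metric enables, assuming each feature record carries a kickoff date and a released-to-users date (field names are made up):

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical feature records: kickoff = product and engineering agree to build,
# released = the feature is actually in users' hands.
features = [
    {"kickoff": date(2025, 1, 6), "released": date(2025, 3, 3)},
    {"kickoff": date(2025, 2, 3), "released": date(2025, 4, 21)},
]

feature_cycle_time_days = median((f["released"] - f["kickoff"]).days for f in features)
pipeline_lead_time = timedelta(hours=1)  # your DORA lead time, for contrast

# If the gap between these two numbers is huge, the pipeline isn't the bottleneck.
print(f"feature cycle time: {feature_cycle_time_days} days, pipeline lead time: {pipeline_lead_time}")
```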

The Bottom Line

DORA metrics are the best starting point we have for measuring engineering performance. Use them. But don't stop there.

The teams I've seen get the most value from DORA are the ones that treat the four metrics as necessary-but-not-sufficient signals. They combine DORA with direct developer feedback, quality indicators, and business-level cycle times to get the full picture.

The teams that struggle are the ones that optimize for DORA scores. They hit elite benchmarks on the dashboard while their developers are burned out, their codebase is rotting, and their customers are waiting months for simple requests.

Measure the system. But remember that the system includes the humans operating it.
