Theory of LLM Constraints: Why Coding-Side Acceleration Doesn't Deliver

Description
Applying Goldratt's Theory of Constraints to AI delivery. Coding-side speed shifts bottlenecks downstream; the ASDLC answer is proportional verification from a minimal context-specific scaffold.
Status
Experimental
Last Updated
Tags
Theory, Productivity, Metrics, Bottlenecks, Industrialization

Definition

The Theory of LLM Constraints is the application of Eliyahu Goldratt’s Theory of Constraints (TOC) to AI-assisted software delivery. Its central claim: because LLMs collapse the marginal cost of code generation, the system bottleneck structurally shifts from the coding station to review, integration, and validation. Local optimization at the coding station produces no net throughput improvement.

The thesis is empirically grounded. Thoughtworks reports ~30% coding-task acceleration yielding only ~8% net delivery improvement. Faros telemetry across 10,000+ developers reports 98% more PRs merged but a 91% increase in PR review time, with PRs 154% larger on average. METR’s randomized controlled trial finds experienced open-source developers 19% slower with AI tools — while reporting they feel faster. The constraint moved. Most teams did not.
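The gap between the ~30% and ~8% figures is a dilution effect, and the arithmetic can be sketched directly. The snippet below is a minimal illustration, assuming "acceleration" means the fraction of time saved on the affected work and that nothing else in the cycle speeds up. The function names are hypothetical, and the model is not taken from the Thoughtworks analysis itself.

```python
def net_time_saved(affected_share: float, time_saved_on_affected: float) -> float:
    """Fraction of end-to-end cycle time saved when only part of the cycle speeds up.

    affected_share: fraction of total cycle time that the speed-up touches (0..1).
    time_saved_on_affected: fraction of time saved on that part (0..1).
    """
    for value in (affected_share, time_saved_on_affected):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares must be fractions between 0 and 1")
    return affected_share * time_saved_on_affected


def implied_affected_share(net_saved: float, time_saved_on_affected: float) -> float:
    """Invert the relation: how much of the cycle must have sped up to see net_saved?"""
    if time_saved_on_affected <= 0.0:
        raise ValueError("time_saved_on_affected must be positive")
    return net_saved / time_saved_on_affected


# Figures quoted in the text: ~30% coding-task acceleration, ~8% net improvement.
share = implied_affected_share(net_saved=0.08, time_saved_on_affected=0.30)
print(f"Implied share of cycle time that actually sped up: {share:.0%}")
```

Under these assumptions, the quoted figures imply that only about 27% of end-to-end cycle time actually sped up. Everything outside that share is untouched by faster code generation, which is why the local gain does not carry through to delivery.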

The Diagnostic: Where the Constraint Moves

In a traditional SDLC, coding is one constraint among several. When LLMs accelerate it, three things happen at once:

  • Inventory build-up at review. PR queues lengthen. Faros: 98% more PRs, 154% larger on average.
  • Quality leakage. Faros: 9% more bugs per developer. DORA 2025 publishes the first official benchmarks for Deployment Rework Rate — the signal for work re-done after being declared complete.
  • Perceived velocity diverges from measured velocity. METR’s RCT finds experienced developers 19% slower with AI while reporting they feel faster. The dashboard and the timesheet disagree.

The constraint is no longer “how fast can the human type.” It is “how fast can the system verify intent.”
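The inventory build-up described above can be sketched as a toy two-station line. This is a simplified model, assuming a fixed daily review capacity and a constant rate of opened PRs. The function name and all numbers are hypothetical, chosen only to show the shape of the effect, not to reproduce the Faros data.

```python
def simulate_review_queue(prs_opened_per_day: int,
                          review_capacity_per_day: int,
                          days: int) -> tuple[int, int]:
    """Return (total PRs merged, PRs still waiting for review) after `days`."""
    backlog = 0
    merged_total = 0
    for _ in range(days):
        backlog += prs_opened_per_day                     # coding station output
        reviewed = min(backlog, review_capacity_per_day)  # review is the constraint
        backlog -= reviewed
        merged_total += reviewed
    return merged_total, backlog


# Baseline: coding output matches review capacity.
baseline_merged, baseline_backlog = simulate_review_queue(10, 10, days=20)

# Coding output doubles; review capacity is unchanged.
doubled_merged, doubled_backlog = simulate_review_queue(20, 10, days=20)

print(f"baseline: merged={baseline_merged}, waiting={baseline_backlog}")
print(f"doubled:  merged={doubled_merged}, waiting={doubled_backlog}")
```

With these assumed numbers both runs merge 200 PRs over 20 days, but the doubled-generation run ends with 200 PRs waiting for review. Throughput is set by the review station; the extra coding output becomes inventory.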

Two System-Level Diagnostics

Signal | What it measures | What it surfaces
--- | --- | ---
PR Cycle Time | Throughput at the new constraint (review / integration) | Where inventory accumulates
Deployment Rework Rate (DORA 5th metric) | Quality leakage past gates | Where the constraint is being bypassed rather than respected

Tracking only the first invites the Faster Horse Fallacy: throughput appears to rise while rework hides the cost. Tracking both keeps the dialectic honest.
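A minimal sketch of computing the two signals side by side follows. It assumes PR cycle time is measured from open to merge, and that each deployment carries a flag marking it as rework (an unplanned fix for work previously declared complete). The record types, field names, and sample data are hypothetical, and the exact DORA definition should be taken from the DORA report rather than from this sketch.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class PullRequest:
    opened_at: datetime
    merged_at: datetime


@dataclass(frozen=True)
class Deployment:
    is_rework: bool  # hypothetical flag: unplanned fix for previously shipped work


def median_pr_cycle_hours(prs: list[PullRequest]) -> float:
    """Median open-to-merge time in hours."""
    if not prs:
        raise ValueError("need at least one merged PR")
    hours = [(pr.merged_at - pr.opened_at).total_seconds() / 3600 for pr in prs]
    return median(hours)


def deployment_rework_rate(deployments: list[Deployment]) -> float:
    """Share of deployments flagged as rework."""
    if not deployments:
        raise ValueError("need at least one deployment")
    return sum(d.is_rework for d in deployments) / len(deployments)


# Made-up sample data for illustration only.
prs = [
    PullRequest(datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 11)),   # 2 h
    PullRequest(datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 13)),   # 4 h
    PullRequest(datetime(2026, 1, 5, 9), datetime(2026, 1, 6, 15)),   # 30 h
]
deployments = [Deployment(False), Deployment(True), Deployment(False), Deployment(False)]

cycle_hours = median_pr_cycle_hours(prs)
rework_rate = deployment_rework_rate(deployments)
print(f"median PR cycle time: {cycle_hours} h, rework rate: {rework_rate:.0%}")
```

Reporting the pair together is the point of the sketch: a falling cycle time alongside a rising rework rate suggests the constraint is being bypassed rather than relieved.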

The Dialectic: Classical TOC vs. ASDLC

Goldratt’s Five Focusing Steps prescribe: identify the constraint, exploit it, subordinate everything else, elevate it, repeat when it moves. Applied naively to PR queue saturation, “elevate” means add reviewer capacity, parallelize review, accelerate human approval.

This is the Faster Horse Fallacy. It optimizes the broken station rather than redesigning the line. The ASDLC accepts Goldratt’s diagnostic and rejects his classical cure:

Step | Classical TOC remediation | ASDLC structural remediation
--- | --- | ---
Identify | Coding is no longer the constraint; review is. | Same.
Exploit | Squeeze every drop from existing reviewers. | Reduce review surface via Micro-Commits and Specs.
Subordinate | Cap upstream PR generation to match review capacity. | Cap unverified generation; let verified generation run.
Elevate | Add reviewers, parallelize review, AI-assist review. | Replace peer review with inspection stations: Adversarial Code Review, Context Gates, Constitutional Review.
Repeat | When review is no longer the constraint, find the next one. | Same, with production observability via Feedback Loop Compression.

The substitution at Elevate is the entire point. Adding reviewers is linear; structural verification is multiplicative. This is the same position PR Slop takes from a different angle: “PR slop cannot be solved by reviewing harder.”

Minimal Scaffolding: Proportional Gates

A common prescriptive response to the constraint shift is Scaffolding-First Development: pre-building the validation surface (tests, NFRs, CI/CD) before generating feature code. This invokes Deming’s “build quality in, don’t inspect it in” at agent speed.

The ASDLC adopts a refined stance: Minimal Scaffolding-First.

For any given context (e.g., a TypeScript web application), there is a sensible, minimal scaffold that should always be used from the outset. This baseline includes essential constraints like compilation checks, type-checking, and basic syntactical linting.

Beyond this minimal baseline, however, scaffolding should not be added by default. The ASDLC is opinionated about starting with the minimal scaffold, and only the minimal scaffold. Additional gates should not be added as a ritual playbook; each must earn its place by protecting against a specific, identified risk.

A gate that does not sit on the constraint or protect against a named risk represents overproduction of validation inventory (unsubordinated capacity).

  • Minimal Baseline (Start here): Context-specific, low-cost guards (e.g., type-checkers, basic compiler checks).

  • Proportional Additions (Earn their place): Additional gates (e.g., visual regression tests, performance-budget checkers, custom security policies) are built only when a concrete failure mode is identified.

Applying that test to common gates:

  • A linter is a no-brainer: the cost is near-zero and the failure modes it catches (syntax errors, dead code, style drift) are well known and apply to all code.

  • A type-checker is a no-brainer in a typed language for the same reason.

  • A unit test suite for a payment-handling module earns its place the moment the module exists, because the risk surface is concrete and named.

  • A design-system drift check earns its place once the design system is load-bearing and the divergence risk is concrete and named — not as default infrastructure.

  • A performance-regression gate earns its place once the system has a named performance contract and an identified path to breaching it.

This is not anti-discipline. It is the recognition that a gate that does not sit on the constraint costs more than it saves, and that the constraint moves. Building gates is itself a form of work; the work is subject to the same proportionality TOC and Lean prescribe for any other process step. The discipline is in the analysis that identifies the risk, not in the ritual of building every gate the playbook names.
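The "earn their place" rule can be sketched as a simple admission check. This is an illustrative model only: the `Gate` type, its fields, and the example gates are hypothetical names introduced here, not part of any ASDLC tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gate:
    name: str
    baseline: bool = False            # part of the minimal, context-specific scaffold
    named_risk: Optional[str] = None  # the concrete failure mode this gate protects against


def earns_its_place(gate: Gate) -> bool:
    """A gate is admitted if it is baseline or protects against a named risk."""
    return gate.baseline or bool(gate.named_risk and gate.named_risk.strip())


def partition_gates(gates: list[Gate]) -> tuple[list[Gate], list[Gate]]:
    """Split proposed gates into (admitted, deferred)."""
    admitted = [g for g in gates if earns_its_place(g)]
    deferred = [g for g in gates if not earns_its_place(g)]
    return admitted, deferred


proposed = [
    Gate("type-check", baseline=True),
    Gate("lint", baseline=True),
    Gate("payment unit tests", named_risk="incorrect charge amounts in the payment module"),
    Gate("performance-regression gate"),   # no performance contract named yet
    Gate("design-system drift check"),     # design system not yet load-bearing
]

admitted, deferred = partition_gates(proposed)
print("admitted:", [g.name for g in admitted])
print("deferred:", [g.name for g in deferred])
```

Deferred gates are not rejected permanently in this sketch. They re-enter the list once someone can fill in `named_risk`, which keeps the effort in the risk analysis rather than in building every gate up front.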

ASDLC Usage

This concept gives the ASDLC’s bottleneck-shift thesis a theoretical name (Goldratt’s TOC) and an empirical anchor (Faros / Thoughtworks / DORA 2025 / METR). It positions the ASDLC’s verification architecture as the structural answer to the constraint shift, distinguishing it from two regressive defaults the industry will otherwise reach for:

  1. Classical TOC elevation — adding reviewer capacity. Optimizes the broken station.
  2. Strong Scaffolding-First — pre-built canonical gates. Overproduces verification inventory.

The ASDLC answer is proportional structural verification starting from a context-specific minimal scaffold, introducing additional gates only as they earn their place.

See also: Feedback Loop Compression (the OODA-framed view of the same shift), PR Slop (the quality-leakage manifestation), AI Amplification (why bad processes amplify under AI), Context Gates (how proportional gates are layered in practice).

References

  1. Mats Ljunggren (2026). Theory of LLM Constraints. Accessed May 18, 2026.

    Applies Goldratt's Theory of Constraints to LLM-augmented delivery; synthesizes Faros, Thoughtworks, DORA, and METR telemetry into the bottleneck-shift thesis.

  2. Faros AI (2025). AI Productivity Paradox Report 2025. Accessed May 18, 2026.

    Telemetry across 10,000+ developers and 1,255 teams. 98% more PRs merged, 91% PR review time increase, 154% larger PRs, 9% more bugs.

  3. Thoughtworks (2025). How much faster can coding assistants really make software delivery? Accessed May 18, 2026.

    Empirical decomposition: ~30% coding acceleration yields ~8% net delivery improvement because coding is roughly half of cycle time.

  4. Google Cloud / DORA (2025). 2025 DORA Report: State of AI-Assisted Software Development. Accessed May 18, 2026.

    First official benchmarks for Deployment Rework Rate, the 5th DORA metric — the quality-leakage signal complementing PR Cycle Time.

  5. METR (2025). Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. Accessed May 18, 2026.

    RCT finding experienced open-source developers were 19% slower with AI tools while perceiving themselves as faster. Counter-evidence to perceived-acceleration claims.