The Engineering Manager Who Doesn't Code Is Obsolete. The One Who Only Codes Is Next.
The should-EMs-code debate is a false binary. The builder-manager model — coding to understand, not to ship — is the third path that survives AI flattening, earns team trust, and creates organizational leverage.
Last month, Amazon mandated that every AI-generated code change requires senior engineer sign-off after autonomous agents caused a six-hour shopping outage and a thirteen-hour AWS meltdown. Meanwhile, Block fired 40% of its workforce and declared it would rebuild the company as a "mini-AGI". These two responses -- more human oversight and less human involvement -- cannot both be right. Unless the question itself is wrong.
The question every engineering manager is being asked right now is "should you still code?" I have been an engineering manager for over a decade, scaling teams from 3 to 18 engineers across multiple companies. I still write code every week. But I have not shipped a sprint feature in years. And that distinction -- between coding to ship and building to understand -- is the entire point most people in this debate are missing.
The Two Deaths of the Engineering Manager
Gartner predicts that by 2026, 20% of organizations will use AI to flatten their organizational structure, eliminating more than half of current middle management positions. Fortune labeled this "The Great Flattening" -- the coming extinction of the middle manager. And it is already happening. Block cut 4,000+ positions. Klarna replaced 700 customer service roles with AI, then had to urgently rehire humans -- even forcing engineers into call center roles -- after quality collapsed. Meta analyzed productivity data after AI rollout and found teams delivering the same output in fewer hours, prompting structural review.
Two archetypes of engineering manager are being killed off simultaneously.
Death 1: The Pure Manager. This is the EM who became a full-time meeting-and-spreadsheet operator. Their calendar is a wall of 1:1s, standup syncs, stakeholder updates, and planning ceremonies. They have not opened an IDE in three years. They manage through Jira tickets, Google Docs, and influence. Organizations are realizing that AI can handle coordination, status reporting, performance tracking, and even generate the talking points for their skip-level meetings. These EMs are the first to be flattened because their value proposition is the most automatable.
Death 2: The Player-Coach. This is the EM who codes 60% of the time and manages in the cracks. They pick up sprint tickets. They are on the critical path for features. Their team waits for their code reviews because they are also writing code and cannot context-switch fast enough. They burn out. Their management work suffers because they are debugging at midnight. Their code suffers because they are in meetings all morning. Jim Grey articulated this clearly in his March 2026 piece "The player/coach trap": "When managers code, teams lose leadership."
LeadDev's 2026 analysis drives the point home: management positions are becoming "dead ends" in many organizations. Engineers watched layers of management get removed overnight in the post-ZIRP era. The market for management roles is "particularly difficult."
So if pure managers die to AI flattening and player-coaches burn out from doing two jobs badly, what survives?
The same undisciplined approach to AI that I warned ICs about in my post on how developers should use AI -- the headless monkey workflow -- applies to managers too. An EM who codes without strategy is just a headless monkey with a calendar full of 1:1s.
The Bottleneck Moved. Most EMs Did Not Follow It.
In February 2026, Armin Ronacher published "The Final Bottleneck", one of the best pieces written about the shift AI is causing in software engineering. His core argument: AI has removed the coding bottleneck. But he himself -- the human carrying responsibility and accountability for shipped software -- is the final bottleneck. As long as humans bear that responsibility, the constraint has not been eliminated. It has moved downstream.
He uses the textile industry parallel from the industrial revolution. When spinning was automated, weaving became the bottleneck. When weaving was automated, dyeing became the bottleneck. Removing one constraint does not eliminate constraints. It moves innovation to the next one downstream.
In engineering teams today, the constraints are no longer "we cannot write code fast enough." The constraints are:
- Evaluating AI output quality. The 2025 Stack Overflow Developer Survey found that 66% of developers spend more time fixing "almost right" AI code than they save generating it. 45% cite "AI solutions that are almost right, but not quite" as their number one frustration.
- Maintaining architectural coherence. I documented this extensively in my post on AI agents and codebase maintenance. The SWE-CI benchmark showed 75% of AI agents break previously working code during maintenance iterations. CodeRabbit's analysis of 470 pull requests found AI-generated code produces 1.7x more issues than human-written code. Comprehension debt research shows agents generate code 5-7x faster than developers can understand it.
- Deciding what to build. Mirek Stanek put it best in his career advice for 2026: "If AI can generate features in days and still two-thirds of them don't move the needle, it becomes painfully obvious the problem was never 'too slow coding' but 'building the wrong things.'"
The EM who can only manage cannot evaluate AI output. They cannot tell the difference between a clean PR and a well-formatted time bomb. The EM who understands the code and the business context is now the most critical person in the room -- because they are the only one positioned to bridge the gap between what the AI generates and what the organization actually needs.
AI writes 41-42% of global code in 2026. GitHub's data shows AI writes 46% of the average developer's code. 95% of engineers use AI tools at least weekly. The bottleneck has moved. The question is whether you moved with it.
The Builder-Manager: A Third Path
This is not the player-coach. The distinction matters.
A player-coach codes to ship features. They are on the critical path. The team waits for their PRs. They are an IC with a management title, doing both jobs and doing neither well.
A builder-manager builds to understand the system and validate AI output. They are never on the critical path. The team never blocks on their code. Their building time exists to maintain technical credibility, evaluate emerging tools, and create leverage for the team.
Here is what the builder-manager builds with their own hands:
Internal tooling and prototypes. MCP servers that connect the team's ticketing system to AI assistants. RAG pipeline evaluations. AI workflow tools. Things that make the team faster, not features on the roadmap. I built production MCP servers and took a RAG pipeline from 58% to 91% retrieval precision. Neither was a sprint feature. Both made the team dramatically more capable.
Proof-of-concept evaluations. When someone says "we should adopt this new AI tool" or "this framework could replace our current approach," the builder-manager evaluates it firsthand before asking the team to invest. You cannot evaluate what you have not used.
Architecture context files. CLAUDE.md files, .cursorrules configurations, architecture decision records. This is the context layer that makes AI agents effective for the entire team. I wrote about the system I built to auto-generate architecture documentation from code repositories -- a dual-indexing strategy that keeps docs current as the codebase evolves. This is builder-manager work.
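To make this concrete, here is a minimal sketch of what such a context file can look like. Every path and convention below is illustrative, not taken from a real repository:

```markdown
# CLAUDE.md -- agent context for this repo (illustrative sketch)

## Architecture
- Business logic lives in `src/services/`; HTTP handlers stay thin.
- All database access goes through `src/repositories/` -- never call the
  ORM directly from a handler.

## Conventions
- Every behavior change ships with a regression test under `tests/regression/`.
- One concern per PR: keep refactors separate from feature work.

## Do not
- Touch `src/payments/` without a human reviewer from the payments team.
- Delete or weaken existing tests to make a change pass.
```

A file like this costs an hour to write and pays off on every agent session the whole team runs -- exactly the leverage profile builder-manager work should have.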
Code review with deep context. Not rubber-stamping PRs but evaluating architectural impact, especially for AI-generated code. This is where the builder-manager's hands-on experience directly improves team output quality.
Here is what the builder-manager never builds:
Features on the critical path. If the team is waiting on your PR, you have failed as a manager. Full stop.
Production hotfixes under pressure. Your job is to ensure the team can handle this, not to be the hero.
Code that only you understand. If your code creates a knowledge silo, you have replicated the worst part of being an IC while keeping all the overhead of management.
The distinction between these two lists is the entire difference between the builder-manager and the player-coach. The player-coach is on the sprint board. The builder-manager is building the infrastructure that makes the sprint board move faster.
Some will argue this is just semantics -- you are still coding, so what is the difference? The difference is organizational dependency. When a player-coach goes on vacation, sprints slow down. When a builder-manager goes on vacation, the tools they built keep working. The player-coach creates a dependency on themselves. The builder-manager creates capability that persists independent of their presence. One is leverage. The other is a bottleneck wearing a management hat.
This blog is itself evidence of the model. Seven deeply technical posts while holding an EM title -- on MCP servers, RAG pipelines, AI agent guardrails, architecture documentation, scalable systems, and developer AI workflows. None of these are sprint features. All of them made my teams more effective. The blog is the builder-manager model in practice.
| Dimension | Pure Manager | Player-Coach | Builder-Manager |
|---|---|---|---|
| Codes? | No | Yes (features) | Yes (tools/infra) |
| On critical path? | Never | Often | Never |
| Can evaluate AI output? | Poorly | Well (but time-starved) | Well (by design) |
| Team blocks on them? | No | Yes | No |
| Survives flattening? | No (replaced by AI coordination) | Maybe (if team is small) | Yes (unique value) |
| Time coding | 0% | 40-60% | 10-20% |
| What they build | Nothing | Roadmap features | Internal tools, governance, POCs |
The 20/30/50 Rule
I am not going to tell you to "code more" without telling you how much and at the expense of what. Here is the concrete time allocation framework I use:
- 20% Building -- Hands-on technical work: prototypes, internal tooling, POCs, deep code review
- 30% Enabling -- Architecture decisions, technical strategy, AI governance, team skill development
- 50% Leading -- 1:1s, hiring, stakeholder management, cross-team coordination, career growth
The old advice was that EMs should code 30% of their time. That advice was for writing features. The builder-manager codes 20% but on the right things. An MCP server that saves each engineer 2 hours per week across a 10-person team is 20 hours per week of leverage. That is more impact than any feature you could ship yourself.
The Pragmatic Engineer's 2026 AI Tooling survey revealed a gap that should alarm every engineering manager: 63.5% of Staff+ engineers use AI agents, while only 46.1% of engineering managers do. The people approving and governing AI-generated code are less experienced with the tools than the people generating it. That is like a building inspector who has never used a level.
The building time is how you close that gap. You cannot evaluate what you do not understand.
Jim Grey says coding steals from leadership time. He is right -- if you are coding features. He is wrong -- if your building time directly improves the team's capability. Building an MCP server is not the same as picking up a Jira ticket. One multiplies the team. The other makes you a bottleneck.
This ratio is not fixed. It adjusts by team size:
| Team Size | Build | Enable | Lead |
|---|---|---|---|
| 3-5 reports | 30% | 25% | 45% |
| 6-10 reports | 20% | 30% | 50% |
| 11-15 reports | 10% | 30% | 60% |
| 15+ reports | 5% | 35% | 60% |
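The table converts directly into protected calendar hours. A quick sketch, assuming a standard working week; the tier boundaries are the ones from the table, and they are a starting point to adjust, not a rule to enforce:

```python
def time_allocation(team_size: int) -> dict[str, float]:
    """Return build/enable/lead fractions for a given number of reports.

    Tier boundaries mirror the table above.
    """
    if team_size <= 5:
        return {"build": 0.30, "enable": 0.25, "lead": 0.45}
    if team_size <= 10:
        return {"build": 0.20, "enable": 0.30, "lead": 0.50}
    if team_size <= 15:
        return {"build": 0.10, "enable": 0.30, "lead": 0.60}
    return {"build": 0.05, "enable": 0.35, "lead": 0.60}


def weekly_hours(team_size: int, hours_per_week: float = 40.0) -> dict[str, float]:
    """Convert the fractions into calendar hours you can actually block off."""
    return {category: round(fraction * hours_per_week, 1)
            for category, fraction in time_allocation(team_size).items()}
```

For an EM with 8 reports and a 40-hour week, that is 8 hours of building time: one protected morning block per day.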
At 3-5 reports, you are closer to the code. You have fewer leadership demands. You can afford 30% building time. At 15+, you are a director. You build 5% -- enough to stay current, run a POC, review an architectural pattern. But that 5% is the difference between leading with context and leading with slide decks.
Your New Job: Chief AI Quality Officer
Amazon's mandate -- senior sign-off for all AI-generated code -- is the canary in the coal mine. Every engineering organization will need an AI code governance policy by end of 2026. The EU AI Act compliance deadline in August 2026 makes this a legal requirement in regulated industries, not just a best practice. Gartner predicts that by 2027, 70% of software engineering leader roles will require GenAI oversight.
The EM who understands the code is uniquely positioned to own this. Here is what that looks like in practice:
Define risk tiers. Which systems need human review for every AI change -- payments, authentication, data pipelines -- versus which can have lighter oversight -- internal tools, admin panels, documentation. Not all code carries the same blast radius. Your governance should reflect that.
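A risk-tier policy is small enough to live in code next to the CI config. A sketch with hypothetical path prefixes standing in for your actual systems:

```python
# Map path prefixes to review tiers. The prefixes are illustrative --
# substitute the systems with the largest blast radius in your codebase.
RISK_TIERS = {
    "src/payments/": "human-review-required",
    "src/auth/": "human-review-required",
    "pipelines/": "human-review-required",
    "tools/internal/": "light-review",
    "admin/": "light-review",
    "docs/": "light-review",
}

def review_tier(changed_paths: list[str]) -> str:
    """Return the review tier for a change set: the strictest tier wins."""
    for path in changed_paths:
        tier = next(
            (t for prefix, t in RISK_TIERS.items() if path.startswith(prefix)),
            "human-review-required",  # unknown paths fail closed, not open
        )
        if tier == "human-review-required":
            return "human-review-required"
    return "light-review"
```

The key design choice is the default: a path nobody classified has unknown blast radius, so it gets full review until someone makes a deliberate call.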
Track AI code quality metrics. I documented this in my post on AI agents breaking codebases: we started tagging AI-generated PRs and measuring bug rates. The result was 2.3x higher follow-up fix rates within two weeks for agent-generated code. That number informed how much extra review time we allocated. If you do not track this, you are governing blind.
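The measurement itself is simple once PRs are tagged. A sketch, assuming you can export merged PRs with an AI flag and the merge times of any linked follow-up fixes; the `PullRequest` shape is hypothetical, not a real tracker API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class PullRequest:
    merged_at: datetime
    ai_generated: bool
    # Merge times of PRs that fix defects introduced by this one.
    followup_fixes: list[datetime] = field(default_factory=list)

def followup_fix_rate(prs: list[PullRequest], ai: bool,
                      window: timedelta = timedelta(days=14)) -> float:
    """Fraction of PRs in a cohort that needed a fix within the window."""
    cohort = [p for p in prs if p.ai_generated == ai]
    if not cohort:
        return 0.0
    needed_fix = sum(
        any(fix - p.merged_at <= window for fix in p.followup_fixes)
        for p in cohort)
    return needed_fix / len(cohort)

def fix_rate_ratio(prs: list[PullRequest]) -> float:
    """How many times more often AI PRs need follow-up fixes than human PRs."""
    human = followup_fix_rate(prs, ai=False)
    return followup_fix_rate(prs, ai=True) / human if human else float("inf")
```

Run it monthly and watch the trend, not just the snapshot: the ratio is what tells you whether your guardrails are working.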
Build the guardrails. CLAUDE.md files, architectural linting, regression test requirements, scope limits for agent PRs. I laid out the specific guardrails in detail -- never let agents self-merge, require regression test suites (not just unit tests), scope agent work to single-concern changes, and run architectural linting to catch drift. These are not theoretical recommendations. They are what I implemented after watching agents silently corrupt codebases for months.
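Two of those guardrails -- the single-concern scope limit and the regression-test requirement -- can be sketched as a pre-merge check. The thresholds and directory layout here are assumptions to adapt, not a standard:

```python
def check_agent_pr(changed_paths: list[str],
                   max_files: int = 10,
                   max_top_level_dirs: int = 1) -> list[str]:
    """Return guardrail violations for an agent-generated PR.

    An empty list means the PR passes. The top-level-directory count
    approximates the "single concern" rule: an agent change that sprawls
    across several top-level directories is probably doing more than one
    thing. Tests and docs are excluded from that count.
    """
    violations = []
    if len(changed_paths) > max_files:
        violations.append(f"touches {len(changed_paths)} files (limit {max_files})")
    top_dirs = ({p.split("/")[0] for p in changed_paths if "/" in p}
                - {"tests", "docs"})
    if len(top_dirs) > max_top_level_dirs:
        violations.append(f"spans {len(top_dirs)} top-level dirs: {sorted(top_dirs)}")
    touches_src = any(p.startswith("src/") for p in changed_paths)
    has_regression_test = any(p.startswith("tests/regression/") for p in changed_paths)
    if touches_src and not has_regression_test:
        violations.append("changes src/ without adding a regression test")
    return violations
```

Wired into CI as a required status check, this is the mechanical half of "never let agents self-merge"; the other half is a human reading the diff.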
Train the team. The Stack Overflow survey shows 80% of developers use AI tools, but only 29% trust their accuracy. 59% of developers use AI code they do not fully understand. The EM's job is to close that gap -- not by banning AI, but by establishing standards for how it is used. I wrote about this discipline in my post on structured AI usage for developers. The principles apply at the team level, and the EM is the one who sets the standard.
The Jellyfish 2025 State of Engineering Management Report found that 90% of engineering teams now use AI coding tools, up from 61% the prior year. 81% of developers expect significant development to shift from humans to AI. This shift is happening with or without governance. The question is whether someone with technical understanding is steering it.
LeadDev's "uncomfortable predictions for 2026" piece nails the meta-problem: EMs are "damned if they do use AI, damned if they don't." They are pressured to show AI impact while questioned if they ignore it. The builder-manager resolves this tension by being the person who governs AI usage -- not from a policy deck, but from hands-on experience with the tools and their failure modes.
If your entire value as an EM is coordination, status reporting, and stakeholder updates -- and AI can do all of that -- then you do not have a job. You have a title. The AI governance role is how you build a value proposition that no AI and no senior IC can replicate: the intersection of technical depth, organizational context, and leadership authority.
Why Your Team Will Not Follow a Manager Who Cannot Read the Code
Here is the uncomfortable truth that the "EMs should never code" camp avoids.
The Stack Overflow data shows 46% of developers actively distrust AI output. Only 3% "highly trust" it. 75% say the number one reason to ask a human is "when I don't trust AI's answers." And experienced developers -- the ones you most need to retain -- are the most cautious. They have the lowest trust rate (2.6% "highly trust") and the highest distrust (20% "highly distrust").
Your senior engineers do not need a cheerleader. They need a peer who can evaluate their work and the AI's work with equal rigor. If the EM cannot evaluate AI output, they become a rubber stamp. The team knows it. Trust erodes.
Think of it this way. A hospital administrator who has never practiced medicine can manage budgets and schedules. But when there is a disagreement about a surgical procedure, the team follows the chief of surgery who still operates occasionally -- not the administrator who reads reports about surgery. Credibility comes from demonstrated competence, not positional authority.
This is not about technical ego or proving you can still code. It is about credibility. The builder-manager earns the right to make architectural calls because they have recent, hands-on evidence of what works. When they say "this AI-generated migration plan is dangerous," the team trusts it because they have seen the EM catch real issues in code, not just wave at dashboards.
Most EMs stopped coding out of comfort, not strategy. Meetings are easier than debugging. 1:1s feel productive even when they are not. The calendar fills itself. Coding requires uninterrupted blocks of time that EMs are too undisciplined to protect. And now that discipline gap is becoming a credibility gap.
There is a counterargument here that deserves direct engagement: "Good EMs scale through people, not through code." This is true. And the builder-manager scales through people by building tools that make those people more effective. The MCP server from the earlier example returns 20 hours of engineering time per week. A CLAUDE.md file that prevents agent-generated regressions across 50 PRs per sprint is hundreds of hours of debugging avoided. That is scaling through people. The code is not the point. The capability it creates for the team is the point.
Boris Cherny, the creator of Anthropic's Claude Code, told the SF Standard: "We're going to start to see the title of 'software engineer' go away. It's just going to be 'builder' or 'product manager.'" If ICs become builders, the EM becomes the builder-leader. And the word "builder" implies you actually build something.
Gregor Ojstersek's analysis confirms that companies still desperately need strong EMs -- $350K-$500K total compensation. But the EMs commanding that compensation are not the ones who manage through Jira. They are the ones who combine technical depth with organizational leverage. The builder-managers.
Adam Ferrari's piece on "Will the Great Flattening Eliminate Engineering Management?" argues that flattening will hit an equilibrium. When teams reach 10:1 IC-to-EM ratios, quality degrades. The EM role survives. But only for those who provide value beyond coordination.
If you have not used Claude Code or Cursor on your own codebase in the last 30 days, you are not qualified to review AI-generated pull requests from your team. That is not a provocation. It is a statement about the minimum competence required to evaluate work your team produces daily with tools you have never touched.
Start Monday: The Builder-Manager Playbook
Here is what to do this week. Not next quarter. This week.
1. Audit your last month. Open your calendar. Categorize every block: building, enabling, or leading. If building is at 0%, you are exposed. If it is above 40%, you are a player-coach and your team is likely suffering for it. Find the number. Know where you stand.
2. Pick one internal tool to build. Not a feature. A tool. An MCP server that connects your team's ticketing system to Claude. A script that auto-generates architecture docs. A CLAUDE.md file for your main repo that gives AI agents the context they need to make useful contributions instead of confident mistakes. Start small. Ship it in a week. See how it changes your team's workflow.
3. Implement AI code governance. Tag AI-generated PRs. Run full regression suites on them. Measure bug rates over 30 days. My data showed 2.3x higher follow-up fix rates for agent-generated code -- know your number. If you do not measure it, you cannot manage it, and you definitely cannot defend your team's quality standards to the VP who just read about "10x AI productivity" on LinkedIn.
4. Close the agent gap. If your ICs use AI agents more than you do -- 63.5% of Staff+ engineers vs 46.1% of EMs -- you cannot evaluate their work. Spend two hours this week using Claude Code on your own codebase. Not reading about it. Using it. Build a mental model of where it excels and where it fails. That firsthand experience is what separates an informed leader from a rubber stamp.
5. Document one architecture decision. Write an ADR for the most recent significant technical choice your team made. This is the context layer that makes both humans and AI agents more effective. I wrote about ADRs in the context of building scalable systems -- the same discipline applies here. An ADR takes 30 minutes to write and saves weeks of confused agent output and misaligned team discussions.
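For step 1, a spreadsheet works, but if your calendar exports event titles and durations, a sketch like this gets you the number in minutes. The keyword buckets are assumptions -- tune them to how your meetings are actually named:

```python
from collections import Counter

# Keyword buckets are illustrative; adjust to your own calendar vocabulary.
CATEGORIES = {
    "building": ["prototype", "poc", "code review", "pairing", "spike"],
    "enabling": ["architecture", "design review", "adr", "governance", "training"],
    "leading":  ["1:1", "standup", "hiring", "stakeholder", "planning", "skip-level"],
}

def categorize(event_title: str) -> str:
    """Bucket a calendar event by keywords in its title."""
    title = event_title.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword in title for keyword in keywords):
            return category
    return "leading"  # default: uncategorized time is almost never building

def audit(events: list[tuple[str, float]]) -> dict[str, float]:
    """events: (title, duration_hours) pairs. Returns percentage per category."""
    hours = Counter()
    for title, duration in events:
        hours[categorize(title)] += duration
    total = sum(hours.values()) or 1.0
    return {c: round(100 * hours.get(c, 0.0) / total, 1) for c in CATEGORIES}
```

Whatever number comes out, write it down. The point of the audit is not precision; it is having a baseline you can move deliberately instead of a calendar that fills itself.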
The builder-manager is not a new idea. It is how the best engineering managers have always worked. The ones who built internal tools before "internal tooling" was a discipline. The ones who reviewed code not because it was assigned to them but because they wanted to understand what their team was building. The ones who prototyped approaches before asking their team to commit to them.
AI just made it impossible to fake.
In two years, "do you still code?" will not be an interview question for engineering managers. It will be a job requirement. Because the alternative is an IC with better tools and lower overhead who does not need a manager who cannot understand what they shipped.
The "I trust my team" EM is the new "I trust my tests" developer. Both are abdicating the responsibility to verify. Trust without the competence to verify is not trust. It is delegation of judgment to people who did not ask for it.
The engineering manager who does not code is being replaced by AI. The one who only codes is being replaced by a senior IC who costs less. The one who builds the right things -- tools, governance, context, trust -- is the one the team and the organization cannot afford to lose.