Reproducing the Bug Before Fixing It: Letting AI Chase the Cause While You Hold the Line on Proof

Something in your app is broken. A user tells you the thing they bought never showed up in their order history, or you click a button yourself and nothing happens, or an error flashes across the screen and disappears. Say you’re building a little tool that lets book clubs track what everyone’s reading. You take the error, or the description of what went wrong, and you paste it into your AI assistant and ask it to fix the bug. It responds fast and sure of itself: ah, the problem is here, this variable can be null, here’s the fix. You apply it. The code runs. Nothing’s on fire anymore. You move on, feeling like you handled it.

And maybe you did. Or maybe you just changed some code that had nothing to do with the actual problem, and the bug is still sitting there waiting, because the thing AI “fixed” was never the thing that was broken. You don’t know which of those happened, and that’s the whole trouble. You never saw the bug happen on purpose, so you have no way of knowing whether it’s gone. The error went away, but errors go away for all sorts of reasons, including the ones where you’ve quietly made things worse. The careless version of this job isn’t asking AI to fix a bug. It’s accepting a fix you can’t prove fixed anything, then calling the bug closed because the screaming stopped.

What this job actually is

Fixing a bug is two jobs that get crushed into one. The first job is investigation and repair: figure out what’s going wrong, form a theory about why, and write a change that should make it stop. AI is genuinely good at this. It can read a stack trace, trace a value back through your code, recognize a familiar failure pattern, and produce a plausible fix in seconds. Handing AI the hunt is smart, because it’s faster and more patient at tracing through code than you’ll be at 11pm.

The second job is proof: making the bug happen reliably, on demand, before you touch anything, so that when you apply the fix you can watch the bug disappear and know the fix is what disappeared it. That’s not investigation. It’s verification, and it’s a discipline more than a skill. A reproduction is a little experiment you set up so reality, not AI’s confidence, gets the final say on whether the bug is dead.

Here’s the distinction that matters: AI can generate the theory and the fix, but insisting on a reproduction first is yours. A fix that runs is not a fix that works. The whole value of debugging well comes from knowing, with certainty, that the specific thing that was broken is now not broken, and you only get that certainty by seeing the bug happen and then seeing it stop. AI can’t give you that, because AI doesn’t actually know whether its fix worked. It knows the code now runs without erroring, which is a different and much weaker claim. It will hand you a confident fix for a bug it never once watched happen, and it will sound exactly as sure whether it’s right or wrong.

How to delegate the hunt

So lean on AI for the part it’s good at, which is the investigation, but put a reproduction in front of the fix as a condition, not an afterthought. The move that makes this delegation work is reordering the steps. Before you let AI fix anything, make it help you reliably trigger the thing first.

Give AI exactly what you observed: what you did, what you expected, what happened instead, and the full text of any error. Then ask it to help you build the smallest, most reliable way to make the bug happen again. Ask what inputs, what sequence, what conditions reproduce it. Ask it to write a quick test or a short script that triggers the failure on command if that fits your app. The goal of this first pass is not a fix at all; it’s a reliable way to summon the bug whenever you want it. Once you can do that, then you let AI investigate the cause and propose the change.
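To make "a short script that triggers the failure on command" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Shelf class, the mark_finished method, and above all the planted cause (a capitalization mismatch) are invented stand-ins, not how any real book club app works. The shape is what matters: one function that performs the exact steps and returns True when the bug happens.

```python
# Hypothetical stand-in for the book club app. The real app will look different;
# this class exists only so the reproduction has something to run against.
class Shelf:
    def __init__(self):
        self.reading = []   # titles the member is currently reading
        self.finished = []  # titles the member has finished

    def start(self, title):
        self.reading.append(title)

    def mark_finished(self, title):
        # Planted bug (an assumption for this sketch): exact string match,
        # so a title typed with different capitalization is silently ignored.
        if title in self.reading:
            self.reading.remove(title)
            self.finished.append(title)


def reproduce_missing_finished_book():
    """Return True when the bug happens: a finished book is missing from the shelf."""
    shelf = Shelf()
    shelf.start("Middlemarch")
    shelf.mark_finished("middlemarch")  # same book, different capitalization
    return "Middlemarch" not in shelf.finished


if __name__ == "__main__":
    print("bug reproduced:", reproduce_missing_finished_book())
```

Against the broken code this function should return True every time you run it. That repeatability is the point: the same call becomes the yardstick for any fix that comes later.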

What you don’t do is ask “what’s the fix?” as your opening question. That’s the move that skips the proof entirely, because it sends AI straight to a confident answer with nothing to test it against. The careless prompt produces a fix floating in midair, attached to nothing you can check. Keep the first ask on reproduction. Help me make this break reliably, then help me understand why, and only then show me the change. You’re asking AI to hand you a bug you can trigger at will, because a bug you can trigger is a bug you can prove you killed.

The judgment you keep

Refusing to accept a fix you haven’t verified against a reproduction is the call, and it’s yours because it turns on something AI structurally cannot do: confirm, from the outside, that the broken thing now works.

This is hard because AI’s confidence is genuinely persuasive, and most of the time the relief of a quiet screen is enough to make you stop looking. When the error message is gone, every instinct says you’re done. The judgment is in not trusting that quiet, in holding the line that a fix isn’t real until you’ve watched the exact failure happen and then watched the same steps stop producing it. For the book club app, if the bug is “a member’s finished book doesn’t show up on their shelf,” then proof means making a finished book fail to show up, applying the fix, and making that same book show up, with the same member and the same steps. Anything short of that is hope.

AI can’t make this call because it can’t see your running app the way you can. It can reason about the code, but it can’t click your button, take your user’s path, or watch your screen. It has no way to know whether the bug it theorized about is the bug your user actually hit, or whether its fix touched the real cause or just something nearby that made the symptom go quiet. Get this wrong and you collect invisible debt: bugs you believe are fixed, sitting in your app, waiting to surface in front of a real user at the worst possible time, except now they’re harder to find because you “already fixed” them. The reproduction is the one thing standing between a bug you killed and a bug you only annoyed.

Before you ship this job

Here’s what good delegation looks like, and the line it can’t cross.

The sample prompt. Something real you might send:

I’m building ShelfShare, an app where book clubs track what each member is reading. My main user is someone like Renata, who runs a twelve-person club and wants everyone’s current and finished books visible on a shared page. Here’s the bug: when a member marks a book as finished, it sometimes doesn’t appear in their “finished” list, but I’m not sure when it happens or why. Before you suggest any fix, help me reproduce this reliably. Ask me what details you need, and walk me through the inputs, the sequence, and the conditions that would make this failure happen every time, so I can trigger it on demand. If it helps, write a small test that makes the failure occur. Once we can reproduce it consistently, then help me find the cause and propose a fix. I don’t want a fix until I can watch the bug happen first.

Use this and you get a bug you can summon, which is a bug you can prove you’ve beaten. Copy it as-is and you’re reproducing Renata’s failure instead of your own, chasing a bug shaped like someone else’s app. ShelfShare’s broken thing is the finished-book list; yours is somewhere else entirely, and a reproduction only means anything when it’s built around the failure you actually saw.

The part you can’t hand off is the insistence on proof: refusing to call the bug fixed until you’ve reproduced the exact failure, applied the change, and watched that same failure stop happening. That verification is the decision, and it’s the thing the prompt above deliberately puts before the fix instead of after it.

How to check AI did its part: before you accept any fix, run the reproduction one more time on the broken code and confirm the bug still happens. Then apply the fix and run the exact same reproduction again. The bug should be gone. Now do the part people skip: revert the fix and confirm the bug comes back. If breaking and unbreaking the code makes the bug appear and disappear on cue, you’ve proven the fix is doing the work, and you can close the bug for real. If the bug is already gone before you apply the fix, or it stays gone after you revert, then something else changed and you don’t actually understand what you fixed, which means you haven’t fixed it yet.
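The three-step check above can be sketched as a small Python harness. The two mark_finished functions are hypothetical (an invented buggy version and an invented candidate fix), and "reverting" is modeled simply as running the broken version again; in a real project you would switch the actual code back, for example through version control.

```python
def mark_finished_broken(reading, finished, title):
    # Hypothetical buggy version: exact match misses different capitalization.
    if title in reading:
        reading.remove(title)
        finished.append(title)


def mark_finished_fixed(reading, finished, title):
    # Hypothetical candidate fix: compare titles case-insensitively.
    for current in reading:
        if current.lower() == title.lower():
            reading.remove(current)
            finished.append(current)
            return


def bug_happens(mark_finished):
    """Run the same reproduction steps and report whether the book goes missing."""
    reading, finished = ["Middlemarch"], []
    mark_finished(reading, finished, "middlemarch")
    return "Middlemarch" not in finished


def fix_is_proven(broken, fixed):
    before = bug_happens(broken)    # step 1: the bug still happens on broken code
    after = bug_happens(fixed)      # step 2: the same steps no longer fail
    reverted = bug_happens(broken)  # step 3: undo the fix and the bug returns
    return before and not after and reverted


if __name__ == "__main__":
    print("fix proven:", fix_is_proven(mark_finished_broken, mark_finished_fixed))
```

If the bug is already gone before the fix goes in, the first check fails and fix_is_proven returns False. That is the case where something else changed and you do not yet understand what you fixed.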

What you get for doing it this way

Go back to that quiet screen and the relief of the error going away. The difference between accepting a fix because the noise stopped and accepting it because you watched the bug die is the difference between a bug list you can trust and one quietly full of problems you only think are solved. When you let AI chase the cause and you hold the line on reproducing first, you stop shipping invisible debt, you stop re-discovering the same bug three weeks later in front of a user, and you build the habit of letting reality, not confidence, tell you when something’s done.

AI can find the cause and write the fix faster than you can. The one thing it can’t do is watch your app and tell you the bug is truly gone, because that takes seeing the failure happen and then not happen, and only you are sitting in front of the running thing. That was always going to be your call. That’s the job: let AI hunt the bug, but don’t let it tell you the bug is dead until you’ve watched it stop breathing.
