Are We Becoming QA for the Machine?
I was calmly yelling at my CLI because the button wasn’t the same green as stated in CODE_STANDARDS.md when I realized I wasn’t building an app. I had spent the last few hours only testing it.
I recently built my first real application almost entirely with coding agents. The app worked, which kind of surprised me. I mean, I didn’t write any code, and sure, I knew what to say, but still. And it felt great, and productive, and fast. But somewhere along the way, I stopped being the software engineer who builds, and became what felt like a glorified tester.
Actually, to be honest, it was even worse. At various points during the build I said things like “I don’t know what I should add, what do you think, Opus?” He told me, he did it, I tested it, and I said “oh great, what’s next?” Rinse, repeat, descend into madness.
So yeah, that’s probably my fault. But who hasn’t done this? Or who hasn’t asked ChatGPT for the next best SaaS idea, the one it hasn’t told anybody else about yet? And then started building it?
I’ve since written more non-trivial apps with coding agents, and their limitations are becoming quite clear. Letting them steer the process is, in most cases, a big mistake. But they are reaching the level where they are genuinely good at coming up with ideas. I’m a bit afraid of the moment they get just that bit smarter than me.
Don’t get me wrong, it’s fun seeing your idea come alive this quickly. But that’s all you’re doing; you’re not engineering it, you’re testing if it does what you think it should do. Which nobody bothered to write down. And when someone did, it was the AI.
Nobody has ever been able to properly spec a non-trivial piece of software, so I don’t think that’s going to change anytime soon. If agentic coding tools plateau here, I think this is what we become: we take whatever thought bubble we manage to capture from somewhere in the organization, refine it, put it in an agent, and keep iterating until we think it’s fine. Then we release it to prod, get feedback that it’s completely wrong, and iterate further.
I guess that sounds familiar.
I found myself exhausted after these sessions, more than after a day of regular coding. It’s probably me trying to keep up with the machine. I’m not used to being the bottleneck in software development, but now I get to open 3 terminals that all work on something different in the same codebase, plus 9 more for 3 other projects. I try to match the output speed with multitasking and context switching, while planning for conflicts between them. It’s like air traffic control, but now all the passenger planes are F-16s.
Or maybe it’s the constant abstraction. You have to continuously discuss the concept, dig up what something is supposed to do, articulate what you’d like to see, instead of taking the time to make a part of it happen. There’s a specific kind of fatigue that comes from translating intent all day without ever touching the material yourself.
It’s addictive though, isn’t it? Once you start, it’s hard to go back to the old way of working, without a personable companion that is always around. And I think that’s a problem.
An exception in this codebase? Let’s just paste it into Codex. A bug report? Let’s ask Claude. Claude says something I’m not sure about? Let’s ask Gemini what it thinks.
It doesn’t feel healthy. The feedback loop is so tight that you never have to sit with a problem long enough to actually understand it. You just keep throwing it back at the machine and picking out whatever sounds best. Again, probably my fault, but they make it so easy and attractive.
And the agents keep getting more capable. But when I step back and look at my own workflow, I also wonder: where does this end? What am I actually getting better at?
The job used to be building the thing, crafting it from minute details up to something that was more than the sum of its parts. Like a carpenter building a house beam by beam, fitting every joint, feeling where the wood resists. Now it’s more like snapping together prefab walls; the house goes up faster, but you never really held the wood.
It isn’t completely different, but it isn’t the same. It’s fun but it isn’t the same. I think I’ll miss it.
Let me ask ChatGPT how to feel about this.