Claude Rides Shotgun

Almost everything I've written about Claude has Claude in the driver's seat. The agent picks up a task, runs it end to end, asks the occasional question, and the human stays mostly out of the way. But Claude can be just as useful in a support role.

The Haunted House

I've been working on a VR prototype of a haunted house that our church used to put on. We ran it for years until COVID shut it down. After that, we started talking about doing a virtual version: fully immersive, the kind of thing you'd experience in a headset. But it just never came together. So when I had a little money to throw at it, I outsourced the first pass to a network resource. They did a good job. Got me a starting point. But there was a lot of detail work left, and in VR, detail is the whole game. It's the difference between a tech demo and something that actually puts a knot in your stomach.

So I picked it up myself. There's a solid MCP interface for Unity, so I got the server running, attached Claude Code, and off we went.

I knew what I wanted. I didn't know how to do it.

There's a scene where the player is sitting in the back seat of a car as it rolls down a street. You're looking over the passenger's shoulder, watching houses go by. To get the sightlines right — so you can actually see what's coming — we had to lift the camera up a little. But lift it enough and the camera clips through the roof of the car. At certain angles, you're sitting in the back seat looking out the world's worst sunroof.

I had a thought. Put a cube around where the player's head is. If they look up, they see black. Looks like the inside of a car roof. Problem solved.

I told Claude. And, as Claude often does, it told me this was the greatest idea anyone had ever had. It would never have thought of something so clever on its own. Truly remarkable. (If you've used Claude for more than a week, you know the routine. I've written about this.)

So after the flattery session, it goes to work. First thing it does is make a cube bigger than the car itself. So we start going back and forth. I tell it that's way too big. It explains that a single cube won't render the way I need: by default its faces only draw from the outside, so when your head is inside it, you just see nothing. So I suggest building the box out of individual planes instead. Great idea, it says. Of course it does. It builds it, and this time we're closer.
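
For the curious, here is a rough sketch of why the inside of a cube comes up empty. Real-time renderers typically skip any face whose normal points away from the camera (back-face culling), and a cube's normals all point outward. The toy check below is plain Python, not Unity code; the function names, the cube size, and the camera positions are all made up for illustration.

```python
# Illustrative only: a toy back-face culling check, not Unity's renderer.

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def faces_camera(center, normal, camera):
    """True if the face's normal points toward the camera, so it gets drawn."""
    return dot(normal, sub(camera, center)) > 0

AXES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def cube_faces(half=0.5):
    """(center, outward normal) for each face of a cube centered at the origin."""
    return [(tuple(half * c for c in n), n) for n in AXES]

def inward_planes(half=0.5):
    """The same six faces rebuilt as separate planes with normals flipped inward."""
    return [(center, tuple(-c for c in n)) for center, n in cube_faces(half)]

def count_drawn(faces, camera):
    """How many faces survive culling from this camera position."""
    return sum(faces_camera(c, n, camera) for c, n in faces)

head = (0.0, 0.0, 0.0)      # hypothetical: the player's head inside the box
outside = (2.0, 2.0, 2.0)   # hypothetical: a viewer standing outside it

print(count_drawn(cube_faces(), outside))   # 3
print(count_drawn(cube_faces(), head))      # 0
print(count_drawn(inward_planes(), head))   # 6
```

From outside, three faces of the cube are drawn. From the head position, none are, which is the "you just see nothing" result. Flip the normals inward, which is roughly what a box built from inward-facing planes amounts to, and all six show up.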

Then we hit the fine-tuning, and the whole thing slowed to a crawl.

Claude can get screenshots. It can look at a frame and reason about what's in it. But it can't see the way I can see — in real time, with my head moving, with the spatial sense of being in the scene. The loop was too long. I'd ask for a tweak, wait, get a screenshot back, look at it, ask for another tweak, wait. Meanwhile I could see the problem instantly with my own eyes. What needed to happen was obvious to me. But the communication overhead of getting Claude to understand what I was seeing was murdering the iteration speed.

So I took the controls. I told Claude I was going to make the changes myself. And then I asked the question I'd end up asking about a hundred more times that day: how do I do this in Unity?

That's where it lit up. Not the do-everything mode. The teaching mode. Where do I find this menu? What's this property called? Why is it doing this when I do that? Claude has read more Unity documentation than any human ever will. I had the eye and the hands. It had the manual. We made better progress in twenty minutes than we'd made in the previous two hours.

Same Pattern, Different Tool

A few days later I needed a baby bassinet model. Couldn't find one that fit. I'd never built anything in Blender from scratch, but there's a good MCP server for Blender too. So I installed it, hooked it up, and said: make me a baby bassinet.

Claude went to work. It did the heavy lifting — the geometry, the rough proportions, getting something on the screen. And then the same thing happened. The fine-tuning was where the eye mattered. I could see in a second what was wrong. Telling Claude what was wrong, waiting for it to make the change, looking at the result, telling it again — too slow. So I took the controls. And again, the same question on a loop: how do I do this in Blender?

Two different tools. Two different domains. Same exact pattern. I knew what I wanted. I didn't know how to do it. Claude did.

The Expert on the Phone

Here's the thing that was different about this. In my normal agentic workflow, Claude does the work. I hand it a task, point it at the engineering standards, and let it go. It writes the code, runs the tests, handles the iteration. I'm mostly staying out of the way. That's the mode I write about the most, because that's where the results look like magic.

But the VR work wasn't that. Claude was still doing most of the heavy lifting — it built the cube, it built the planes, it generated the bassinet geometry. It was doing the work. Until it hit a wall. And the wall was the same both times: it couldn't see what I was seeing, not fast enough, not in real time, not the way you need to when you're adjusting visual details by feel. It could do the rough work, but the fine work needed eyes it didn't have.

So when I took over, I didn't become the expert. I became the hands. Claude was the expert who couldn't reach the controls. It's like having the best mechanic you've ever met on the phone while you're standing under the hood. You can see the problem. You can reach the part. But you don't know what it's called, you don't know which way to turn it, and you definitely don't know why the thing next to it is making that noise. The mechanic does. They can't see what you're looking at, but they've worked on a thousand of these, and when you describe what you're seeing, they know exactly what to tell you.

That's what Claude became in those sessions. The person who'd been there before and could talk me through it while I had my hands on the thing.

Knowing When to Switch

Everyone wants to know what AI can do. Can it write code? Can it build a 3D model? Can it edit video? Those are fine questions, but they're the wrong ones. The better question is: when should I let it do the work, and when should I do the work and let it guide me?

With code, Claude can usually own the whole thing. It's seen a million REST APIs. It knows what they look like, knows how to build one. Give it the task and get out of the way.

But with visual work — anything where spatial awareness matters, where the eye is faster than any screenshot, where "I'll know it when I see it" is the actual specification — that's where you step in. Claude knows how. It just can't see what you're seeing fast enough to iterate at the speed the work demands. So you take the controls, and you ask it how, and it tells you.

I started that afternoon with Claude doing the work in Unity and ended it with me doing the work in Unity. Same project, same session. We just found the spot where the work needed a different arrangement, and we shifted. Knowing when to let it run, and knowing when to pick up the wrench and say, okay, talk me through this — that's the skill.

I knew what I wanted. I didn't know how to do it. That used to be the most frustrating sentence in the business. With the right partner on the line, it might be the best one.

Kevin Phifer is the founder of Theoretically Impossible Solutions LLC, specializing in agentic AI development and consulting. You can reach him at kevin.phifer@theoreticallyimpossible.org.
