Three weeks ago, I was using GitHub Copilot the way most people do. I had VS Code open, the little Copilot chat window in the sidebar — the thing that looks like a frog — and I'd ask it to generate a function here, explain some code there. It was a helper. A decent one, but a helper.

Today, I have a fully orchestrated agentic development framework on its third major iteration. I've shipped multiple projects with it. One of them was a client engagement I expected to take a month — I learned the tooling and delivered the software in a week. And I'm not exaggerating when I say this has fundamentally changed how I think about building software.

This is the story of how that happened.

The Message That Started It

A buddy of mine named Mike pinged me on LinkedIn sometime around the beginning of February. Simple question: "What are you doing with AI development?"

My honest answer was: not much. I was using Copilot as a code completion tool, same as everyone else. I was hearing the same hype everyone was hearing — AI is going to take over the world, we're all out of jobs, the usual. And I was mostly tuning it out. I've been in this industry since '92, and I've heard "this changes everything" enough times to develop a healthy filter for it.

Mike, to his credit, has brought me life-changing things before. Years ago, he introduced me to Inbox Zero and Getting Things Done — personal productivity systems that genuinely transformed how I work. I taught seminars on GTD internally at companies. It was that impactful. So when Mike gets excited about something, I've learned to pay attention, even when my initial reaction is skepticism.

He started describing what he was doing with Claude Code — not as a code completion tool, but as a full development partner. He had built a whole series of skills that took a project from idea to completion. I was half-listening, half-thinking "here we go, another tangent." But there was something in what he described — the way he was interacting with the tool at every level and every phase of development — that piqued my curiosity enough for me to say, "Why don't you show me?"

He showed me. And I was amazed.

Version One: Mind-Blowing and Naive

I had a project — an email-based application, an MTA product — that I needed to build. I decided this would be my test case. Could this tooling actually help me build real software?

Some context: I've spent my career managing development teams, guiding organizations from waterfall to agile, running scrum implementations. I'm not just a coder — I know how software should be built. I knew I wanted a full CI/CD pipeline. I knew the architecture: Azure Container Apps, .NET backend, the whole stack. I had drawn out documentation of how the infrastructure would look. I had the entire product in my head.

So I uploaded everything to Claude. My architecture docs, my requirements, all of it. And Claude just… got it. It read the documentation, understood what I was laying down, helped me build out a backlog, and started writing code. Within a week, I was in testing mode with the software and had an entire CI/CD pipeline running inside GitHub, deploying to Azure.

It even helped me solve a hosting problem I hadn't anticipated. You can't open port 25 on Azure without a formal agreement with Microsoft, and this application needed to send mail as an MTA. Don't debate me on whether it needed to be that way — it did. So I asked Claude where else we could host this particular service, and it found Fly.io. Read the documentation, understood what was needed, and helped me design just that one service to run over there while everything else stayed in Azure.

Mind-blowing? Absolutely. But also naive, because I made every beginner mistake in the book.

The Lessons That Hurt

The first hard lesson was about assumptions. Claude is incredibly intuitive — it picks up on what you're putting down and runs with it. But "runs with it" can mean "makes architectural decisions you never explicitly discussed." When it came time to move a service to Fly.io, Claude casually mentioned it would need access to both the API and the database. Wait — why is this service talking to the database directly? Why isn't it going through the API?

Because I'd never told it my philosophy. In thirty years of building software, I've always maintained that services talk to APIs, not databases. If you think direct database access is a good idea in a particular case, fine — we can have that conversation. But don't just do it. I had assumed Claude would infer this from the architecture, and it hadn't. It did a fantastic job with what it knew, but I hadn't been explicit about what mattered.

The second lesson was about letting work accumulate without verification. I let Claude build the entire project — API, client, services, unit tests, integration tests — before I ever ran it. In the real world, I would never do that. We follow vertical slicing. We deliver value early and iterate. But I was learning, so I let the machine do its thing.

When I finally stopped it before the UI tests and said "let me actually see this thing run," I discovered what everyone discovers: there's a gap between code that compiles and code that works. The unit tests passed because they were testing against fakes. The integration tests ran, but they were faking infrastructure. The real application had issues that only showed up when you ran it for real.

And then I discovered what I've come to think of as the most important concept in agentic development: drift.

Drift Is Real

As Claude worked through story after story in the same session, it started forgetting things I'd told it. Not maliciously — it manages its own context window, and over time, it has to decide what's less important and let it go. Things I considered non-negotiable — architectural patterns, testing standards, naming conventions — would quietly disappear from its working memory.

I'd notice it cutting corners on something I'd been very clear about and ask why. The answer was always some variation of "my bad, you're right, I should have done that." But the damage was cumulative. It also built what I call "bridges to nowhere" — code that was clearly headed somewhere but never arrived. I found an API and a client but no services. Where were the services? Claude had written all the code to do the work but never actually built the deployable service to run it. Just… forgot to finish.

The other persistent annoyance: it kept trying to give itself co-author credit on my GitHub commits. I'd tell it to stop, it would stop for a while, and then drift would kick in and it would start doing it again. Small thing, but symptomatic of the larger problem.

After that experience, I had a conversation with Claude about what was happening. What I came to understand is that when you keep the same session running through an entire project, the context management becomes the bottleneck. The AI has to make choices about what to keep and what to drop, and it doesn't always prioritize the same things you do.

The Breakthrough: Fresh Context Per Story

This is where things changed. I moved to an orchestration model — a supervisor that manages the overall project, and individual worker agents that handle one story at a time. Each worker starts fresh: it reads all my engineering standards, understands the current state of the codebase, implements the story, and then it's done. The next story gets a new worker with a clean context.

Drift still happens within a single story — it's unavoidable — but the damage is contained. Every new story resets the clock. The standards get re-read. The patterns get re-established. The non-negotiables are non-negotiable again.
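The supervisor/worker idea can be sketched in a few lines. This is a hypothetical illustration of the pattern, not Claude Code's actual API: the `Worker` class, `implement` method, and string results are all stand-ins, and the only point is that each story gets a fresh, disposable context that begins by re-reading the standards.

```python
# Sketch of the supervisor/worker pattern (illustrative, not a real agent API).
# The supervisor owns the backlog; each story gets a brand-new worker whose
# context starts from the engineering standards, so drift from one story
# cannot carry into the next.
from dataclasses import dataclass, field


@dataclass
class Worker:
    """One agent session with its own disposable context window."""
    context: list = field(default_factory=list)

    def implement(self, standards: str, story: str) -> str:
        # The non-negotiables are re-read at the start of every story.
        self.context = [standards, story]
        return f"done: {story}"


def run_project(standards: str, backlog: list) -> list:
    results = []
    for story in backlog:
        worker = Worker()  # clean context for each unit of work
        results.append(worker.implement(standards, story))
        # worker is discarded here; the next story starts fresh
    return results


print(run_project("services talk to APIs, not databases",
                  ["story-1: login", "story-2: upload"]))
```

The design choice the sketch captures: drift is not prevented, it is contained — the reset happens at the story boundary, so the blast radius of any forgotten standard is one story, not the whole project.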

This was the single biggest insight from the entire journey: the architecture of how you interact with the AI matters as much as the AI itself.

Proving It: The Oil Filter App

Right around this time, I was visiting my best friend Kevin. I'd been evangelizing this new way of working, and he was curious but skeptical — the way I'd been skeptical a few weeks earlier. We were at Walmart getting my truck's oil changed, standing in front of the wall of oil filters — nothing but a sea of numbers and little boxes — and one of us said, "What we need is an app where you take a picture and it finds the right one."

We laughed. Then I said, "Let's build it."

Six o'clock that evening, after dinner, we sat down and hit record. We just talked — puked out every idea, every feature, every rabbit hole. What I call the discovery phase. Within an hour, we had fed that transcript to Claude and produced a clean PRD. Another hour to break that into stories and get Claude Code started on implementation.

By about ten o'clock that night, I was lying in bed monitoring progress on my phone. Claude was finishing up the development, but I wasn't about to get up and start testing. That could wait until morning.

By noon the next day, we had a fully functioning product. You could log in, take a picture with your phone, say "I'm looking for this," and the app would search for it, find it, and draw a box around it on the screen. From "what if" to working software in about 18 hours, most of which I was sleeping.

Kevin's jaw was on the floor. So was mine, a little bit. Not because the technology was magic, but because the process worked. Record the conversation. Synthesize the PRD. Break it into stories. Let the machine build. Debug together in the morning. Ship.

The Real Test: Rebuilding Years of Work

Okay, so a throwaway app built in a night is impressive but not conclusive. The real test was my data extraction project — a system I'd been building on and off for four or five years. It used Tesseract for OCR, could identify pages of forms from uploaded PDFs, extract data based on configuration, handle unknown pages by queuing them for human review, and assemble everything into structured JSON. Serious software, with probably five man-months of actual development time invested across those years.

I decided to rebuild it from scratch using the agentic approach. Not port it — rebuild it, with even more features than the original.

I woke up on a Sunday morning, dictated everything the system did and everything I wanted it to do, fed that to Claude Web for PRD refinement, handed it to Claude Code, and let it run. I had set up a Linux VM in Azure specifically for this — a clean machine with minimal tools, running in what I affectionately call "YOLO mode" (full autonomous permissions, because you don't want to give that kind of access to a machine that has your credit cards and passwords saved). If Claude had a question, it would send me a Teams message. Otherwise, it just built.

The backlog came out to about 120 stories. I was on the $100/month Claude plan by this point — I'd blown through the $20 tier almost immediately — and even then I'd hit credit limits and have to set alarms to wake up when they reset so I could start the machine again.

What had taken me five man-months over four years, I rebuilt in less than a man-week. With more features. On version three of my framework.

What I Learned

A few things crystallized during this journey that I think matter for anyone getting into agentic development:

Drift is the primary adversary. Not the AI's capability — that's remarkable. The challenge is that over time, it forgets what you told it was important. The solution is architectural: fresh contexts, clear standards, and a pattern that resets on every unit of work.

Be explicit about your philosophy. Claude is intuitive, but it's not psychic. If you have strong opinions about architecture, testing patterns, dependency management — write them down and make the AI read them at the start of every session. Don't assume it will infer your preferences from context.
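One simple way to make that concrete is to prepend a written standards document to every session's starting prompt rather than hoping the AI infers your preferences. This is a minimal sketch under my own assumptions — the standards text and the `session_prompt` helper are illustrative, not part of any real Claude Code interface:

```python
# Sketch: inject explicit, written standards at the start of every session
# instead of assuming the AI will infer them from the architecture.
STANDARDS = """\
- Services talk to APIs, never directly to databases.
- Integration tests must exercise real infrastructure, not fakes.
- No co-author credit on commits.
"""


def session_prompt(story: str) -> str:
    """Every new session re-reads the non-negotiables before the task."""
    return (
        "Engineering standards (non-negotiable):\n"
        f"{STANDARDS}\n"
        f"Story:\n{story}"
    )


print(session_prompt("Add OCR page-classification endpoint"))
```

Kept in a repo, a file like this becomes the thing every fresh worker reads first — which is exactly what makes the fresh-context pattern effective.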

The economics change the calculus. In traditional development, we iterate because developer time is expensive and we don't want to waste it building the wrong thing. In agentic development, the incremental cost of letting the machine build more is essentially zero. That doesn't mean you stop iterating — milestones and human review are still critical — but the reasons for iterating shift. You're iterating for quality and correctness, not to manage burn rate.

Your experience matters more, not less. The irony of agentic development is that the more you know about building software the right way, the more effective you are at directing the AI. My decades of scrum, architecture, CI/CD, and team leadership experience aren't obsolete — they're the reason I could go from zero to a working framework in three weeks. The AI can write code faster than any human. What it can't do is know what good looks like unless you tell it.

Something changed recently. In a conversation with Mike, we agreed that something shifted — probably around the time Opus 4.6 dropped. The tools began to genuinely plan and execute on the plan, not just respond to prompts. That's when it went from a helpful assistant to something that feels like a real development team. I don't want to overhype it, but I also don't want to undersell it. This is different.

What's Next

In the next post, I'm going to get into the specifics of the framework — the actual five-phase process I've built, from recording a conversation with Otter.ai through orchestrated development with review gates and milestone checkpoints. I'll walk through the skills system, the engineering standards repo, and the supervisor/worker pattern that solved the drift problem. If this post was the "why" and the "what happened," the next one is the "how."

I'll also talk about AgentZula — the project I'm building right now, which is an AI-powered developer productivity system that passively tracks work across Claude Code sessions and GitHub repos. It's the first project built entirely within my v3 framework, and it's a perfect case study of the process in action.

But the headline from this post is simple: in three weeks, I went from using AI as a fancy autocomplete to running a development operation that's producing software at a pace I wouldn't have believed a month ago. My recruiter buddy Josh had a client with a req for ten developers. They hired three — agentic developers — and built the entire product in six months. I believe it, because I'm living it.

The future of software development isn't coming. It's here. And it belongs to the people who are willing to learn how to work with it — not instead of their experience, but because of it.

Kevin Phifer is the founder of Theoretically Impossible Solutions LLC, specializing in agentic AI development and consulting. You can reach him at kevin.phifer@theoreticallyimpossible.org.
