I have been building Atmos Football — a React and Firebase app for tracking 5-a-side football stats — with AI assistance for several months now. Roughly 40 versions have shipped. The app has grown from a simple leaderboard into something with Elo ratings, team generation, fitness tracking with GPS heat maps, push notifications, an in-app blog, and a subscription tier system.
Along the way, I have learned things about working with AI that nobody told me beforehand. Some of them are obvious in hindsight. Some of them I had to learn the hard way.
The plan document is the most valuable thing you produce
More valuable than the code, in many cases.
A well-structured plan — specific tasks, specific files, specific test criteria — means the implementation phase is almost mechanical. The AI reads the plan, executes the tasks, runs the build, runs the tests. When a plan is precise, the AI is remarkably reliable. When a plan is vague, the AI makes well-intentioned guesses that are architecturally wrong, and you spend longer fixing the implementation than you would have spent writing the code yourself.
I now treat the plan document the way a builder treats blueprints. You do not start pouring concrete based on a conversation about what the house should feel like. You draw the plans, review them, correct them, and then build exactly what was drawn.
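To make that concrete, a single task in one of my plan documents is shaped roughly like this. The feature is one mentioned later in this piece, but the file path, component name, and test criteria here are invented for the example:

```markdown
## Task 2 — Add view-mode toggle to the GPS heat map panel
- File: src/components/HeatMapPanel.jsx (illustrative path)
- Change: render a four-option view-mode toggle and pass the selected
  mode down to the map layer as a prop
- Test: switching modes re-renders the layer without refetching GPS
  data; build passes; smoke test on mobile Safari and Android
```

Specific file, specific change, specific pass/fail criteria. That is the level of precision that makes the implementation phase mechanical.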
Corrections must be specific, or they are useless
Early on, I would review a plan and say something like "this is not quite right" or "can you rethink the approach for section 3." The results were poor. The AI would rewrite the section but introduce new problems, because my feedback did not tell it what was actually wrong.
What works: "Point 3 should read X instead of Y. Point 7 is missing the edge case where the user has no linked account. The file path in task 5 is wrong — it should be src/firebase/firestore/events.js, not src/firebase/firestore.js."
The AI responds to the same feedback style you would use in a code review. Specific, numbered, with the correct answer included. This is not a limitation of the AI — it is how precise communication works with any collaborator.
Institutional memory prevents you from solving the same problem twice
The project's authentication system was debugged across four versions. First the popup sign-in flow broke on mobile Safari. Then the redirect flow conflicted with the Content Security Policy. Then Android native sign-in failed because the Play App Signing certificate was not registered. Then Google sign-in on Android broke again because an environment variable file was missing.
Each of those problems took hours to diagnose. I documented every failure mode, every root cause, and every fix in a troubleshooting file that the planning environment can reference. Now, when a session touches authentication, the AI has the full history. It does not re-investigate solved problems. It does not suggest approaches that failed before. The troubleshooting file saves more time than any other document in the project.
For solo developers, this is the difference between "I fixed this before but I cannot remember how" and "the fix is documented, here it is." For teams, it is the difference between knowledge living in one person's head and knowledge being available to anyone working on the project.
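For reference, an entry in that troubleshooting file is shaped roughly like this. The details are paraphrased from the Android failure described above; the exact wording and fields are my own convention, not a standard:

```markdown
### Google sign-in fails on Android (native)
- Symptom: sign-in dialog closes immediately, no account returned
- Root cause: Play App Signing certificate not registered with the
  Google sign-in configuration
- Fix: register the Play App Signing SHA-1 fingerprint, then rebuild
- Versions: (record when it was first seen and when it was fixed)
```

Symptom, root cause, fix, versions. Anything that lets a future session match a new failure against an old one.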
The AI is better at execution than judgement
It excels at implementing a well-defined plan, maintaining consistency, catching its own syntax errors, and following documented patterns. It writes commit messages in the right format every time. It references the correct feature codes. It does not drift from the style guide or forget the naming convention.
It struggles with ambiguous requirements, novel architectural decisions, and knowing when not to act. When I ask it to "improve the auth flow," I get a sprawling refactor that touches files it should not. When I ask it to "add a null check on line 47 of auth.js because the Google Auth plugin returns undefined on cancellation," I get exactly that.
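The null check from that second request looks roughly like this as a sketch. The function and plugin names are illustrative, not the app's actual code; the behaviour being guarded against is a native sign-in plugin that resolves with undefined when the user dismisses the dialog, rather than rejecting the promise:

```javascript
// Illustrative sketch: guard against a cancelled native sign-in.
// `googleAuthPlugin` stands in for whatever plugin object the app uses.
async function signInWithGoogle(googleAuthPlugin) {
  const result = await googleAuthPlugin.signIn();

  // A cancelled sign-in can resolve with undefined, so bail out before
  // reading properties off a missing result.
  if (!result || !result.authentication) {
    return null; // treat cancellation as "not signed in", not as an error
  }

  return result.authentication.idToken;
}
```

A five-line change with a precisely stated reason. That is the grain size at which the AI is most reliable.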
The workflow should play to these strengths. Give it precise instructions and let it execute. Keep the ambiguous decisions — what to build, how to structure it, what trade-offs to accept — on your side.
Speed is a double-edged sword
Once a plan is approved, implementation is fast. Multi-file changes that would take hours to type, build, and verify can be completed in minutes. During one sprint, I shipped 15 versions in two to three days with no regressions.
But speed also means mistakes arrive fast. If the plan has a subtle flaw that the tests do not catch, the AI implements that flaw across every affected file in minutes. The cost of a wrong plan is higher when the implementation is fast, because more code gets written before anyone notices.
The discipline: slow down during planning, speed up during implementation. Spend the time getting the plan right. Then let the AI execute it quickly. Do not rush the planning phase because the implementation phase is fast.
Document drift is a real cost
During the rapid sprint mentioned above, I focused on shipping features and let the reference documents fall behind. By the end, the release plan was a dozen versions out of date. The memory files contained stale information. The backlog did not reflect what had actually shipped.
When I returned to the planning environment for the next feature, the AI was working from outdated context. Its plans referenced old file structures and missed recent changes. I had to spend a session just synchronising documents before I could do productive work.
The lesson: periodic synchronisation checkpoints are essential. Not continuous updates — that is too much overhead during a sprint. But a deliberate pause every few versions to bring the documents up to date. The cost of not doing this is paid later, with interest.
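My checkpoint is a short checklist. The items mirror the documents that drifted during the sprint above; the exact format is just my own habit:

```markdown
## Sync checkpoint (every few versions)
- [ ] Release plan reflects every version actually shipped
- [ ] Memory files: stale entries removed, new decisions recorded
- [ ] Backlog: shipped items closed, remainder re-prioritised
- [ ] Uploaded source bundle matches the current code
- [ ] Troubleshooting file: new failure modes documented
```

Ten minutes every few versions, instead of a lost session later.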
Not everything fits the plan-first model
Exploratory work, live debugging, and UI design iteration do not fit neatly into "write a plan, then implement." The workflow is strongest for well-defined features — "add a GPS heat map panel with these four view modes" — and weakest for open-ended investigation — "figure out why the service worker is serving stale content on some devices."
For small changes, the overhead of a formal plan exceeds the benefit. A one-line bug fix does not need a plan document with a smoke test checklist. The workflow pays off for features that touch multiple files, require architectural thought, or need to be correct across platforms.
I use the full workflow for features and structured bug fixes. I use the implementation environment directly (without a formal plan) for trivial fixes. I do exploratory debugging by hand, then write a plan for the fix once I understand the problem. Knowing when to use each mode is part of the skill.
The AI does not know what it does not know
If the project knowledge is incomplete or outdated, the AI plans and implements based on stale information. It does not flag the gap. It does not say "I notice the source bundle is from three versions ago." It produces a confident, well-structured plan that happens to reference files that have since been renamed.
The quality of the output is directly proportional to the quality of the input context. This is the most important thing I have learned, and it applies to every AI tool, not just this workflow. If you give it good context, it produces good work. If you give it bad context, it produces bad work — confidently, fluently, and at speed.
What I would tell someone starting today
Start small. Upload your codebase to a chat-based AI, along with your README and any architecture documents. Ask it to review the code and identify issues. If the review is superficial or wrong, the AI does not understand your project well enough to be useful yet — improve the context. If it catches real issues, you have the foundation for a productive workflow.
Pick a small feature from your backlog. Ask the AI to produce a plan: what files need to change, what the changes should be, what tests should verify the result. Review it as you would review a junior developer's design document. Implement it yourself if you want — the plan is valuable even if the AI never touches the code.
If the plan is good, try the full loop: plan in the chat interface, implement with an AI coding tool, review every diff, run every test. One feature, controlled scope. After that, you have concrete evidence — not opinions — about whether the workflow suits your project.
The discipline required to work well with AI turns out to be roughly the same discipline required to build software well in general: plan before you build, be specific about what you want, verify the results independently, and maintain your documentation. AI does not remove the need for that discipline. It makes the consequences of skipping it arrive faster.