How to get better results from AI-augmented programming
Discover how to avoid common AI coding pitfalls like context loss, hallucinated APIs, and inconsistent styles when working with agentic tools like Cursor, GitHub Copilot, and Windsurf. This hands-on guide shares effective techniques for rule-based prompting, context management with version-controlled Markdown, and building a continuously improving workflow for reliable, frustration-free results.
I’ve been working with Cursor and AI-augmented programming recently, and at the same time experimenting with ways to get good results out of the editor. Writing prompts directly without any structure will work, but the results often won’t be very good. The AI agent will frequently:
Forget what it has already done.
Remove code that should not be removed.
Get into bad debugging loops, making the code worse without finding the right solution.
Decide to switch to a language or framework that the project is not using.
Get confused about the current task and start implementing something unrelated to it.
Hallucinate nonexistent API or library calls.
Not follow the style of the rest of the code base.
Forget it has already implemented something and implement it again, but in a different place.
The solution for better results is proper context management. While working with Cursor, I’ve found through trial and error an approach that seems to be working well enough. Cursor will still do silly things, but at least it’s less frequent. It’s best to continuously improve your prompting and context so errors become rarer, frustration decreases, and productivity increases.
This blog post gathers some of my findings so far. While I’ve based these on my experience using Cursor, they also apply to other agentic coding tools like Windsurf, GitHub Copilot, Roo Code, and others.
Create a continuously improving AI ruleset
Most agentic coding tools have some kind of rule system: Cursor reads project rules from the repository’s .cursor/rules directory, GitHub Copilot supports repository-level custom instructions, and Windsurf, Roo Code, and others have their own equivalents.
These rules provide context about what technologies the project uses and which actions are good and which actions are bad. A good approach to rules is to start with a basic set and then expand on them when you find the agent doing something unexpected and unproductive. The agent will often make the same mistakes multiple times until you create a rule to prevent that behavior.
It’s also good to note that the agentic tools add the rules to the agent’s context on top of the tool’s system prompt, which is fixed and differs from tool to tool. While all the tools use the same large language models (LLMs), like Anthropic’s Claude or Google’s Gemini, differences in their system prompts can lead to differences in accuracy and performance. The rules are intended to make the tools work best for your style and workflow, though many of them are also generally applicable.
Here are snippets from actual rules files in a project I’m working on:
The AI assistant is NOT allowed to commit any code to Git under any circumstances. All Git commits must be performed by the user manually.
The AI assistant MUST ensure it is in the correct directory before executing any shell command via the terminal tool. This should be achieved by prepending an appropriate `cd <target_directory_absolute_path>` command, followed by `&&`, to the main command within a *single* terminal tool execution.
Do not add actual implementation code (like completed Dockerfiles, .devcontainer configurations, or application code) to planning files. Planning files (located in the `plans/` or `memory-bank/` directories) should only contain high-level descriptions, specifications, and task lists - never the actual code implementation.
Each of these rules exists because the agent did something it shouldn’t have and failed. Adding them has improved the agent’s behavior and produced better results. I’ll expand on these specific rules in later tips.
It’s good to realize that you don’t need to write the rules from scratch yourself: you can ask the agent to write them on your behalf and just edit them as needed. However, you don’t want the agent to edit the rules once you’ve added them. For example, Cursor won’t allow the agent to edit rules stored under a repository’s .cursor/rules directory. This prevents the agent from changing its behavior to something you, as the user, did not intend.
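For reference, Cursor’s project rules are Markdown-based files (with an .mdc extension) stored under `.cursor/rules/`, each with a small metadata header. Here is a minimal sketch; the file name and wording are illustrative, not taken from my actual project:

`.cursor/rules/git-and-terminal.mdc`:

---
description: Git and terminal conventions for the AI assistant
alwaysApply: true
---

- The AI assistant must never commit code to Git; all commits are made manually by the user.
- Before running any shell command, `cd` to the correct directory within the same terminal invocation.

The header fields (such as `description` and `alwaysApply`) control when Cursor attaches the rule to the context; check the current Cursor documentation for the exact options.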
Creating proper rules for the AI is the most important advice in this blog post; your overall success depends on it more than on anything else. Most of the other tips here also rely on having this rule system in place, and I’ll cover a good base set of rules in the later tips.
Store context in version-controlled Markdown
The agent won’t remember anything about your software outside the context you provide for each call. A consistent base context with the most essential details about the project and its progress helps significantly. This way, the agent won’t have to make guesses about implementation details, and it will produce consistently better output. Context in Markdown files will help the agent follow a plan, place new files correctly, and not implement things that are already implemented. When writing prompts for the AI, you can attach these Markdown files as part of the request context.
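For example, a prompt could look something like this (the file paths are illustrative; in Cursor, the @ syntax attaches files to the request context):

@docs/architecture.md @docs/progress.md
Implement the next unfinished task listed in progress.md. Follow the module layout described in architecture.md, and update progress.md when the task is done.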
These Markdown files should document things like:
High-level overview of what the project is and what problem it solves
Tech stack used
Current implementation status: What has been done already, what should be done next
Overview of the system architecture
Repository structure
Unlike with the rules for the AI, you want the agent to be able to edit these Markdown files. Your development workflow should alternate between updating these planning and context documents and modifying code. The AI agents can help with both phases. You can bounce ideas off them or ask them to turn vague concepts into concrete tasks.
By version-controlling the documents, you also share context with other developers (human ones) working on the same repository. They can use the context files with their own agents, and as a side effect, the files also document the project’s status for humans.
This pattern of storing context in files in the repository is emerging as a best practice for AI-augmented programming. Various existing tools and rules facilitate this approach. I’ve been using Cline Memory Bank, which comes from the Cline VSCode plugin but also works with Cursor and other agentic coding tools that implement a rules system. Another tool I’ve been considering is Taskmaster, which creates task descriptions for the AI, can analyze their complexity, and breaks overly complex tasks down into smaller ones.
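In practice, a Cline-style memory bank is just a directory of Markdown files in the repository, along these lines (the file names follow the Memory Bank convention; the comments are my own summaries):

memory-bank/
  projectbrief.md    # what the project is and why it exists
  productContext.md  # the problem being solved and user experience goals
  systemPatterns.md  # architecture overview and key design decisions
  techContext.md     # tech stack, tooling, and constraints
  activeContext.md   # current focus, recent changes, next steps
  progress.md        # what works already and what is left to build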
To give an idea of what this context looks like, I’ll show the project brief for a web app I’m working on. This project brief can be used to initialize the Cline Memory Bank: give it as context and prompt the agent to “Initialize memory bank,” and the other context files will be created in the memory bank:
# EcoEstate: Property Prices and Environmental Quality Correlation Map App
## Project Brief
EcoEstate is a web application that provides interactive map-based visualizations and correlations between property prices and environmental quality indicators in Finland. The application aims to help users understand how environmental factors may influence property values across different regions.
## Core Requirements
- Interactive map interface showing property prices across Finland
- Visualization of environmental quality indicators (green spaces, public transport access)
- Ability to toggle between different data layers
- Correlation analysis between property prices and environmental factors
- User-friendly experience with intuitive controls
## Goals
- Create an informative tool for property buyers, sellers, and researchers
- Provide insights into the relationship between environmental quality and property values
- Deliver a responsive, accessible web application
- Establish a foundation for future enhancements and data analysis
## Technical Stack
- Frontend: React, TypeScript, Leaflet/Mapbox GL JS
- Backend: Node.js, Express, TypeScript
- Data Sources: Statistics Finland API, HSY WMS, OpenStreetMap, Digitransit
## Project Scope
The initial MVP will focus on:
- Property price visualization by postal code/municipality
- Basic environmental indicators (green spaces, public transport)
- Simple correlation metrics
- Interactive map interface with layer toggling
Future expansions may include advanced analysis, user-generated content, real-time updates, and international coverage beyond Finland.
Set up tight feedback loops for the agent
Feedback loops are essential in all software development, and AI-augmented programming is no exception. What this means in practice is that you need to:
Make sure the agent writes tests.
Set up linting rules.
Set up checks for code complexity - the agent can write overly complex code if you let it (see the lint rule sketch after this list).
Make sure the agent runs tests and reacts to the results.
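As a quick sketch of the complexity check mentioned in the list, ESLint’s built-in `complexity` rule caps the cyclomatic complexity of each function (this assumes ESLint is already set up in the project; the threshold of 10 is just an example):

{
  "rules": {
    "complexity": ["error", 10]
  }
}

Most linters in other ecosystems have an equivalent check, so use whatever fits your stack.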
With good tests in place, you can also set up a rule to make the agent practice test-driven development (TDD). TDD will ensure that the agent writes tests for upcoming code first, then writes the implementation, tests it, and refactors the code to make it clearer. You may also need to remind the agent with a rule that the word “refactoring” means no functional changes to code. You can also make the agent run linting and any other tests to provide frequent feedback that the agent can then react to immediately.
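As an illustration, a rule along these lines (the wording is my own, so adapt it to your project) can steer the agent toward that cycle:

When implementing new functionality or fixing a bug, follow test-driven development: first write a failing test that describes the desired behavior, run the test suite to confirm it fails, then write the minimal implementation that makes it pass, and finally refactor. Refactoring means improving the structure of the code without changing its behavior. Always run the tests and linter after making changes, and fix any failures before moving on.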
Stop the agent when it gets into a bad loop
Sometimes, despite your best efforts at setting up good feedback loops, the agent will get into a bad loop where it just can’t find the correct fix and keeps making the code worse and worse by creating “fixes” that have little to do with solving the problem. When this happens, the best thing to do is stop the agent, revert to a good state, and change your approach. Changing your approach can mean things like:
Rewrite the prompt and add additional context.
Change to a different LLM (for example, from Claude to Gemini or vice versa).
Search for additional information using a traditional search engine.
Large language models have a vast search space for solutions, and sometimes they can get stuck in an unproductive part of that search space. In these cases, they’ll need a nudge to get to a better place. Changing to an entirely different search space by switching the model can also help.
How do you know the agent has entered a bad loop? Sometimes, it is glaringly obvious. An example I’ve seen is that the agent decides to delete large chunks of perfectly good code as a “test,” not realizing that it won’t be able to restore them without help later on. Other times, it will make up a nonsensical theory about what’s going wrong.
Other times, bad loops are not as obvious, and the code might just become overly complex through multiple troubleshooting iterations. The agent might, for example, add safety checks that seem reasonable at first, but don’t make sense when you think about them for more than a few seconds. In these situations, you need to remain critical and actually read and understand all the code generated to know when it’s time to backtrack and try again.
Don’t allow the AI agent to make commits
If you let it, the agent will happily commit broken code without running tests, refactoring, or following other standard best practices. I had to add a rule to prevent this, since the agent was eager to commit changes constantly.
Committing yourself allows you to better control what ends up in version control. You can iterate on the code as many times as you like, discarding bad results and rewriting prompts. You can easily commit it once you’re satisfied that the code is good enough.
I don’t consider this rule as absolute as some of the others. I’ve been considering setting up Git commit hooks that run tests and linters to prevent bad commits and some additional rules to ensure the agent makes good commits at the right time. For now, though, I’m still making my own commits. Future iterations of the workflow might change that.
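If you do go the commit hook route, the hook itself can be very simple. Here is a minimal sketch, assuming npm scripts named `lint` and `test` exist (put it in `.git/hooks/pre-commit` and make it executable, or manage it with a tool like Husky):

#!/bin/sh
# Abort the commit if linting or tests fail.
npm run lint && npm test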
While you might not allow commits, you’ll often want the agent to run the same commands repeatedly, e.g., running tests. You’ll want to allowlist these commands so the agent can run them without asking for permission. On the other hand, you don’t want to let the agent run arbitrary commands, because it might start deleting files, installing unwanted packages, or doing other things you didn’t ask for.
Commit early and often
Frequent commits are a best practice even without AI agents, but they’re even more critical when an AI agent is making loads of changes all at once. A commit to version control should mean that the code is vetted and good enough to keep. The AI agent will produce a lot of code, and sometimes it will get sidetracked and make lots of bad changes quickly. These changes can be annoying, but they are easy to fix if you have a good process for backtracking and discarding bad code.
Cursor can restore the code to the state it was in before the last prompt, and you should use that feature frequently. This backtracking works best when you work in small batches and don’t try to make the agent achieve too much in one prompt. Otherwise, you might notice the agent creating something good only to erase it moments later, all within the same prompt. In addition to this backtracking feature, version control commits give you an additional safety layer for saving code you’ve deemed good.
Inject fresh documentation into the context
Large language models have a training cutoff date and don’t automatically know about anything that has changed or happened since then. Software moves fast, so the libraries, frameworks, and tools you use in your project may have changed significantly after that date. You need to add newer information to the agent’s context separately.
Cursor has a built-in mechanism for adding documentation to the context: it ships with a list of indexed documentation, and you can also easily add your own. There is also support for generic web search, which you can enable if you can’t get fresh information otherwise.
The Model Context Protocol (MCP) is another way to add context. In particular, an MCP server called Context7 is a powerful tool for adding fresh documentation context. It has an extensive list of up-to-date documentation converted into a format tailored for LLMs. You can configure your editor to connect to Context7, after which you can just add “Use Context7” to your prompt, and the agent will search for relevant documentation using MCP. If you want to read more about this topic, my colleague Anoop recently wrote a blog post about MCP.
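For illustration, hooking Context7 up to Cursor amounts to adding an MCP server entry to the editor’s MCP configuration (a project-level `.cursor/mcp.json`, at the time of writing); check the Context7 documentation for the current package name and invocation:

{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}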
Balance speed and quality
Agentic coding tools can produce so much code so quickly that it’s easy to get to a state where you don’t understand the code base very well. You can see in your development environment that the app is working correctly, but you don’t know exactly how. In other words, you’ve been vibe coding and forgot the code exists. Code quality and architecture in these situations might be fine, but they might also be atrocious. You don’t want the latter for projects you mean to develop long term. Good-quality code will make things easier for all developers, human and AI alike. You probably also don’t want to vibe code constantly, lest you start unlearning what good code looks like and how to produce it yourself. Writing your own code and trying to deeply understand generated code is good for keeping your skills sharp.
At the other end of the spectrum, you only use AI for fancy auto-complete and generate code in a ChatGPT window for careful copy-pasting. You’ll only ask for tiny snippets of code and keep the reins tightly in your own hands. You’ll have maximum control over the code, but also miss out on many benefits of AI assistance. In many cases, letting the AI produce larger chunks can lead to better structure, and of course, you’ll also be much more productive.
You need to find the right balance between the two extremes. The answer is not always the same, and it depends on what you’re working on at any given moment. Sometimes, you’ll want maximum output through vibe coding for a quick throwaway demo. Another time, you’ll take much more control over a critical piece of code in your software that many other parts depend on. It’s important to know what approach is best for what you are currently trying to achieve. I also wrote about this in my previous blog post about AI-augmented programming.
Risto Laurikainen is a DevOps consultant who worked on platform engineering before it was called platform engineering. He has spent more than a decade building and using these platforms in roles ranging from architecture to team leading.