# Introduction
AI coding tools are getting impressively good at writing Python code that works. They can build entire applications and implement complex algorithms in minutes. However, the code AI generates is often a pain to maintain.
If you are using tools like Claude Code, GitHub Copilot, or Cursor’s agentic mode, you have probably experienced this. The AI helps you ship working code fast, but the cost shows up later. You have likely refactored a bloated function just to understand how it works weeks after it was generated.
The problem isn’t that AI writes bad code — though it sometimes does — it is that AI optimizes for “working now” and completing the requirements in your prompt, while you need code that is readable and maintainable in the long term. This article shows you how to bridge this gap with a focus on Python-specific strategies.
# Avoiding the Blank Canvas Trap
The biggest mistake developers make is asking AI to start from scratch. AI agents work best with constraints and guidelines.
Before you write your first prompt, set up the basics of the project yourself: choose your project structure, install your core libraries, and implement a few working examples to set the tone. This might seem counterproductive, but it helps the AI write code that aligns with what your application actually needs.
Start by building a couple of features manually. If you are building an API, implement one full endpoint yourself with all the patterns you want: dependency injection, proper error handling, database access, and validation. This becomes the reference implementation.
Say you write this first endpoint manually:
```python
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session

router = APIRouter()

# Assume get_db and the User model are defined elsewhere
@router.get("/users/{user_id}")
async def get_user(user_id: int, db: Session = Depends(get_db)):
    user = db.query(User).filter(User.id == user_id).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    return user
```
When AI sees this pattern, it understands how we handle dependencies, how we query databases, and how we handle missing records.
The same applies to your project structure. Create your directories, set up your imports, and configure your testing framework. AI should not be making these architectural decisions.
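As a rough sketch, even committing a minimal test configuration up front takes that decision away from the AI. The settings below are illustrative, not prescriptive; the point is that the configuration exists before any AI-generated code does.

```toml
# pyproject.toml (sketch) -- committed before the AI writes any code
[tool.pytest.ini_options]
testpaths = ["tests"]            # tests live in one fixed, known place
addopts = "-q --strict-markers"  # unknown markers fail instead of passing silently
```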
# Making Python’s Type System Do the Heavy Lifting
Python’s dynamic typing is flexible, but that flexibility becomes a liability when AI is writing your code. Treat type hints as essential guardrails rather than a nice-to-have in your application code.
Strict typing catches AI mistakes before they reach production. When you require type hints on every function signature and run mypy in strict mode, the AI cannot take shortcuts. It cannot return ambiguous types or accept parameters that might be strings or might be lists.
More importantly, strict types force better design. For example, an AI agent trying to write a function that accepts data: dict can make many assumptions about what is in that dictionary. However, an AI agent writing a function that accepts data: UserCreateRequest where UserCreateRequest is a Pydantic model has exactly one interpretation.
```python
# This constrains AI to write correct code
from pydantic import BaseModel, EmailStr


class UserCreateRequest(BaseModel):
    name: str
    email: EmailStr
    age: int


class UserResponse(BaseModel):
    id: int
    name: str
    email: EmailStr


def process_user(data: UserCreateRequest) -> UserResponse:
    pass


# Rather than this
def process_user(data: dict) -> dict:
    pass
```
Use libraries that enforce contracts: SQLAlchemy 2.0 with type-checked models and FastAPI with response models are excellent choices. These are not just good practices; they are constraints that keep AI on track.
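As an illustration, a SQLAlchemy 2.0 model declared with typed Mapped[] columns gives both mypy and the AI an exact schema to code against. The User model below is a hypothetical sketch, not a prescribed schema.

```python
# Hypothetical User model using SQLAlchemy 2.0's typed declarative mapping
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class User(Base):
    __tablename__ = "users"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    email: Mapped[str] = mapped_column(unique=True)
    age: Mapped[int | None]  # nullable column, typed as Optional
```

Paired with a FastAPI response model (or a return type annotation on the route), the type checker can then verify that what an endpoint returns matches what the schema promises.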
Set mypy to strict mode and make passing type checks non-negotiable. When AI generates code that fails type checking, it will iterate until it passes. This automatic feedback loop produces better code than any amount of prompt engineering.
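A minimal way to wire this in, assuming your tool configuration lives in pyproject.toml, looks something like this:

```toml
# pyproject.toml (sketch)
[tool.mypy]
strict = true            # enables the full set of strict checks
warn_unreachable = true  # flags dead code the AI may leave behind
```

With `strict = true` in the config, running plain `mypy` applies all of the strict checks, so there is nothing for the AI to forget on the command line.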
# Creating Documentation to Guide AI
Most projects have documentation that developers ignore. For AI agents, you need documentation they actually use: a single guidelines file with clear, specific rules.
Create a CLAUDE.md or AGENTS.md file at your project root. Do not make it too long. Focus on what is unique about your project rather than general Python best practices.
Your AI guidelines should specify:
- Project structure and where different types of code belong
- Which libraries to use for common tasks
- Specific patterns to follow (point to example files)
- Explicit forbidden patterns
- Testing requirements
Here is an example AGENTS.md file:
```markdown
# Project Guidelines

## Structure
- /src/api – FastAPI routers
- /src/services – business logic
- /src/models – SQLAlchemy models
- /src/schemas – Pydantic models

## Patterns
- All services inherit from BaseService (see src/services/base.py)
- All database access goes through the repository pattern (see src/repositories/)
- Use dependency injection for all external dependencies

## Standards
- Type hints on all functions
- Docstrings using Google style
- Functions under 50 lines
- Run `mypy --strict` and `ruff check` before committing

## Never
- No bare except clauses
- No `type: ignore` comments
- No mutable default arguments
- No global state
```
The key is being specific. Do not simply say “follow best practices”; point to the exact file that demonstrates the pattern. Do not just say “handle errors properly”; show the error handling pattern you want.
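For example, instead of a vague rule, your guidelines can link to (or inline) a short snippet of the error handling you expect. The ServiceError class and save_user function below are hypothetical placeholders for whatever your project actually uses.

```python
# Hypothetical error-handling pattern to reference from your guidelines:
# catch narrow exceptions, roll back, log, and re-raise a domain-specific error.
import logging

from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import Session

logger = logging.getLogger(__name__)


class ServiceError(Exception):
    """Base error for the service layer (hypothetical)."""


# Assume the User model is defined elsewhere, as in the earlier endpoint example
def save_user(db: Session, user: "User") -> None:
    try:
        db.add(user)
        db.commit()
    except SQLAlchemyError as exc:  # narrow catch, never a bare except
        db.rollback()
        logger.exception("Failed to save user")
        raise ServiceError("Could not save user") from exc
```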
# Writing Prompts That Point to Examples
Generic prompts produce generic code. Specific prompts that reference your existing codebase produce more maintainable code.
Instead of asking AI to “add authentication,” walk it through the implementation with references to your patterns. Here is an example of such a prompt that points to examples:
```
Implement JWT authentication in src/services/auth_service.py. Follow the same structure as UserService in src/services/user_service.py. Use bcrypt for password hashing (already in requirements.txt).

Add authentication dependency in src/api/dependencies.py following the pattern of get_db.

Create Pydantic schemas in src/schemas/auth.py similar to user.py.

Add pytest tests in tests/test_auth_service.py using fixtures from conftest.py.
```
Notice how every instruction points to an existing file or pattern. You are not asking AI to design an architecture; you are asking it to apply the patterns you have already established to a new feature.
When the AI generates code, review it against your patterns. Does it use the same dependency injection approach? Does it follow the same error handling? Does it organize imports the same way? If not, point out the discrepancy and ask it to align with the existing pattern.
# Planning Before Implementing
AI agents can move fast, but that speed works against you when it comes at the expense of structure. Use plan mode or ask for an implementation plan before any code gets written.
A planning step forces the AI to think through dependencies and structure. It also gives you a chance to catch architectural problems — such as circular dependencies or redundant services — before they are implemented.
Ask for a plan that specifies:
- Which files will be created or modified
- What dependencies exist between components
- Which existing patterns will be followed
- What tests are needed
Review this plan like you would review a design document. Check that the AI understands your project structure. Verify it is using the right libraries and confirm it is not reinventing something that already exists.
If the plan looks good, let the AI execute it. If not, correct the plan before any code gets written. It is easier to fix a bad plan than to fix bad code.
# Asking AI to Write Tests That Actually Test
AI is fast at writing tests. However, it does not write useful tests unless you are specific about what “useful” means.
Default AI test behavior is to test the happy path and nothing else. You get tests that verify the code works when everything goes right, which is exactly when you do not need tests.
Specify your testing requirements explicitly. For every feature, require:
- Happy path test
- Validation error tests to check what happens with invalid input
- Edge case tests for empty values, None, boundary conditions, and more
- Error handling tests for database failures, external service failures, and the like
Point AI to your existing test files as examples. If you have good test patterns already, AI will write useful tests, too. If you do not have good tests yet, write a few yourself first.
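As a sketch of what that might look like, the tests below go beyond the happy path. They assume the UserCreateRequest schema from earlier lives in src/schemas/user.py and that a hypothetical create_user service exists in src/services/user_service.py.

```python
# Sketch: tests that cover validation errors and edge cases, not just success.
# Module paths and create_user() are assumptions based on the structure above.
import pytest
from pydantic import ValidationError

from src.schemas.user import UserCreateRequest
from src.services.user_service import create_user


def test_create_user_happy_path():
    user = create_user(UserCreateRequest(name="Ada", email="ada@example.com", age=36))
    assert user.id is not None


def test_invalid_email_is_rejected():
    # Validation error: bad input should fail loudly at the schema boundary
    with pytest.raises(ValidationError):
        UserCreateRequest(name="Ada", email="not-an-email", age=36)


def test_empty_name_is_rejected():
    # Edge case: an empty string passes type checks but should fail business rules
    with pytest.raises((ValidationError, ValueError)):
        create_user(UserCreateRequest(name="", email="ada@example.com", age=36))
```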
# Validating Output Systematically
After AI generates code, do not just check if it runs. Run it through a checklist.
Your validation checklist should include questions like the following:
- Does it pass mypy strict mode?
- Does it follow patterns from existing code?
- Are all functions under 50 lines?
- Do tests cover edge cases and errors?
- Are there type hints on all functions?
- Does it use the specified libraries correctly?
Automate what you can. Set up pre-commit hooks that run mypy, Ruff, and pytest. If AI-generated code fails these checks, it does not get committed.
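A sketch of such a setup, with illustrative (not recommended-for-you) hook versions, might look like this:

```yaml
# .pre-commit-config.yaml (sketch; pin versions that match your project)
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.2
    hooks:
      - id: mypy
        args: [--strict]
  - repo: local
    hooks:
      - id: pytest
        name: pytest
        entry: pytest -q
        language: system
        pass_filenames: false
```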
For what you cannot automate, rely on review. After reading enough AI-generated code, you will start spotting the common anti-patterns: functions that do too much, error handling that swallows exceptions, and validation logic mixed with business logic.
# Implementing a Practical Workflow
Let us now put together everything we have discussed thus far.
You start a new project. You spend time setting up the structure, choosing and installing libraries, and writing a couple of example features. You create CLAUDE.md with your guidelines and write specific Pydantic models.
Now you ask AI to implement a new feature. You write a detailed prompt pointing to your examples. AI generates a plan. You review and approve it. AI writes the code. You run type checking and tests. Everything passes. You review the code against your patterns. It matches. You commit.
Total time from prompt to commit may only be around 15 minutes for a feature that would have taken you an hour to write manually. But more importantly, the code you get is easier to maintain — it follows the patterns you established.
The next feature goes faster because AI has more examples to learn from. The code becomes more consistent over time because every new feature reinforces the existing patterns.
# Wrapping Up
With AI coding tools becoming genuinely useful, your job as a developer or data professional is changing. You are now spending less time writing code and more time on:
- Designing systems and choosing architectures
- Creating reference implementations of patterns
- Writing constraints and guidelines
- Reviewing AI output and maintaining the quality bar
The skill that matters most is not writing code faster. Rather, it is designing systems that constrain AI to write maintainable code. It is knowing which practices scale and which create technical debt. I hope you found this article helpful even if you do not use Python as your programming language of choice. Let us know what else you think we can do to keep AI-generated Python code maintainable. Keep exploring!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.

