
The AI Code Review Checklist Every Team Needs

Vaibhav Verma
6 min read
ai · code-review · checklist · best-practices · security · engineering-process

I've reviewed over 500 AI-assisted PRs in the past year. Every team I've worked with asked for the same thing: a concrete checklist they can pin to their monitors and use during code review. This is that checklist.

It's not theoretical. Every item on this list caught a real bug or quality issue in a real codebase. I've organized it by category and priority, so you can adapt it to your team's specific needs.

How to Use This Checklist

Don't try to check every item on every PR. That defeats the purpose. Instead:

  1. Always check the Critical items (they catch the highest-impact bugs)
  2. Check Architecture items when the PR adds new files or patterns
  3. Check Performance items when the PR touches data fetching, loops, or database queries
  4. Check Testing items when the PR includes test files

Time target: 10-15 minutes per PR using this checklist.

The Checklist

Critical (Check Every PR)

markdown
## Critical Security & Correctness
- [ ] No hardcoded secrets, API keys, or credentials
- [ ] Authentication check present on all protected endpoints
- [ ] Authorization check present (not just authentication)
- [ ] User input validated and sanitized before use
- [ ] No SQL/NoSQL injection vectors (parameterized queries only)
- [ ] Error responses don't leak internal details (stack traces, DB schema)
- [ ] Sensitive data not logged (passwords, tokens, PII)

These seven items catch the most dangerous AI mistakes. AI regularly generates code that skips auth checks, logs sensitive data, or uses string interpolation in database queries. I've seen each of these in production codebases.

Example catch: AI generated an API endpoint that checked whether the user was logged in but never verified that they owned the specific resource. Classic IDOR (insecure direct object reference) vulnerability.

typescript
// AI generated (vulnerable)
app.get("/api/invoices/:id", requireAuth, async (req, res) => {
  const invoice = await prisma.invoice.findUnique({
    where: { id: req.params.id },
  });
  return res.json(invoice);
});

// Fixed (checks ownership)
app.get("/api/invoices/:id", requireAuth, async (req, res) => {
  const invoice = await prisma.invoice.findUnique({
    where: { id: req.params.id, userId: req.user.id },
  });
  if (!invoice) return res.status(404).json({ error: "Not found" });
  return res.json(invoice);
});

Architecture (Check When Adding New Files/Patterns)

markdown
## Architecture & Patterns
- [ ] Follows existing project error handling pattern
- [ ] Uses established data access layer (not direct DB calls in routes)
- [ ] Imports use project aliases (@/) not deep relative paths
- [ ] No new dependencies added without team approval
- [ ] File placed in correct directory per project structure
- [ ] No duplicate logic (search codebase for similar functions)
- [ ] Follows existing naming conventions (files, variables, functions)
- [ ] Uses existing shared utilities instead of reimplementing

The "no duplicate logic" check is crucial. AI doesn't know that formatCurrency() already exists in your utils folder. It'll generate a new one every time. I once found 11 implementations of email validation in a single codebase, each slightly different.

Edge Cases (Check When Touching Business Logic)

markdown
## Edge Cases & Error Handling
- [ ] Null/undefined inputs handled explicitly
- [ ] Empty arrays and empty strings considered
- [ ] Boundary conditions tested (exactly at limit, one above, one below)
- [ ] Network failure handling present (timeouts, retries, fallbacks)
- [ ] Concurrent access considered (race conditions, double submissions)
- [ ] Errors propagated correctly (not swallowed with console.log)
- [ ] Partial failure scenarios handled (what if step 2 of 3 fails?)
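
None of these need elaborate machinery. Here's a compressed sketch of a few of them in one place, using a hypothetical fetchUserSettings helper (the endpoint, the Settings shape, and the 5-second timeout are all illustrative):

typescript
interface Settings {
  theme: string;
  notificationsEnabled: boolean;
}

// Hypothetical example: explicit null handling, a request timeout,
// and errors that propagate instead of being swallowed
async function fetchUserSettings(userId: string | null): Promise<Settings> {
  // Null/undefined input handled explicitly, not left to fail downstream
  if (!userId) {
    throw new Error("fetchUserSettings: userId is required");
  }

  // Network failure handling: abort instead of hanging forever
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 5_000);

  try {
    const res = await fetch(`/api/users/${userId}/settings`, {
      signal: controller.signal,
    });
    // Errors propagated with context, not logged and ignored
    if (!res.ok) {
      throw new Error(`Settings request failed with status ${res.status}`);
    }
    return (await res.json()) as Settings;
  } finally {
    clearTimeout(timer);
  }
}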

Performance (Check When Touching Data Layer)

markdown
## Performance
- [ ] No N+1 query patterns (use include/join instead of loops)
- [ ] Database queries use appropriate indexes
- [ ] Large datasets paginated (no unbounded findMany/SELECT *)
- [ ] Promise.all used for independent async operations
- [ ] No blocking operations in request handlers
- [ ] Response payload size reasonable (no over-fetching)

The N+1 pattern is AI's favorite mistake. It looks clean:

typescript
// AI loves this pattern (N+1 queries)
const users = await prisma.user.findMany();
const usersWithPosts = await Promise.all(
  users.map(async (user) => ({
    ...user,
    posts: await prisma.post.findMany({ where: { authorId: user.id } }),
  }))
);

// What it should be (1 query)
const usersWithPosts = await prisma.user.findMany({
  include: { posts: true },
});
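
The "large datasets paginated" item deserves a sketch too. A hedged example of cursor pagination with the same Prisma client used above (the page size, model, and ordering field are arbitrary choices):

typescript
// Unbounded: returns every row, however many there are
const allOrders = await prisma.order.findMany();

// Bounded: fixed page size, stable ordering, cursor for the next page
async function listOrders(cursor?: string) {
  return prisma.order.findMany({
    take: 50,
    orderBy: { createdAt: "desc" },
    // skip: 1 excludes the cursor row itself from the next page
    ...(cursor ? { cursor: { id: cursor }, skip: 1 } : {}),
  });
}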

Testing (Check When PR Includes Tests)

markdown
## Test Quality
- [ ] Tests verify behavior, not implementation details
- [ ] Edge cases from above have corresponding tests
- [ ] Error/failure paths tested (not just happy path)
- [ ] Mocks are minimal (only external dependencies mocked)
- [ ] Assertions are specific (not just "toBeDefined")
- [ ] Test descriptions clearly state expected behavior
- [ ] No tests that mirror the implementation logic
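
For example, here's what "verify behavior" and "specific assertions" can look like, sketched in Vitest syntax with a hypothetical applyDiscount function:

typescript
import { describe, expect, it } from "vitest";
import { applyDiscount } from "./pricing"; // hypothetical module

describe("applyDiscount", () => {
  // Behavior-focused: asserts the outcome, not which internals were called
  it("applies a 10% discount to the order total", () => {
    expect(applyDiscount(200, 0.1)).toBe(180);
  });

  // Error path tested, with a specific assertion rather than toBeDefined()
  it("rejects discounts greater than 100%", () => {
    expect(() => applyDiscount(200, 1.5)).toThrow("discount must be between 0 and 1");
  });
});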

The Quick-Reference Card

Print this and keep it next to your monitor:

AI CODE REVIEW QUICK CHECK
===========================
1. Auth + AuthZ present?
2. Input validated?
3. Follows our patterns?
4. No duplicate logic?
5. Edge cases handled?
6. Errors propagated?
7. No N+1 queries?
8. Tests verify behavior?
===========================
If any "no": request changes.

Integrating the Checklist Into Your Workflow

Option 1: PR Template

Add the checklist to your GitHub PR template so every reviewer sees it:

markdown
<!-- .github/pull_request_template.md -->
## AI Code Review Checklist
(Check items you've verified. Leave unchecked items with a note.)

- [ ] No hardcoded secrets or credentials
- [ ] Auth and authorization checks present
- [ ] Follows existing codebase patterns
- [ ] Edge cases handled
- [ ] Tests verify behavior, not implementation

Option 2: Automated Checks

Automate what you can. Custom ESLint rules can catch pattern violations, missing auth checks, and N+1 patterns. Save human review time for the things that require judgment.
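
As a sketch of what that automation can look like, here's ESLint's built-in no-restricted-syntax rule configured to flag the Prisma-inside-.map() shape from the Performance section (assumes ESLint's flat config; the selector is a rough heuristic, not a complete N+1 detector):

typescript
// eslint.config.js (or .ts): a minimal sketch, not a complete ruleset
export default [
  {
    rules: {
      "no-restricted-syntax": [
        "warn",
        {
          // Matches `await something.findMany(...)` inside a `.map(...)` callback
          selector:
            "CallExpression[callee.property.name='map'] AwaitExpression > CallExpression[callee.property.name='findMany']",
          message:
            "Possible N+1 query: prefer include/join on the outer query instead of querying inside .map().",
        },
      ],
    },
  },
];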

Option 3: Review Rotation

Assign an "AI quality reviewer" each sprint. This person does a deep review of all AI-assisted PRs using the full checklist. Other reviewers use the quick-reference card.

What I Got Wrong

I originally made this checklist 30 items long. Nobody used it because it took too long. The version above is the result of cutting it down to the items that actually caught bugs. Every item I removed was either too rare to justify the review time or better caught by automated tooling.

The lesson: a checklist that gets used at 80% thoroughness beats a perfect checklist that gets skipped.

Start with the Critical section. If that's all your team adopts, you'll catch 60% of AI code quality issues. Add sections as the habit builds.
