AI Code Generation Implementation Checklist
This checklist provides a rigorous framework for validating AI-generated code before it reaches production. It focuses on mitigating risks associated with hallucinations, security vulnerabilities, and technical debt specific to large language model outputs.
Security and Vulnerability Mitigation
Secrets and Credential Scanning
Critical: Run a tool like gitleaks or trufflehog on all AI-generated patches to ensure no placeholder API keys or hardcoded credentials were included.
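The item above can be approximated with a minimal pattern scan. This is only a sketch with two illustrative rules (an AWS-style key shape and a generic assignment pattern); real scanners like gitleaks ship hundreds of curated detectors and entropy checks:

```python
import re

# Two illustrative rules only; production scanning should use gitleaks/trufflehog.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(api[_-]?key|secret)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]

def scan_for_secrets(text: str) -> list[str]:
    """Return substrings that look like hardcoded credentials."""
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits

patch = 'API_KEY = "sk-test-placeholder-1234567890"'
print(scan_for_secrets(patch))  # non-empty: the placeholder key is flagged
```

Running the real scanner in CI on every AI-authored diff, not just on release branches, is the point of this item; the sketch only shows the matching principle.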
Dependency Version Verification
Critical: Cross-reference AI-suggested library versions against the project's lockfile or a vulnerability database (Snyk/OSV) to prevent dependency confusion attacks or use of deprecated packages.
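The lockfile cross-check reduces to diffing pinned versions. The sketch below uses a hypothetical `find_version_drift` helper and plain dicts in place of real lockfile parsing or an OSV/Snyk query; a package absent from the lockfile (a possible hallucinated or confused dependency) is reported with `None`:

```python
def find_version_drift(suggested: dict[str, str],
                       lockfile: dict[str, str]) -> dict[str, tuple]:
    """Return packages whose suggested version differs from, or is absent in,
    the lockfile, as {package: (suggested_version, locked_version_or_None)}."""
    drift = {}
    for pkg, version in suggested.items():
        locked = lockfile.get(pkg)
        if locked != version:
            drift[pkg] = (version, locked)
    return drift

lockfile = {"requests": "2.31.0", "flask": "3.0.2"}
suggested = {"requests": "2.19.0", "leftpadz": "1.0.0"}  # stale pin + unknown package
print(find_version_drift(suggested, lockfile))
```

Any entry with `None` deserves extra scrutiny: a package name the model invented is exactly the opening a dependency-confusion attacker registers.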
Injection Vulnerability Audit
Critical: Verify that all generated SQL queries, shell commands, or HTML outputs use parameterized inputs rather than string concatenation to prevent SQLi and XSS.
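For SQL specifically, the pattern to look for is driver-side parameter binding, shown below with the stdlib sqlite3 module (the same `?`/placeholder style applies to any DB-API driver). A classic injection payload is bound as a literal string and matches nothing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def get_user(conn, name: str):
    # Safe: the driver binds the value; user input never enters the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

print(get_user(conn, "alice"))        # the real row
print(get_user(conn, "' OR '1'='1"))  # payload treated as a literal: no rows
```

The red flag in review is any f-string or `+` concatenation feeding `execute()`; that is the construction to reject regardless of how plausible the surrounding code looks.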
SCA and SAST Execution
Critical: Run Software Composition Analysis (SCA) and Static Application Security Testing (SAST) specifically on the files modified by AI to detect common patterns like unsafe deserialization or weak cryptographic algorithms.
Permission and Scope Review
Recommended: Verify that AI-generated logic for authorization or file system access adheres to the principle of least privilege.
Logic and Functional Validation
Automated Unit Test Coverage
Critical: Ensure every generated function has at least one corresponding unit test covering the primary success path and failure modes.
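As a minimal illustration, here is a hypothetical generated helper (`parse_port` is invented for this sketch) paired with tests that cover its success path and both failure modes:

```python
import unittest

def parse_port(value: str) -> int:
    """Hypothetical AI-generated helper: parse a TCP port string."""
    port = int(value)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port

class TestParsePort(unittest.TestCase):
    def test_success_path(self):
        self.assertEqual(parse_port("8080"), 8080)

    def test_failure_modes(self):
        with self.assertRaises(ValueError):
            parse_port("not-a-number")  # non-numeric input
        with self.assertRaises(ValueError):
            parse_port("70000")         # out-of-range port

if __name__ == "__main__":
    unittest.main()
```

The failure-mode tests matter most here: models reliably generate happy-path tests on request but often omit the error branches they themselves wrote.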
Edge Case Boundary Testing
Recommended: Manually add tests for null inputs, empty strings, and maximum integer values to verify AI-generated logic handles boundary conditions correctly.
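Concretely, for a hypothetical function under review (`clamp_to_percent` is invented for this sketch), the manually added boundary cases would look like:

```python
import sys

def clamp_to_percent(value) -> int:
    """Hypothetical function under review: normalize input to a 0-100 integer."""
    if value is None or value == "":
        return 0  # explicit decision for null/empty rather than an accidental crash
    return max(0, min(100, int(value)))

# Boundary cases a generated test suite frequently omits:
assert clamp_to_percent(None) == 0           # null input
assert clamp_to_percent("") == 0             # empty string
assert clamp_to_percent(sys.maxsize) == 100  # maximum integer value
assert clamp_to_percent(-1) == 0             # just below the lower bound
print("boundary checks passed")
```

Writing these by hand is deliberate: if the model both wrote the logic and wrote the tests, the tests tend to encode the model's own blind spots.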
Hallucination Verification
Critical: Verify that any called internal APIs or third-party methods actually exist in the current versions of the libraries used.
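A first-pass existence check can be automated for the installed environment. The sketch below resolves a dotted `module.attr` path with `importlib` and `hasattr`; it assumes that form and does not replace reading the library's changelog for deprecations:

```python
import importlib

def api_exists(dotted_path: str) -> bool:
    """Check that a module.attr reference in generated code actually resolves
    in the currently installed environment."""
    module_path, _, attr = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return False
    return hasattr(module, attr)

print(api_exists("json.dumps"))          # real API -> True
print(api_exists("json.dumps_pretty"))   # plausible-sounding hallucination -> False
```

This catches the most common failure, a method name that sounds right but was never in the library, before it reaches a reviewer or, worse, a runtime AttributeError in production.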
Loop and Recursion Analysis
Critical: Inspect AI-generated loops and recursive calls to ensure termination conditions are always reachable and cannot cause stack overflows or infinite loops.
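One defensive pattern worth asking for in review is an explicit depth guard instead of trusting unbounded recursion. A sketch, using an invented `flatten` helper with an assumed depth limit of 50:

```python
def flatten(items, _depth=0, _max_depth=50):
    """Flatten nested lists, with a depth guard in place of unbounded recursion."""
    if _depth > _max_depth:
        raise RecursionError(
            "nesting exceeds guard; possibly cyclic or pathological input")
    out = []
    for item in items:
        if isinstance(item, list):
            out.extend(flatten(item, _depth + 1, _max_depth))
        else:
            out.append(item)
    return out

print(flatten([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]

cyclic = [1]
cyclic.append(cyclic)  # a self-referencing list would otherwise recurse forever
try:
    flatten(cyclic)
except RecursionError as exc:
    print("guard tripped:", exc)
```

The guard converts a silent stack overflow on adversarial input into a loud, diagnosable error, which is usually the behavior a reviewer actually wants.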
Integration Test Pass
Recommended: Execute full integration test suites to confirm that AI-generated modules interact correctly with existing database schemas and network services.
Code Quality and Maintainability
Linter and Formatter Compliance
Recommended: Run ESLint, Prettier, or language-equivalent tools to ensure AI-generated code matches the team's formatting and style guidelines.
Type Safety Check
Critical: Run the TypeScript compiler or MyPy to ensure AI-generated types are consistent and do not fall back on 'any' types or suppression pragmas such as '@ts-ignore' or '# type: ignore'.
Dead Code Removal
Recommended: Identify and remove unused variables, unreachable branches, or redundant imports often left behind by iterative AI prompting.
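Unused imports, the most common leftover from iterative prompting, can be detected mechanically. This is a rough `ast`-based sketch; aliases, star imports, and `__all__` need more care in a real tool such as autoflake or Ruff:

```python
import ast

def unused_imports(source: str) -> list[str]:
    """Report imported names never referenced in the module body (rough sketch)."""
    tree = ast.parse(source)
    imported, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imported.update(a.asname or a.name.split(".")[0] for a in node.names)
        elif isinstance(node, ast.ImportFrom):
            imported.update(a.asname or a.name for a in node.names)
        elif isinstance(node, ast.Name):
            used.add(node.id)  # attribute bases like `json` in `json.dumps` land here
    return sorted(imported - used)

snippet = "import os\nimport json\nprint(json.dumps({}))\n"
print(unused_imports(snippet))  # ['os']
```

Each prompt iteration tends to add imports for code that a later iteration deleted, so running this (or the linter's equivalent rule) after the final iteration is the cheap cleanup step.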
Docstring and Comment Accuracy
Optional: Verify that generated comments correctly describe the logic and are not outdated residue from previous prompt iterations.
Naming Convention Audit
Recommended: Ensure variable and function names follow the project's specific naming patterns (e.g., camelCase vs snake_case).
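A convention check is one regex per rule. The sketch below assumes a snake_case project; note that SCREAMING_SNAKE constants would need their own rule rather than being flagged as violations:

```python
import re

SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")

def violates_snake_case(names: list[str]) -> list[str]:
    """Flag identifiers that break an assumed snake_case convention.
    (Constants like MAX_RETRIES need a separate SCREAMING_SNAKE rule.)"""
    return [n for n in names if not SNAKE_CASE.match(n)]

print(violates_snake_case(["fetch_user", "fetchUser", "MAX_RETRIES"]))
```

Models frequently carry the convention of their training data's dominant language into yours (camelCase into a Python codebase is the classic case), so this check earns its keep on mixed-language teams.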
Context and Tooling Optimization
Model Context Configuration
Recommended: Verify that .cursorrules or .github/copilot-instructions files are updated with the latest architectural patterns to guide the AI effectively.
Prompt Versioning
Optional: For complex tasks, document the system prompt or multi-step sequence used to generate the code for future reproducibility.
External Documentation Indexing
Recommended: Ensure the AI tool has access to the latest documentation versions for niche libraries via @docs or similar context-loading features.
Model Selection Verification
Recommended: Confirm that the most capable model (e.g., Claude 3.5 Sonnet or GPT-4o) was used for complex logic rather than faster, smaller models.
Environment Parity Check
Critical: Verify the AI was prompted with the correct runtime environment (e.g., Node 20 vs Node 16) to avoid syntax errors and unsupported API usage.
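Beyond prompting correctly, generated code can assert its assumed runtime at startup. A Python analogue of the Node 20 vs Node 16 check, with the 3.9 floor an assumption for this sketch:

```python
import sys

def check_runtime(required=(3, 9), actual=None):
    """Fail fast when the interpreter is older than the version the AI was
    prompted for (the 3.9 floor here is an assumed example)."""
    actual = actual if actual is not None else sys.version_info[:2]
    if actual < required:
        raise RuntimeError(
            f"generated code targets Python {required[0]}.{required[1]}+, "
            f"found {actual[0]}.{actual[1]}")
    return True

check_runtime()  # raises immediately on a mismatched interpreter
```

A one-line guard like this turns a confusing mid-execution SyntaxError or missing-API crash into an immediate, explicit failure.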
Workflow and Peer Review
AI Attribution Tagging
Recommended: Label pull requests or commits with an 'AI-generated' tag to alert reviewers to look for specific LLM failure modes.
Mandatory Human Sign-off
Critical: Require at least one human peer review for any AI-generated code that touches the data persistence or authentication layers.
Refactoring Verification
Critical: When using AI for refactoring, perform a side-by-side diff to ensure no functional logic was accidentally altered during the cleanup.
Performance Benchmarking
Recommended: Run a performance profile on AI-generated algorithms to ensure they do not introduce O(n^2) or worse complexity where O(n) is expected.
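A common instance of this item is duplicate detection: models often emit the nested-loop form because it reads clearly on tiny examples. The sketch below contrasts it with the set-based O(n) equivalent and times both with `timeit`:

```python
import timeit

def has_duplicates_quadratic(items):
    # O(n^2): the shape an LLM often emits because it looks fine on small inputs.
    return any(items[i] == items[j]
               for i in range(len(items))
               for j in range(i + 1, len(items)))

def has_duplicates_linear(items):
    # O(n): set-based replacement with identical semantics for hashable items.
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False

data = list(range(2000))  # worst case for both: no duplicates present
print("quadratic:", timeit.timeit(lambda: has_duplicates_quadratic(data), number=5))
print("linear:   ", timeit.timeit(lambda: has_duplicates_linear(data), number=5))
```

The profiling step matters precisely because both versions pass identical unit tests; only a benchmark or profile on realistic input sizes exposes the difference.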
Legal and License Compliance
Recommended: Check generated code against open-source license filters (e.g., Copilot's public code filter) to avoid GPL contamination in proprietary repos.