100 Prompt Engineering Resources for Developers
This resource guide provides developers with the specific patterns, evaluation frameworks, and implementation tools required to move beyond basic chat interfaces and build reliable, production-grade LLM applications. It focuses on systematic prompt engineering, structured output enforcement, and regression testing.
Core Prompting Patterns and Techniques
- 1. Chain-of-Thought (CoT) Reasoning (beginner, high): Instruct the model to 'think step-by-step' before providing a final answer to improve performance on logic and math tasks.
- 2. Few-Shot In-Context Learning (beginner, high): Provide 3-5 high-quality examples of input-output pairs within the prompt to define specific formatting and stylistic requirements.
- 3. XML Tagging for Context Separation (beginner, standard): Use tags like <context>, <instruction>, and <examples> to help models like Claude 3.5 Sonnet distinguish between metadata and task data.
- 4. Chain-of-Verification (CoVe) (advanced, medium): A multi-step process in which the model generates a response, identifies its own potential errors, and verifies facts before producing the final output.
- 5. Skeleton-of-Thought (SoT) (intermediate, medium): Prompt the model to generate a high-level outline first, then expand each section, reducing latency and improving structural coherence.
- 6. Negative Constraints and Guardrails (beginner, standard): Explicitly list forbidden words, topics, or formats (e.g., 'Do not use markdown') to prevent unwanted output behavior.
- 7. Persona-Based System Prompts (beginner, standard): Define a specific professional role (e.g., 'You are a Senior Site Reliability Engineer') to prime the model for the right technical jargon and tone.
- 8. Self-Consistency Sampling (advanced, high): Generate multiple outputs for the same prompt and use a majority vote or an LLM-as-a-judge to select the most accurate result.
- 9. Dynamic Context Window Management (intermediate, high): Use RAG or sliding windows to pass only the most relevant 20-30% of document context and avoid the 'lost in the middle' phenomenon.
- 10. Output Formatting via Pydantic (intermediate, high): Force the model to return valid JSON by providing a strict schema, often implemented via libraries like Instructor.
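Several of the patterns above compose naturally in a single prompt. The sketch below combines few-shot examples, XML tag separation, and a chain-of-thought instruction; the tag names and example data are illustrative choices, not a fixed convention required by any provider.

```python
# Sketch: build a prompt combining few-shot examples, XML tag
# separation, and a chain-of-thought instruction. All tag names
# (<instruction>, <examples>, <question>, ...) are arbitrary labels.

FEW_SHOT_EXAMPLES = [
    {"input": "2 + 2 * 3", "output": "8"},
    {"input": "(2 + 2) * 3", "output": "12"},
]

def build_prompt(question: str) -> str:
    examples = "\n".join(
        f"<example>\n  <input>{ex['input']}</input>\n"
        f"  <output>{ex['output']}</output>\n</example>"
        for ex in FEW_SHOT_EXAMPLES
    )
    return (
        "<instruction>\n"
        "You are a careful math assistant. Think step by step inside\n"
        "<thinking> tags, then give only the final number in <answer> tags.\n"
        "</instruction>\n"
        f"<examples>\n{examples}\n</examples>\n"
        f"<question>{question}</question>"
    )

prompt = build_prompt("7 * (3 + 1)")
```

The assembled string would then be sent as a single user (or system) message; the model's `<answer>` block can be pulled out with a simple regex on the response.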
Evaluation and Observability Tools
- 1. Promptfoo CLI (intermediate, high): A matrix testing tool that lets you run prompts against multiple models and check outputs against predefined assertions.
- 2. LangSmith (LangChain) (intermediate, high): A platform for tracing, debugging, and evaluating LLM applications, with integrated dataset management for regression testing.
- 3. Braintrust (advanced, high): An enterprise-grade stack for logging LLM calls, managing prompt versions, and running automated evaluation workflows.
- 4. DeepEval (Pytest for LLMs) (intermediate, medium): A Python framework for unit testing LLM outputs against metrics like faithfulness, relevance, and hallucination scores.
- 5. Helicone (beginner, standard): An open-source observability proxy that tracks costs, latency, and token usage for OpenAI and Anthropic requests.
- 6. PromptLayer (beginner, medium): Middleware for logging and versioning prompts, allowing developers to roll back to previous prompt iterations without code changes.
- 7. Giskard (advanced, medium): An open-source library for detecting vulnerabilities, hallucinations, and biases in LLM-based applications.
- 8. Honeycomb LLM Observability (advanced, standard): Distributed tracing for LLMs, used to understand the latency impact of each step in a complex multi-agent chain.
- 9. Weights & Biases Prompts (intermediate, standard): A suite of tools for visualizing and inspecting the execution flow of LLM chains and pipelines.
- 10. Arize Phoenix (advanced, medium): Open-source observability for RAG and LLM applications, focused on embedding visualization and retrieval evaluation.
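The core idea these evaluation tools share, run each prompt through the model and check the output against assertions, can be sketched in a few lines. This is a hand-rolled illustration of the pattern, not the actual API of Promptfoo or DeepEval; `call_model` is a stub standing in for a real provider client.

```python
# Sketch of the assertion-based regression testing that tools like
# Promptfoo and DeepEval automate. `call_model` is a stub; a real
# harness would call an OpenAI/Anthropic client here.

def call_model(prompt: str) -> str:
    # Stub response so the example is self-contained and deterministic.
    return '{"sentiment": "positive"}'

TEST_CASES = [
    {
        "prompt": "Classify: 'I love this product.' Return JSON.",
        "assertions": [
            lambda out: "positive" in out,          # content check
            lambda out: out.strip().startswith("{"),  # format check
        ],
    },
]

def run_suite(cases):
    """Return a list of (prompt, assertion_index) pairs that failed."""
    failures = []
    for case in cases:
        output = call_model(case["prompt"])
        for i, check in enumerate(case["assertions"]):
            if not check(output):
                failures.append((case["prompt"], i))
    return failures

failures = run_suite(TEST_CASES)
```

In practice you would run such a suite in CI on every prompt change, which is exactly the regression-testing workflow the tools above provide with dataset management and model-matrix support on top.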
Structured Output and Tooling
- 1. Instructor Library (beginner, high): The industry standard for getting structured data (JSON) from LLMs, using Pydantic models in Python or Zod in TypeScript.
- 2. Vercel AI SDK (generateObject) (beginner, high): A unified interface for generating type-safe JSON objects across OpenAI, Anthropic, and Google Gemini models.
- 3. Outlines (by .txt) (advanced, high): A library for neural text generation that uses regex and context-free grammars to guarantee syntactically valid JSON or code.
- 4. LMQL (Language Model Query Language) (advanced, medium): A programming language that combines logic programming with LLM generation to constrain output at the token level.
- 5. Guidance (Microsoft) (intermediate, medium): A template language for controlling LLMs, allowing you to interleave generation, control flow, and fixed strings.
- 6. Marvin AI (beginner, standard): A lightweight library that uses LLMs to power standard Python functions, handling the prompt engineering under the hood.
- 7. TypeChat (intermediate, standard): Microsoft's library for replacing complex prompt engineering with schema-based type definitions that guide model output.
- 8. Jsonformer (intermediate, medium): A library that generates only the values of a pre-defined JSON schema, keeping the structure fixed and saving tokens.
- 9. SGLang (advanced, high): A fast backend and frontend for structured generation, optimized for high-throughput serving and complex control flow.
- 10. Portkey AI Gateway (intermediate, medium): A control plane for managing multiple LLMs, with built-in retries, fallbacks, and load balancing across providers.
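Under the hood, libraries like Instructor wrap a validate-and-retry loop: parse the model's JSON, check it against a schema, and on failure re-prompt with the validation error. The sketch below shows that loop with a hand-rolled validator and a stubbed model (`fake_model`, `REQUIRED_FIELDS`, and the retry wording are all illustrative, not Instructor's real API, which uses Pydantic models directly).

```python
import json

# Sketch of the validate-and-retry loop that structured-output
# libraries automate. `fake_model` is a stub: it returns incomplete
# JSON on the first call and a valid object on the retry.

REQUIRED_FIELDS = {"name": str, "age": int}

def validate(raw: str) -> dict:
    data = json.loads(raw)
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"field '{field}' must be {ftype.__name__}")
    return data

def fake_model(prompt: str) -> str:
    fake_model.calls = getattr(fake_model, "calls", 0) + 1
    if fake_model.calls == 1:
        return '{"name": "Ada"}'              # missing "age" -> fails
    return '{"name": "Ada", "age": 36}'       # valid on retry

def extract(prompt: str, max_retries: int = 2) -> dict:
    last_error = None
    for _ in range(max_retries + 1):
        full_prompt = (prompt if last_error is None
                       else f"{prompt}\nFix this error: {last_error}")
        raw = fake_model(full_prompt)
        try:
            return validate(raw)
        except (json.JSONDecodeError, ValueError) as err:
            last_error = err
    raise RuntimeError(f"no valid output after retries: {last_error}")

result = extract("Extract the person as JSON with fields "
                 "name (str) and age (int).")
```

Swapping the manual validator for a Pydantic model and the stub for a real client call gives you the essential behavior of the libraries in this section; constrained-decoding tools like Outlines and Jsonformer go further by making invalid output impossible at the token level, so no retry is needed.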