
prompt-atlas Index

Auto-generated by scripts/build_index.py. Do not edit by hand.

Total cards: 100

Cards by direction

RAG (rag)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Answer Grounding Checker (hallucination detector) | You want to detect hallucinations in a RAG answer by checking each claim against the retrieved context | stable | grounding, factuality, scoring, structured-output, eval-set |
| Chunk Summarizer for Retrieval | You're building a RAG index and want to store a search-friendly summary alongside (or instead of) the raw chunk text | stable | retrieval, generation, structured-output, synthesis |
| Citation Faithfulness Scorer | You want to audit whether a citation actually supports the claim it was attached to | stable | citation, factuality, scoring, grounding, structured-output |
| Retrieved Context Compression | Your retriever returns more text than fits comfortably in the LLM's context window, or you want to focus the model's attention on the spans actually relevant to the question | stable | retrieval, generation, synthesis, structured-output |
| Conversational Query Resolver (rewrite follow-ups as standalone) | You're running RAG in a multi-turn conversation and the latest user turn references earlier context (pronouns, "that one", "what about...") — direct retrieval would fail | stable | query-rewriting, retrieval, structured-output, generation |
| HyDE — Hypothetical Answer Generator for Retrieval | You want to generate a hypothetical answer to embed and use as a search vector (HyDE technique) | experimental | retrieval, query-rewriting, generation, synthesis |
| Multi-Source Answer Aggregator (with conflict surfacing) | You have multiple retrieved sources and need to compose an answer that handles conflicts, complements, and redundancies between them — instead of cherry-picking one | stable | synthesis, retrieval, citation, structured-output |
| Multi-hop RAG Eval Question Synthesizer | You want to generate a multi-hop QA evaluation question from two related passages | experimental | multi-hop, synthesis, eval-set, generation |
| Query Fusion (combine sub-query results into ranked set) | You decomposed a query into sub-queries, retrieved separately, and now need to fuse the per-sub-query result sets into one deduplicated, ranked set | stable | retrieval, synthesis, structured-output, ranking |
| Query Rewriting and Decomposition for Retrieval | You want to split a single complex query into focused sub-queries before retrieval | stable | query-rewriting, retrieval, decomposition, structured-output |
| Retrieval Relevance Evaluator | You want to score whether a retrieved passage is relevant to a search query | stable | retrieval, scoring, eval-set, grounding |
| Structured RAG Output Builder (table / list / schema from evidence) | The user's question implies a STRUCTURED answer (comparison table, list, fielded record) and you want the answer in that exact shape with citations, not free prose | stable | synthesis, retrieval, structured-output, extraction |
| Time-Aware Retrieval Query Rewriter | Your RAG handles queries with time-relative phrases ("latest", "last quarter", "this year") and you need to resolve them into concrete time bounds before retrieval | stable | query-rewriting, retrieval, structured-output, generation |
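
For orientation, the fusion step named in the Query Fusion card can be approximated without an LLM. The sketch below uses reciprocal rank fusion (RRF), a swapped-in baseline technique and not the card's prompt. The function name, the `k=60` default, and the document ids are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked lists of document ids into one deduplicated ranking.

    Each list contributes 1 / (k + rank) for every document it contains.
    `k` is a smoothing constant; 60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for results in result_lists:
        seen = set()
        for rank, doc_id in enumerate(results, start=1):
            if doc_id in seen:
                continue  # count a document once per sub-query list
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-sub-query results (document ids are made up).
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a", "d"], ["b", "e"]])
```

A rank-based baseline like this may be useful as a comparison point when evaluating the prompt-based card, since it needs no relevance scores from the retriever.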

Agent (agent)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| API Result to User-Readable Translator | Your agent called an API and got back structured data; you need to translate it into a user-readable answer that addresses the original question | stable | generation, structured-output, tool-use |
| API Spec to Tool Catalog Converter | You have an OpenAPI / Swagger / JSON Schema spec and want a tool catalog ready to paste into an agent loop, without hand-writing each tool | stable | tool-use, extraction, generation, structured-output |
| Budget-Aware Agent Planner | You're running an agent under explicit token or dollar budget and need a plan that completes the goal within budget — not a maximalist plan that overshoots | experimental | planning, decomposition, structured-output |
| Clarification Question Asker | Your agent receives an ambiguous goal and you need it to decide whether to proceed, ask one good clarifying question, or refuse | stable | planning, classification, structured-output |
| Error Recovery Strategy (retry / abort / escalate) | An agent operation just failed (tool call, API call, database query) and you need to decide whether to retry, abort the goal, or escalate to a human | stable | reflection, planning, structured-output |
| Long-Context Trajectory Memory Summarizer | Your agent's trajectory is approaching the context window limit and you need to compress earlier history into a structured memory record | stable | memory, generation, structured-output, decomposition |
| Multi-Agent Conflict Resolver | You're orchestrating multiple agents (sub-task delegation, parallel reasoning) and they produced conflicting outputs that need reconciliation before continuing | experimental | planning, classification, structured-output, decomposition |
| Plan-and-Execute Upfront Planner | Your agent's goal is predictable enough to plan upfront instead of step-by-step (linear dependencies, known sub-problems) | stable | planning, decomposition, structured-output |
| ReAct Planner with Strict Tool Call Schema | You want an agent to emit one strict-JSON tool call per step in a ReAct loop, with a visible reasoning summary | stable | planning, tool-use, react, structured-output, decomposition |
| Self-Critique Reflection Step for Agents | Your agent has taken several steps and you want a meta-level "are we on track" check before continuing | stable | reflection, self-check, planning, structured-output |
| Sub-Task Delegator (multi-agent prep) | A user task is too complex for one agent and you want to split it across specialized workers / agents (multi-agent foundation) | experimental | planning, decomposition, structured-output |
| Tool-Call Repair from Validation Error | A tool call failed schema validation and you want to repair it before escalating to a strategy reflection | stable | tool-use, structured-output, extraction |
| Tool Output Summarizer (compress before context) | A tool returned verbose output (JSON blob, file listing, search results, long fetch response) and you want to compress it down to what the agent actually needs for the next step | stable | generation, memory, structured-output |

RLHF (rlhf)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Best-of-N Response Selector | You have N candidate responses to the same prompt and need to pick the best (for inference-time best-of-N or for RLHF preference data) | stable | ranking, scoring, helpfulness, harmlessness, structured-output |
| Constitutional Critique-and-Revise | You want Constitutional AI training data (or to clean up a response at inference) by critiquing it against principles and revising | stable | harmlessness, helpfulness, honesty, generation, structured-output, self-check |
| Helpfulness vs Harmlessness Tradeoff Scorer | You suspect a model's response sacrificed helpfulness for caution OR was helpful but unsafe — the inherent HHH tradeoff axis. Critical for diagnosing over-aligned / under-aligne... | experimental | helpfulness, harmlessness, scoring, structured-output |
| Iterative DPO Pair Generator | You're doing iterative DPO and need to generate (chosen, rejected) pairs targeting a specific behavioral principle, using the current model's response as the rejected baseline | experimental | preference-labeling, pairwise, generation, helpfulness, structured-output |
| Long-Context Pairwise Preference Labeler | You're labeling preferences on long-form pairs (long context input + long-form responses, e.g. research summaries, long-form Q&A, multi-turn dialogue) where short-answer pairwis... | experimental | preference-labeling, pairwise, structured-output |
| Pairwise Preference Labeler (HHH dimensions) | You have two AI responses to the same prompt and want a preference label across helpful / harmless / honest dimensions | stable | preference-labeling, pairwise, helpfulness, harmlessness, honesty, scoring |
| Persona Consistency Judge | You're training or evaluating a model that should adhere to a defined persona / character / brand voice, and need to detect drift from that persona | stable | llm-judge, scoring, structured-output, classification |
| Pointwise Reward Scorer (single response → reward signal) | You only have a single response (no comparison pair) and need a scalar reward signal for reward model training data | experimental | reward-modeling, scoring, helpfulness, harmlessness, honesty, structured-output |
| Preference Rationalization Judge | You're auditing the quality of preference labels in your RLHF dataset and want to detect rationales that don't actually justify the labeler's pick — sign of noisy or rushed labe... | experimental | llm-judge, scoring, classification, structured-output |
| Red-Team Prompt Generator (defensive safety probes) | You're building a safety eval set or RLHF refusal-training dataset and need to probe a model's refusal behavior on a specific harm category. Defensive use only. | experimental | safety, harmlessness, generation, structured-output |
| Refusal Calibration Probe | You're evaluating whether a model refuses appropriately — neither over-refusing benign requests nor under-refusing genuinely unsafe ones. Critical for RLHF refusal-training data... | experimental | harmlessness, scoring, classification, structured-output |
| Reward Hacking Detector | You suspect a post-RLHF model is gaming the reward signal — looking good to the reward model but providing low actual value to users. Critical pre-launch diagnostic | experimental | reward-modeling, scoring, classification, structured-output |

SFT (sft)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Code SFT Pair Generator | You're building code SFT training data and need (instruction, response) pairs at controlled difficulty in a target language | stable | instruction-tuning, generation, data-augmentation, structured-output |
| Multi-Turn Conversation SFT Data Generator | You're training a chat model and need multi-turn conversation SFT data, not just single-turn QA pairs | experimental | instruction-tuning, generation, structured-output, data-augmentation |
| SFT Data Coverage Analyzer | You have an SFT dataset (or a sample of it) and want to know whether it covers the topics / skills it's supposed to, or whether it's lopsided | experimental | classification, scoring, structured-output, instruction-tuning |
| SFT Data Quality Filter | You have candidate (instruction, response) SFT pairs and want to filter them by quality before training | stable | instruction-tuning, scoring, classification, structured-output, safety |
| Few-Shot Example Selector (pick best K demonstrations from a pool) | You have a pool of (instruction, response) demonstrations and want to pick the best K for few-shot prompting a specific target query | stable | instruction-tuning, classification, ranking, structured-output |
| SFT Instruction Deduplicator | You have an SFT instruction dataset and want to find near-duplicates at the SEMANTIC level (paraphrases, synonymous tasks) — not just exact-string duplicates | stable | classification, instruction-tuning, structured-output |
| Instruction Difficulty Classifier | You're building curriculum training data, stratifying a benchmark, or selecting active-learning candidates and need a per-instruction difficulty label calibrated to a target mod... | stable | classification, scoring, instruction-tuning, structured-output |
| Instruction Variant Expander (seed → diverse rewrites) | You want to rewrite ONE instruction into N variants that preserve the underlying task but vary surface form, register, or style | stable | instruction-tuning, seed-expansion, data-augmentation, generation |
| Persona-Controlled Response Generator | You want responses that match a specific persona / brand voice / character — for chat-product training data, multi-persona systems, or branded assistants | stable | instruction-tuning, generation, structured-output |
| SFT Response Generator (instruction → high-quality response) | You need to produce the response half of an SFT pair given an instruction | stable | instruction-tuning, generation, structured-output |
| Self-Instruct — Generate New Instructions from a Seed Bank | You have a small bank of seed instructions and want to generate NEW instructions in the same task family (Self-Instruct technique) | stable | instruction-tuning, seed-expansion, generation, data-augmentation, structured-output |
| Style Transfer (rewrite text in target style) | You want to rewrite text into a specific style (formal/casual, terse/elaborate, persona-flavored) while controlling whether semantic meaning must be preserved exactly | stable | generation, data-augmentation, structured-output |

Multimodal (multimodal)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Chart and Table Extractor | You have an image of a chart, plot, or table (e.g. from a paper, dashboard screenshot, slide deck) and want the data as a structured object | stable | vision, extraction, structured-output, vlm-eval |
| Diagram to Structured Data | You have an image of a diagram (flowchart, architecture, ER, sequence, etc.) and want to extract its structure as nodes + edges, not free text | stable | vision, extraction, structured-output |
| Document Layout Analyzer | You want to understand a document page's STRUCTURE (where headers, body, tables, images live; what reading order is) — not extract specific fields | stable | vision, extraction, structured-output, vlm-eval |
| Handwriting Transcriber with Per-Word Confidence | You have an image of handwritten text (notes, forms, whiteboard, captured letters) and want a transcription with per-word confidence so downstream code can flag words for review | experimental | vision, ocr, extraction, structured-output |
| Custom-Category Image Classification | You want to classify images into your own custom categories (not a fixed pretrained label set) — content moderation, product catalog tagging, support-ticket image routing | stable | vision, classification, structured-output |
| Image Pair Comparison Explainer | You have two images and want a structured explanation of their similarities and/or differences (UI A/B comparison, product photo comparison, design variant analysis) | stable | vision, comparative, structured-output |
| Image Edit Instruction Generator (before/after to instruction) | You have a before/after image pair and want to generate the natural-language edit instruction that would produce the change — for image-edit-model training data, design diffs, o... | experimental | vision, generation, structured-output |
| OCR + Structured Extraction from Document Images | You want to extract a fixed set of typed fields from a document image (receipt, invoice, form, ID page) | stable | vision, ocr, extraction, structured-output |
| UI Screenshot to Component Spec | You have a UI screenshot (web/mobile/wireframe) and want a structured component spec — component tree, layout, interactions — instead of free-text description or raw code | experimental | vision, extraction, structured-output |
| Structured Image Caption Generator | You want a structured caption for an image — discrete fields like scene, subject, objects, action — instead of free-form text | stable | vision, image-description, generation, structured-output, extraction |
| VLM Image Description Verifier | You have a candidate image caption and want to audit which claims actually match the image | experimental | vision, image-description, vlm-eval, factuality, scoring |
| Visual Question Answering with Grounding and Confidence | You want to answer a question about an image AND know whether the image actually supports the answer (with grounding region and confidence) | stable | vision, vlm-eval, structured-output, factuality, scoring |

Chain-of-Thought (cot)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Citation-Grounded Reasoning (every claim must cite) | You need reasoning where EVERY factual claim cites a source — for academic, legal, medical, or compliance contexts where unsourced claims are unacceptable | stable | structured-reasoning, citation, grounding, structured-output |
| Contrastive Self-Consistency (compare against intentionally-wrong) | A question has plausible wrong reasoning paths (common confusions, popular misconceptions) and you want the model to actively contrast its answer against the wrong one for stron... | experimental | self-check, structured-reasoning, scoring, structured-output |
| Least-to-Most Decomposition | A complex problem can be solved by breaking it into a chain of strictly easier sub-problems where each can use earlier answers | stable | decomposition-cot, structured-reasoning, structured-output |
| Meta-Prompt Generator (generate prompts for a class of tasks) | You're starting a new prompt-engineering task and want a meta-prompt template generated from a description + examples, instead of writing it from scratch | experimental | generation, structured-output, instruction-tuning |
| Plan Critique and Revise | You've generated a plan (least-to-most decomposition, agent plan, multi-step reasoning) and want to critique-and-revise before execution to catch issues cheaply | stable | self-check, structured-reasoning, structured-output, decomposition-cot |
| Self-Consistency Aggregator (majority vote over reasoning paths) | You've sampled N candidate answers to the same question (with temperature) and want to take a majority vote | stable | structured-reasoning, self-check, rationale-summary, structured-output |
| Self-Correction Protocol (accept / correct / reject) | You have a candidate answer and external criticism (from another model, a human reviewer, or a rule check) and need to decide whether to accept, correct, or reject the candidate | stable | self-check, structured-reasoning, structured-output, classification |
| Step-Back Prompting (abstract first, then solve) | A question's surface details might mislead direct reasoning, and reasoning from a more general principle would be more reliable | stable | structured-reasoning, decomposition-cot, structured-output |
| Structured Reasoning with Rationale Summary | You want the model to decompose its reasoning into named sub-steps and emit a summary rationale (not hidden chain-of-thought) | stable | structured-reasoning, rationale-summary, decomposition-cot, structured-output |
| Tree-of-Thoughts (branch + evaluate + prune) | A problem has multiple plausible reasoning paths and a single linear chain might miss the right one — combinatorial planning, search, design problems with trade-offs | experimental | decomposition-cot, self-check, structured-reasoning, structured-output |
| Reasoning with Explicit Uncertainty Quantification | You need not just an answer but a calibrated sense of which parts of the reasoning are solid vs guessed — for high-stakes decisions, scientific Q&A, or claims with downstream co... | experimental | structured-reasoning, self-check, scoring, structured-output |
| Verify-Then-Finalize (self-check before commit) | A task is error-prone (math, units, edge cases) and you want a draft + explicit verification before committing to the final answer | stable | self-check, structured-reasoning, factuality, structured-output |
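
The majority vote described in the Self-Consistency Aggregator card can be sketched in plain Python. This is a minimal illustration, not the card's prompt: the function name, the lowercase/whitespace normalization, and the sample answers are assumptions.

```python
from collections import Counter


def majority_vote(answers):
    """Return (winning_answer, vote_share) over N sampled final answers.

    Answers are normalized by stripping whitespace and lowercasing
    (an assumption; real answers may need task-specific normalization).
    Ties go to the answer that appeared first among the samples.
    """
    if not answers:
        raise ValueError("need at least one candidate answer")
    normalized = [a.strip().lower() for a in answers]
    winner, votes = Counter(normalized).most_common(1)[0]
    return winner, votes / len(normalized)


# Hypothetical final answers extracted from five sampled reasoning paths.
winner, share = majority_vote(["42", " 42 ", "41", "42", "40"])
```

The vote share can serve as a rough agreement signal: a low share suggests the sampled reasoning paths disagree and the answer may deserve a second look.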

Evaluation (eval)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| Calibration Checker (predicted confidence vs actual accuracy) | You have a batch of model outputs with both predicted confidence AND actual correctness labels, and you want to check whether the confidence is calibrated (high-confidence outpu... | stable | llm-judge, scoring, comparative, structured-output |
| Human Eval Study Bootstrap | You're standing up a human eval study for a task and want a structured study design (rubric, annotator instructions, sample size guidance, analysis plan) instead of figuring it... | experimental | llm-judge, rubric, structured-output |
| LLM Judge Bias Probe (length / position / format / verbosity) | You're using an LLM as a judge in production and want to verify it doesn't have systemic bias on length / position / format dimensions before trusting its scores | experimental | llm-judge, scoring, classification, structured-output |
| Multi-Benchmark Leaderboard Builder | You have model results across multiple benchmarks and want a leaderboard with weighted overall ranking, per-benchmark rankings, and analysis of where models are strong / weak | stable | comparative, scoring, structured-output |
| LLM-as-Judge Rubric for Open-Ended Outputs | You want a structured quality assessment of a single AI output across factuality / instruction-following / coherence / completeness | stable | llm-judge, rubric, holistic, scoring, factuality, coherence |
| Multi-Turn Dialogue Judge | You're evaluating a chat model and need to judge a multi-turn conversation, not just a single response | stable | llm-judge, rubric, scoring, holistic, structured-output, coherence |
| Pairwise Judge with Position-Bias Probe | You're running pairwise LLM-as-judge evaluation and want to detect / control for the well-known position bias (judge prefers whichever response is shown first) | stable | llm-judge, pairwise, comparative, scoring, structured-output |
| Per-claim Factuality Judge (atomic decomposition) | You want fine-grained factuality labels (true / false / unverifiable) for every atomic factual claim in an AI output | stable | llm-judge, factuality, scoring, structured-output, extraction |
| Pointwise Quality Scorer with Confidence | You want to score a single AI output on YOUR custom dimensions (not a fixed rubric) with self-reported confidence | stable | llm-judge, scoring, holistic, structured-output, coherence |
| Reference-based Judge (output vs gold) | You're scoring closed-form outputs (short-answer QA, math, structured extraction) against a known gold answer | stable | llm-judge, scoring, factuality, comparative, structured-output |
| Output-Level Regression Detector | You're testing a candidate model / prompt change against a baseline and need to detect quality regressions on specific dimensions, not just an overall vibe check | stable | llm-judge, scoring, comparative, structured-output |
| Domain-Specific Rubric Generator | You're starting a new evaluation task and want a structured rubric (with concrete level descriptions per dimension) instead of writing it by hand | experimental | rubric, generation, structured-output |
| Safety Output Classifier (defensive) | You want to classify whether an AI output should be allowed, reviewed, or blocked along an explicit harm taxonomy (defensive use only) | stable | safety, harmlessness, classification, llm-judge, structured-output |
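
For the confidence-versus-accuracy comparison named in the Calibration Checker card, one standard numeric baseline is expected calibration error (ECE). The sketch below is an assumption about how such a check could be computed in code, not a description of the card's prompt. The function name, the equal-width binning, and the 10-bin default are illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    `confidences` are assumed to lie in [0, 1]; `correct` holds booleans.
    Uses equal-width bins; a confidence of exactly 1.0 falls in the last bin.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if ok else 0.0))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(o for _, o in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical batch: four outputs at 0.9 confidence, three of them correct.
gap = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A value near zero suggests stated confidence tracks accuracy on this batch; small batches make the estimate noisy, so bin counts are worth inspecting alongside the score.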

Code (code)

| Card | Use when | Status | Tags |
| --- | --- | --- | --- |
| API Design Reviewer (REST / GraphQL / gRPC) | You're reviewing an API design (not implementation) and want structured findings on consistency, ergonomics, evolvability, security, and performance — calibrated to the API style | stable | code-review, scoring, structured-output |
| Code Evaluation Judge | You're evaluating AI-generated or contributor-submitted code against a task description (and optionally a reference solution + test cases) | stable | llm-judge, scoring, factuality, structured-output |
| Code Explanation Generator (audience-aware) | You want to explain a piece of code to a specific audience (new hire / PM / domain expert) at the right level — not too basic, not too jargon-heavy | stable | documentation, generation, structured-output |
| Code Review Checklist (structured findings) | You want a structured code review with per-dimension findings instead of a free-text "this looks good" reply | stable | code-review, scoring, structured-output, classification |
| Code Diff Summary for Pull Request | You have a git diff and want a structured PR description (summary, change list, risks, test suggestions) instead of free-form prose | stable | documentation, generation, structured-output |
| Code Translation Across Languages | You want to translate code from one language to another, with explicit control over how aggressively to adopt the target language's idioms | stable | generation, structured-output, extraction |
| Conventional Commit Message Generator | You're writing a commit message and want it generated from the diff in a specific style — conventional commits, simple imperative, or verbose with body | stable | documentation, generation, structured-output |
| Dependency Impact Analyzer | You're planning to change a function signature / API contract / shared type and want to know what breaks before you start | experimental | extraction, classification, structured-output |
| Error Message / Stack Trace Explainer | You have a confusing error message / stack trace and want a structured explanation calibrated to a specific audience (junior dev / senior / PM) | stable | documentation, generation, structured-output |
| Code Migration Plan Generator | You're planning a major version migration (framework upgrade, runtime upgrade, API spec migration) and want a phased plan based on your actual code rather than generic upgrade-g... | stable | code-review, generation, structured-output, decomposition |
| Refactor Suggestion (with rationale and diff hint) | You have working code that's not optimal on a specific axis (readability, performance, testability, modularity, type safety) and want concrete refactor suggestions with rationale | stable | code-review, generation, structured-output |
| Code Security Review (focused) | You want a focused security review (not generic code review) — looking specifically for vulnerabilities given a threat model | stable | code-review, scoring, classification, structured-output |
| Test Case Generator | You want to generate test cases for a function or class with explicit coverage of happy path, edge cases, and error handling | stable | test-generation, generation, structured-output |

Cards by tag

Code MIT · Prompt content CC-BY-4.0. See LICENSE.