
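🎯 Scenario: VLM evaluation. Audits whether a candidate caption actually matches the image: the caption is split into atomic claims, each claim is labeled supported / contradicted / unverifiable, and the output reports hallucination_rate and missing_elements. Requires a vision model.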

Quick Use

Use when: You have a candidate image caption and want to audit which claims actually match the image. Fill in: {{image}} = the image to check (passed as a vision input); {{candidate_description}} = the text caption to verify. You'll get: Per-claim labels (supported / contradicted / unverifiable), a list of missing salient elements, and a hallucination rate. Output is JSON. Requires a vision-language model.

Purpose

Given an image and a candidate description, decide which claims in the description are supported by the image, which are contradicted by it (hallucinated), and which cannot be verified from the image, then list salient visible elements the description leaves out. Used during VLM evaluation to produce per-claim factuality labels rather than coarse "good / bad" judgments. The output is structured so that claims can be aggregated into hallucination rate and coverage metrics.
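
As a rough illustration of that aggregation, the sketch below pools per-claim labels from many verifier outputs into corpus-level rates. The function name `aggregate` and the summary keys are hypothetical, and the card does not define a coverage metric, so the mean number of missing elements per caption is used here only as an assumed stand-in.

```python
from collections import Counter

LABELS = ("supported", "contradicted", "unverifiable")


def aggregate(results):
    """Pool per-claim labels from many verifier outputs (hypothetical helper).

    `results` is a list of dicts shaped like the JSON this card returns.
    """
    counts = Counter()
    missing = 0
    for result in results:
        counts.update(claim["label"] for claim in result["claims"])
        missing += len(result.get("missing_elements", []))

    # Only the three labels defined by the card count toward the total.
    total = sum(counts[label] for label in LABELS)
    summary = {"total_claims": total}
    for label in LABELS:
        summary[label + "_rate"] = counts[label] / total if total else 0.0

    # Assumed coverage proxy: fewer missing elements per caption is better.
    summary["mean_missing_elements"] = missing / len(results) if results else 0.0
    return summary
```

Pooling claims across captions gives a micro-averaged rate, so long captions weigh more than short ones; averaging each caption's own hallucination_rate instead would weigh every caption equally.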

Prompt

text
You verify whether a candidate description matches an image. Look at the
image and decompose the candidate into atomic claims, then label each claim.

Image: {{image}}

Candidate description:
{{candidate_description}}

Steps:
1. Split the candidate description into atomic claims (one fact per claim).
2. For each claim, output: "supported" (image clearly shows this), "contradicted"
   (image shows the opposite), or "unverifiable" (image neither confirms nor
   denies; e.g. claims about names, intentions, or things outside the frame).
3. Then list "missing_elements" — salient visible elements that the candidate
   failed to mention (cap at 5).

Return ONLY this JSON object:
{
  "claims": [
    {"text": "<claim 1>", "label": "supported" | "contradicted" | "unverifiable"},
    /* ... */
  ],
  "missing_elements": ["<element 1>", "<element 2>"],
  "hallucination_rate": <fraction 0.0-1.0: contradicted / total claims>,
  "decision_basis": "<one sentence, <=30 words>"
}

Example

Input:

text
image: <photo of a brown dog on a green lawn>
candidate_description: "A black dog runs across a snowy field while chasing a red ball."

Expected output:

json
{
  "claims": [
    {"text": "The dog is black", "label": "contradicted"},
    {"text": "The dog is running", "label": "unverifiable"},
    {"text": "The setting is a snowy field", "label": "contradicted"},
    {"text": "The dog is chasing a red ball", "label": "contradicted"}
  ],
  "missing_elements": ["green lawn", "brown coat color"],
  "hallucination_rate": 0.75,
  "decision_basis": "Three of four claims contradict the image, which shows a brown dog on grass."
}
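
A model-reported hallucination_rate can disagree with the labels it sits next to, so it may be worth recomputing it from the claims. The following is a minimal sketch, assuming the reply is the bare JSON object requested by the prompt; `check_verifier_output` and its `tolerance` parameter are hypothetical names, not part of this card.

```python
import json

ALLOWED_LABELS = {"supported", "contradicted", "unverifiable"}


def check_verifier_output(raw, tolerance=0.01):
    """Parse a verifier reply and return (data, problems).

    Hypothetical helper: checks labels, the cap on missing_elements, and
    recomputes hallucination_rate as contradicted / total claims.
    """
    data = json.loads(raw)
    problems = []

    claims = data.get("claims", [])
    if not claims:
        problems.append("no claims returned")
    for i, claim in enumerate(claims):
        if claim.get("label") not in ALLOWED_LABELS:
            problems.append(f"claim {i} has unknown label {claim.get('label')!r}")

    if len(data.get("missing_elements", [])) > 5:
        problems.append("more than 5 missing_elements")

    if claims:
        contradicted = sum(1 for c in claims if c.get("label") == "contradicted")
        recomputed = contradicted / len(claims)
        reported = data.get("hallucination_rate")
        if not isinstance(reported, (int, float)) or abs(reported - recomputed) > tolerance:
            problems.append(f"reported rate {reported!r} != recomputed {recomputed:.2f}")
        # Trust the value derived from the labels over the model's arithmetic.
        data["hallucination_rate"] = recomputed

    return data, problems
```

Run on the expected output above, this sketch should report no problems, since three of the four claims are contradicted and the reported rate is 0.75.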

Failure Modes

  • Claim atomization failure — model lumps multiple facts into one claim ("a black dog running on snow"), making per-claim labels unscorable. Mitigation: add 1–2 few-shots showing fine-grained decomposition.
  • Over-labeling unverifiable as supported — VLM is confident about inferred properties (mood, age, intent) that the image cannot actually prove. Mitigation: explicit examples of what unverifiable means (names, intentions, off-frame context).
  • Position / recency bias on long descriptions — later claims get less attention. Mitigation: cap candidate length or chunk and verify per chunk.
  • Hallucinated missing elements — model "remembers" elements that are not visible. Mitigation: cap missing_elements at 5 and audit a sample.
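
The chunk-and-verify mitigation for long descriptions could look roughly like the sketch below. It is an assumption-laden illustration: the sentence splitter is a naive regex, `chunk_description` and `merge_chunk_results` are hypothetical helpers, and each chunk is assumed to be sent to the verifier separately with the same image.

```python
import re


def chunk_description(description, max_sentences=3):
    """Split a long candidate into chunks of at most `max_sentences` sentences."""
    # Naive split on whitespace that follows ., ! or ? (assumption: plain prose).
    parts = re.split(r"(?<=[.!?])\s+", description.strip())
    sentences = [p.strip() for p in parts if p.strip()]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]


def merge_chunk_results(chunk_results):
    """Concatenate per-chunk claims and recompute the overall rate."""
    claims = [claim for result in chunk_results for claim in result["claims"]]
    contradicted = sum(1 for claim in claims if claim["label"] == "contradicted")
    rate = contradicted / len(claims) if claims else 0.0
    return {"claims": claims, "hallucination_rate": rate}
```

The merge step deliberately drops missing_elements: a chunk only sees part of the caption, so elements it reports as missing may be covered by another chunk. Under this scheme that field would need a separate pass over the full description.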

Tuning Notes

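  • Model differences: strong VLMs (GPT-4V / Claude Vision / Gemini Pro Vision) are noticeably more reliable at atomic claim decomposition; weaker VLMs tend to make a holistic judgment instead of checking claim by claim.
  • Temperature: 0.0; stability matters most for a verifier role.
  • Evaluation pairing: this card is usually paired with a generator card (a VLM that produces the caption); run the generator at a high temperature and the verifier at a low one.
  • Use as a training signal: using hallucination_rate as a negative reward signal is common, but the verifier's own hallucinations propagate into that signal, so keep manual spot checks.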
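
One way the training-signal note might translate into code is sketched below: the rate becomes a negative reward, and a fixed-seed random sample of verifier outputs is set aside for human review. The names `hallucination_reward` and `sample_for_spot_check`, the `scale` factor, and the 5% default fraction are all assumptions for illustration, not values from this card.

```python
import random


def hallucination_reward(result, scale=1.0):
    """Map a verifier output to a reward in [-scale, 0] (assumed linear penalty)."""
    return -scale * float(result["hallucination_rate"])


def sample_for_spot_check(results, fraction=0.05, seed=0):
    """Pick a reproducible random subset of verifier outputs for manual audit."""
    if not results:
        return []
    rng = random.Random(seed)  # fixed seed so the audit set can be regenerated
    k = min(len(results), max(1, round(len(results) * fraction)))
    return rng.sample(results, k)
```

Comparing human labels on the sampled items against the verifier's labels gives a rough estimate of how much verifier error is leaking into the reward.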

Changelog

  • 0.1.0 — Initial card.

Code MIT · Prompt content CC-BY-4.0. See LICENSE.