Verify {{model}} claims in {{response_set}} using multiple passes.

Pass 1: Check factual claims against {{knowledge_base}}.
Pass 2: Check logical consistency within the response.
Pass 3: Check for unsupported extrapolations.

Flag claims where 2+ passes identify issues. Calculate the consensus hallucination rate. Report high-confidence hallucinations with evidence from multiple passes.
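The flagging rule and the consensus rate in the prompt above can be sketched in Python as follows. This is a minimal illustration and not part of the prompt itself: the names `ClaimResult`, `flag_claims`, and `consensus_hallucination_rate` are hypothetical, as are the pass labels. The prompt does not define the rate, so the sketch assumes it means flagged claims divided by all claims checked.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimResult:
    """One claim plus the issues found by each verification pass (hypothetical structure)."""
    claim: str
    # Maps a pass label to the evidence it recorded.
    # A pass appears here only if it identified an issue with the claim.
    issues: dict = field(default_factory=dict)


def flag_claims(results, min_passes=2):
    """Return the claims that at least `min_passes` passes identified as problematic."""
    return [r for r in results if len(r.issues) >= min_passes]


def consensus_hallucination_rate(results, min_passes=2):
    """Assumed definition: flagged claims / total claims checked."""
    if not results:
        return 0.0
    return len(flag_claims(results, min_passes)) / len(results)


# Illustrative data only; the claims and evidence strings are made up.
results = [
    ClaimResult("Claim A"),
    ClaimResult("Claim B", {"factual": "not found in knowledge base"}),
    ClaimResult(
        "Claim C",
        {
            "factual": "contradicts knowledge base entry",
            "extrapolation": "conclusion goes beyond the cited source",
        },
    ),
    ClaimResult("Claim D"),
]

flagged = flag_claims(results)
rate = consensus_hallucination_rate(results)

for r in flagged:
    # Report each high-confidence hallucination with evidence from every pass that caught it.
    print(r.claim, r.issues)
print(f"Consensus hallucination rate: {rate:.2f}")
```

Requiring agreement from two passes trades recall for precision: a claim caught by only one pass (Claim B here) is left out of the report, which matches the prompt's emphasis on high-confidence findings.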
Details
Category: Analysis
Use Cases: Hallucination verification, Multi-pass detection, Confidence scoring
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash