Rate hallucination severity in {{model}} responses using {{evaluation_dataset}}.

Severity calibration:
- Critical: {{critical_example}} (completely fabricated facts)
- Moderate: {{moderate_example}} (exaggerated claims)
- Minor: {{minor_example}} (slight inaccuracies)

Rate all hallucinations and calculate severity distribution.
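The final step of the prompt, calculating the severity distribution, can also be checked outside the model. The sketch below is one possible way to do that, assuming the ratings have already been collected as a list of plain severity labels. The function name `severity_distribution`, the lowercase label strings, and the sample `ratings` list are all hypothetical and chosen only for illustration; they are not part of the template.

```python
from collections import Counter

# Assumed label set, mirroring the three calibration levels in the prompt.
SEVERITIES = ("critical", "moderate", "minor")


def severity_distribution(ratings):
    """Return the count and share of each severity level.

    `ratings` is a list of label strings, one per rated hallucination.
    Labels are normalized to lowercase; anything outside SEVERITIES
    raises ValueError so that typos do not silently skew the shares.
    """
    counts = Counter(r.strip().lower() for r in ratings)
    unknown = set(counts) - set(SEVERITIES)
    if unknown:
        raise ValueError(f"unexpected severity labels: {sorted(unknown)}")
    total = sum(counts.values())
    return {
        level: {
            "count": counts[level],
            # Guard against division by zero when nothing was rated.
            "share": counts[level] / total if total else 0.0,
        }
        for level in SEVERITIES
    }


# Illustrative input only; these ratings are made up for the example.
ratings = ["critical", "minor", "minor", "moderate"]
dist = severity_distribution(ratings)
for level, stats in dist.items():
    print(f"{level}: {stats['count']} ({stats['share']:.0%})")
```

Reporting both the raw count and the share keeps small samples honest: a 50% share of minor hallucinations means something different for 4 ratings than for 400.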
Details
Category: Analysis
Use Cases: Hallucination rating, Severity assessment, Quality scoring
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash