Verify {{model}} reasoning on {{problem_set}} through multiple paths.

For each problem, generate 3 independent reasoning chains:
- Vary initial approach
- Use different intermediate steps
- Compare final answers

Accept answer only if 2+ paths agree. Calculate path agreement rate. Analyze disagreement sources for {{failure_analysis}}.
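The acceptance rule in the prompt (keep an answer only when 2 or more of the 3 chains agree) can be sketched in a few lines of Python. This is a minimal illustration, not part of the prompt itself: the function names `vote` and `summarize`, the sample answers, the whitespace/case normalization, and the definition of "path agreement rate" as the share of problems with an accepted answer are all assumptions.

```python
from collections import Counter


def vote(answers, min_agree=2):
    """Return (accepted_answer, votes); accepted_answer is None below the threshold."""
    # Assumed normalization: trim whitespace and ignore case before comparing.
    normalized = [a.strip().lower() for a in answers]
    top, votes = Counter(normalized).most_common(1)[0]
    return (top if votes >= min_agree else None), votes


def summarize(chains_by_problem, min_agree=2):
    """Apply the vote to every problem and compute an agreement rate."""
    accepted, disagreements = {}, []
    for problem_id, answers in chains_by_problem.items():
        answer, _ = vote(answers, min_agree)
        accepted[problem_id] = answer
        if answer is None:
            disagreements.append(problem_id)
    # Assumed definition: share of problems where min_agree or more chains matched.
    total = len(chains_by_problem)
    rate = (total - len(disagreements)) / total if total else 0.0
    return {"accepted": accepted, "agreement_rate": rate, "disagreements": disagreements}


# Hypothetical final answers extracted from 3 reasoning chains per problem.
chains = {
    "p1": ["42", "42", "41"],
    "p2": ["7", "8", "9"],
    "p3": [" Paris", "paris", "PARIS "],
}
summary = summarize(chains)
print(summary)
```

Problems listed under `disagreements` are the ones to feed into the failure analysis step; a stricter setup could pass `min_agree=3` to require unanimous chains.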
Details
Category: Analysis
Use Cases: Reasoning verification, Path consistency, Answer validation
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash