Rank {{model_list}} on {{task_type}} using pairwise comparisons.

Comparison examples:
- Task: {{example_task}}
  Model A output: {{output_a}} | Model B output: {{output_b}} → Winner: Model A (reason: {{reason}})

Complete all pairwise comparisons and produce final rankings with {{ranking_criteria}}.
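The template asks for every pairwise comparison and a final ranking, but leaves the aggregation rule to {{ranking_criteria}}. As a rough sketch of one way to turn pairwise winners into an ordering, the Python below enumerates each pair once and ranks by win count. Win counting is an assumption here, not something the template prescribes, and the model names, judgments, and the helpers `all_pairs` and `rank_by_wins` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def all_pairs(models):
    """Return every unordered pair of models, each compared exactly once."""
    return list(combinations(models, 2))


def rank_by_wins(models, winners):
    """Rank models by number of pairwise wins (most wins first).

    `winners` maps each (model_a, model_b) pair to the winning model.
    Models with equal win counts keep their order from `models`,
    because sorted() is stable.
    """
    wins = Counter({m: 0 for m in models})
    for pair, winner in winners.items():
        if winner not in pair:
            raise ValueError(f"winner {winner!r} is not in pair {pair!r}")
        wins[winner] += 1
    return sorted(models, key=lambda m: -wins[m])


# Hypothetical model names and judgments, for illustration only.
models = ["model-x", "model-y", "model-z"]
pairs = all_pairs(models)
winners = {
    ("model-x", "model-y"): "model-x",
    ("model-x", "model-z"): "model-z",
    ("model-y", "model-z"): "model-z",
}
ranking = rank_by_wins(models, winners)
print(ranking)
```

For n models this produces n(n-1)/2 comparisons, so three models need three judgments. Simple win counting does not resolve cycles (A beats B, B beats C, C beats A); in that case the tie-breaking rule would need to come from whatever is supplied as {{ranking_criteria}}.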
Details
Category: Analysis
Use Cases: Model ranking, Comparative evaluation, Performance ordering
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash