Benchmark Suite Creator

Priya Ramanathan

@priya-ramanathan

December 31, 2025

Design complete benchmark suites for model evaluation.

74 copies · 0 forks

Create a benchmark suite to evaluate {{model}} on {{capability}}.

Design the benchmark by:
1. Defining test categories covering {{evaluation_dimensions}}
2. Generating 10 representative test cases per category
3. Creating scoring rubrics for each category
4. Specifying baseline scores for comparison

Output a complete benchmark specification with administration instructions and {{interpretation_guide}}.
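
The prompt uses `{{name}}` slots that must be filled in before it is sent to a model. The following is a minimal sketch of one way to do that in Python. The function name `fill_prompt`, the regular expression, and the example values are illustrative assumptions, not part of the original prompt or of any particular tool.

```python
import re

# Matches slots written as {{name}}, the style used in the prompt above.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} slot in `template` with values[name].

    Raises KeyError if any slot has no value, so that an unfilled
    placeholder is never sent to a model by accident.
    """
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# First line of the prompt, copied verbatim; the full text works the same way.
first_line = "Create a benchmark suite to evaluate {{model}} on {{capability}}."

# Hypothetical example values, chosen only for illustration.
example_values = {
    "model": "a code-generation model",
    "capability": "multi-step reasoning",
}

print(fill_prompt(first_line, example_values))
```

Failing loudly on a missing value is a design choice: a prompt that still contains a literal `{{evaluation_dimensions}}` tends to produce vague output that is hard to trace back to its cause.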

Details

Category

Analysis

Use Cases

Benchmark creation, Suite design, Evaluation scaffolding

Works Best With

claude-opus-4.5, gpt-5.2, gemini-2.0-flash

Created December 31, 2025 · Updated January 2, 2026 · Shared December 31, 2025

Related Prompts

Meta-Prompt Benchmark Suite Creator

by @ethan-park

Creates comprehensive benchmark suites for evaluating prompts across capability areas and difficulty levels.

Reflection Comparative Benchmark Creator

by @ethan-park

Creates benchmarks to compare different reflection methods across standardized evaluation tasks.

Meta-Prompt: Benchmark Suite Generator

by @samira-el-masri

Generates prompts for creating RAG system benchmark suites.

Embedding Model Benchmark Template

by @samira-el-masri

Create a rigorous embedding model evaluation framework measuring retrieval quality, performance, and cost metrics for production RAG systems.

Customer Benchmark Report

by @aisha-bello

Creates benchmark reports comparing customer performance to industry peers.

Few-Shot Eval Criteria Generator

by @ethan-park

Learns evaluation patterns from examples to generate consistent, formalized evaluation criteria and frameworks.

More from @priya-ramanathan

Mitigation Strategy Branching

Instruction Complexity Scoring

Deployment Scenario Analysis

Capability Probe Designer
