Benchmark Designer Test Creation

Design benchmarks from a test designer's perspective.

As a Benchmark Designer, create an evaluation suite for {{model}} on {{capability}}.

Benchmark design:
- Test case diversity across {{difficulty_levels}}
- Ground truth generation methodology
- Scoring rubric development
- Statistical validity requirements
- Administration protocol

Provide a benchmark specification with validation criteria.
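
The prompt contains three placeholders: {{model}}, {{capability}}, and {{difficulty_levels}}. The following is a minimal sketch of filling them in before sending the prompt, assuming the double-brace markers are plain string substitutions. The helper name `fill_template` and all example values are hypothetical, and the template string is an abridged copy of the prompt.

```python
import re

# Abridged copy of the prompt; only the lines with placeholders are kept.
PROMPT_TEMPLATE = (
    "As a Benchmark Designer, create an evaluation suite "
    "for {{model}} on {{capability}}.\n"
    "\n"
    "Benchmark design:\n"
    "- Test case diversity across {{difficulty_levels}}\n"
)


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace every {{name}} placeholder with values[name].

    Raises KeyError if the template uses a placeholder that has no value,
    so an unfilled placeholder never reaches the model unnoticed.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder: {name}")
        return values[name]

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Hypothetical example values, chosen only for illustration.
filled = fill_template(
    PROMPT_TEMPLATE,
    {
        "model": "example-model",
        "capability": "multi-step arithmetic reasoning",
        "difficulty_levels": "easy, medium, and hard",
    },
)
print(filled)
```

Raising on a missing value is a deliberate choice in this sketch: a prompt sent with a literal {{capability}} still in it tends to produce a generic answer rather than an obvious failure.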

Details

Category

Analysis

Use Cases

Benchmark design, Test creation, Evaluation development

Works Best With

claude-opus-4.5, gpt-5.2, gemini-2.0-flash
