Benchmark Design Methodology

Design custom benchmarks through a systematic methodology.

73 copies, 0 forks
Design a custom benchmark for evaluating {{model}} on {{capability_area}}.

Step 1: Define what success looks like for {{use_case}}
Step 2: Identify measurable dimensions of performance
Step 3: Create diverse test cases covering edge cases
Step 4: Establish scoring rubrics with clear criteria
Step 5: Validate benchmark against {{reference_models}}
Step 6: Document administration and scoring procedures

Explain design rationale at each step.
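
The prompt above relies on four `{{...}}` placeholders, so it helps to check that every one is filled before sending it to a model. The following is a minimal sketch, assuming plain string substitution; the helper names `placeholders` and `render` and the example values are hypothetical and are not part of the original prompt page.

```python
import re

# The prompt text from this page, with its {{...}} placeholders left intact.
PROMPT_TEMPLATE = """Design a custom benchmark for evaluating {{model}} on {{capability_area}}.

Step 1: Define what success looks like for {{use_case}}
Step 2: Identify measurable dimensions of performance
Step 3: Create diverse test cases covering edge cases
Step 4: Establish scoring rubrics with clear criteria
Step 5: Validate benchmark against {{reference_models}}
Step 6: Document administration and scoring procedures

Explain design rationale at each step."""

# Matches {{name}} where name is letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def placeholders(template):
    """Return placeholder names in order of first appearance, without duplicates."""
    seen = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def render(template, values):
    """Fill every placeholder; raise KeyError if any value is missing."""
    missing = [name for name in placeholders(template) if name not in values]
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    example_values = {
        "model": "example-model",
        "capability_area": "multi-step arithmetic reasoning",
        "use_case": "grading homework solutions",
        "reference_models": "two baseline models of different sizes",
    }
    print(render(PROMPT_TEMPLATE, example_values))
```

Failing loudly on a missing value is a deliberate choice in this sketch: a prompt sent with a literal `{{use_case}}` left in it tends to produce a generic benchmark design rather than an error.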

Details

Category

Analysis

Use Cases

Benchmark creation, Evaluation design, Test development

Works Best With

claude-opus-4.5, gpt-5.2, gemini-2.0-flash
