Create a prompt that generates comprehensive benchmarks for RAG systems.

System Under Test: {{system_description}}
Benchmark Categories: {{benchmark_types}}
Success Criteria: {{evaluation_criteria}}

The generated prompt should:
1. Define specific test scenarios for each category
2. Include data generation instructions
3. Specify metrics to collect
4. Provide comparison baselines
5. Include statistical significance testing

Output a reusable benchmark generation prompt template.
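
The prompt above relies on three `{{...}}` placeholders being filled in before it is sent to a model. A minimal sketch of that substitution step follows. The helper name `fill_template` and the sample values are hypothetical and are not part of the original prompt. The sketch assumes placeholder names consist only of letters, digits, and underscores.

```python
import re

# Matches {{name}} where name is letters, digits, or underscores (an assumption).
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} with values[name]; fail loudly on a missing name."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError(f"no value supplied for: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


# Only the placeholder lines of the prompt are reproduced here for brevity.
template = (
    "System Under Test: {{system_description}}\n"
    "Benchmark Categories: {{benchmark_types}}\n"
    "Success Criteria: {{evaluation_criteria}}"
)

# Hypothetical example values, chosen only to illustrate the substitution.
filled = fill_template(
    template,
    {
        "system_description": "hybrid-retrieval RAG pipeline",
        "benchmark_types": "retrieval quality, answer faithfulness, latency",
        "evaluation_criteria": "each metric beats the chosen baseline",
    },
)
print(filled)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps an unfilled `{{...}}` from silently reaching the model.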
Details
Category: Analysis
Use Cases: Benchmark creation, Testing automation, Evaluation design
Works Best With: claude-sonnet-4-20250514, gpt-4o