Score {{model}} output quality on {{evaluation_criteria}} using multiple evaluators. Run {{evaluator_count}} independent quality assessments:
- Each evaluator scores 1-10 per criterion
- Record all scores
- Calculate inter-evaluator agreement

Report aggregated scores with confidence intervals. Flag criteria with high disagreement. Weight final scores by evaluator reliability on {{calibration_set}}.
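A minimal sketch of how the aggregation step behind this prompt could work, assuming per-criterion scores and evaluator reliability weights have already been collected. The variable names (`scores`, `reliability`) and the disagreement threshold are illustrative placeholders, not part of the template.

```python
import math
import statistics

# Hypothetical example data: scores[criterion] -> one 1-10 rating per evaluator.
scores = {
    "accuracy":  [8, 7, 9],
    "relevance": [6, 9, 5],
}
# Assumed per-evaluator reliability, measured beforehand on a calibration set.
reliability = [0.9, 0.7, 0.8]

DISAGREEMENT_THRESHOLD = 1.5  # flag criteria whose score std-dev exceeds this

for criterion, ratings in scores.items():
    mean = statistics.mean(ratings)
    stdev = statistics.stdev(ratings)
    # Approximate 95% confidence interval for the mean (normal approximation).
    ci = 1.96 * stdev / math.sqrt(len(ratings))
    # Reliability-weighted final score.
    weighted = sum(r * w for r, w in zip(ratings, reliability)) / sum(reliability)
    flag = " [HIGH DISAGREEMENT]" if stdev > DISAGREEMENT_THRESHOLD else ""
    print(f"{criterion}: mean={mean:.2f} ±{ci:.2f}, weighted={weighted:.2f}{flag}")
```

In practice the agreement measure could be a standard deviation as above or a formal inter-rater statistic; the prompt leaves that choice to the model.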
53 copies · 0 forks
Details
Category
Analysis
Use Cases
Quality aggregation, Multi-evaluator scoring, Agreement analysis
Works Best With
claude-opus-4.5, gpt-5.2, gemini-2.0-flash