Decompose comprehensive evaluation of {{model}} into subtasks. Break down into: 1. Accuracy evaluation subtasks - Task types to test - Metrics per task type 2. Safety evaluation subtasks - Risk categories to probe - Test case requirements 3. Performance evaluation subtasks - Latency tests - Throughput tests 4. Integration subtasks - API compatibility - Error handling For each subtask, specify inputs, outputs, and dependencies. Create execution schedule for {{timeline}}.
Model Evaluation Breakdown
Break down comprehensive model evaluation into subtasks.
70 copies0 forks
Share this prompt:
Details
Category
AnalysisUse Cases
Evaluation decompositionTask planningProject breakdown
Works Best With
claude-opus-4.5gpt-5.2gemini-2.0-flash
Created Updated Shared