Generate evaluation prompts to assess {{model}} on {{task_type}}.

Analyze the task requirements and generate:
1. 5 probing questions that test core capabilities
2. 3 edge case prompts that reveal limitations
3. 2 adversarial prompts that test robustness

For each generated prompt, explain what it tests and how to score the response against {{success_criteria}}.
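The template relies on three double-brace placeholders: {{model}}, {{task_type}}, and {{success_criteria}}. The sketch below shows one way they might be filled in before the prompt is sent to a model. It assumes plain string substitution, not any particular templating library. The `render` helper and the example values for `task_type` and `success_criteria` are hypothetical and chosen only for illustration.

```python
import re

# The prompt template from above, with its three placeholders.
TEMPLATE = (
    "Generate evaluation prompts to assess {{model}} on {{task_type}}. "
    "Analyze the task requirements and generate: "
    "1. 5 probing questions that test core capabilities "
    "2. 3 edge case prompts that reveal limitations "
    "3. 2 adversarial prompts that test robustness "
    "For each generated prompt, explain what it tests and how to score "
    "the response against {{success_criteria}}."
)

# Matches {{name}}, tolerating optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Hypothetical helper: replace every {{name}} with values[name].

    Raises KeyError if the template uses a placeholder that has no value,
    so an unfilled variable is never sent to the model by accident.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder: {name}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


# Example values are assumptions for illustration only.
prompt = render(
    TEMPLATE,
    {
        "model": "claude-opus-4.5",
        "task_type": "code review",
        "success_criteria": "a 1-5 rubric for correctness and clarity",
    },
)
print(prompt)
```

Failing loudly on a missing value is a design choice: a prompt that still contains a literal `{{task_type}}` tends to produce confusing evaluation output that is hard to trace back to the template.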
Details
Category: Analysis
Use Cases: Prompt generation, Evaluation design, Test creation
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash