Investigate safety incident from {{model}} producing {{harmful_output}}. Step 1: Document the exact input-output pair Step 2: Identify which safety guardrails should have triggered Step 3: Test similar prompts to determine scope Step 4: Analyze why the filter failed Step 5: Assess potential harm and exposure Step 6: Recommend immediate fixes and long-term prevention Provide evidence-based reasoning throughout.
56 copies0 forks
Details
Category
AnalysisUse Cases
Incident investigationSafety analysisPrevention planning
Works Best With
claude-opus-4.5gpt-5.2gemini-2.0-flash
Created Shared