Test {{model}} robustness against {{adversarial_techniques}}. Step 1: Generate adversarial variants of {{test_prompts}} Step 2: Apply perturbations (typos, rephrasing, injections) Step 3: Measure output stability under perturbations Step 4: Identify vulnerability patterns Step 5: Calculate robustness scores per technique Step 6: Recommend hardening strategies prioritized by risk Document attack success rates and patterns.
21 copies0 forks
Details
Category
AnalysisUse Cases
Robustness testingAdversarial analysisSecurity hardening
Works Best With
claude-opus-4.5gpt-5.2gemini-2.0-flash
Created Shared