Evaluate {{model}} code generation for {{programming_language}} across {{code_tasks}}. Score correctness, efficiency, security, and style adherence. Run generated code against {{test_suite}} and report pass rates.
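The "run generated code against {{test_suite}} and report pass rates" step could be sketched roughly as follows. This is a minimal illustration, not a prescribed harness: the function name `run_tests`, the `(args, expected)` case format, and the report keys are all assumptions made for the example. It covers only the correctness score, and it calls `exec` without any sandboxing, so it is suitable only for trusted toy inputs.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical case format: (positional args, expected return value).
Case = Tuple[tuple, Any]


def run_tests(source: str, func_name: str, cases: List[Case]) -> Dict[str, Any]:
    """Execute generated source and check `func_name` against each case.

    A case counts as passed only if the call returns the expected value
    without raising. Code that fails to load scores zero on every case.
    """
    total = len(cases)
    namespace: Dict[str, Any] = {}
    try:
        # WARNING: unsandboxed execution; a real evaluation should isolate this.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return {"passed": 0, "total": total, "pass_rate": 0.0}

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A runtime error in the generated code is a failed case.
            pass

    rate = passed / total if total else 0.0
    return {"passed": passed, "total": total, "pass_rate": rate}


# Illustrative inputs standing in for model output and a test suite.
good_source = "def add(a, b):\n    return a + b\n"
buggy_source = "def add(a, b):\n    return a - b\n"
cases: List[Case] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

good_report = run_tests(good_source, "add", cases)
buggy_report = run_tests(buggy_source, "add", cases)
print(good_report)
print(buggy_report)
```

Efficiency, security, and style scores would need separate tooling (for example timing runs, a static analyzer, and a linter), which this sketch leaves out.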
Details
Category: Coding
Use Cases: Code quality assessment, Generation accuracy, Security review
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash