Optimize {{model}} latency for {{performance_target}}.

Step 1: Profile current latency breakdown by component
Step 2: Identify bottlenecks in {{request_pipeline}}
Step 3: Evaluate optimization options (caching, batching, model size)
Step 4: Estimate improvement potential for each option
Step 5: Assess tradeoffs (cost, quality, complexity)
Step 6: Recommend optimization roadmap with expected gains

Show quantitative analysis at each step.
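As a rough illustration of the quantitative analysis that Steps 1 and 4 ask for, the following Python sketch computes each component's share of total latency and estimates end-to-end latency after speeding up the largest one. The component names, the millisecond figures, and the 2x speedup are hypothetical placeholders, and the estimate assumes the components run sequentially with no overlap.

```python
from typing import Dict


def latency_shares(component_ms: Dict[str, float]) -> Dict[str, float]:
    """Return each component's fraction of total latency."""
    total = sum(component_ms.values())
    return {name: ms / total for name, ms in component_ms.items()}


def estimate_total_after(
    component_ms: Dict[str, float], component: str, speedup: float
) -> float:
    """Estimate end-to-end latency if one component runs `speedup` times faster.

    Assumes components execute one after another, so total latency is the sum.
    """
    optimized = dict(component_ms)
    optimized[component] = component_ms[component] / speedup
    return sum(optimized.values())


# Hypothetical profile in milliseconds; replace with measured values.
profile = {"network": 40.0, "queueing": 60.0, "inference": 300.0}

shares = latency_shares(profile)
bottleneck = max(shares, key=shares.get)  # component with the largest share
new_total = estimate_total_after(profile, bottleneck, speedup=2.0)

print(f"bottleneck: {bottleneck} ({shares[bottleneck]:.0%} of total)")
print(f"estimated total after 2x speedup: {new_total:.0f} ms")
```

In practice the per-component figures would come from real measurements (for example, timestamps taken with `time.perf_counter` around each stage), and the speedup factor for each option would itself be an estimate to validate.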
Details
Category: Analysis
Use Cases: Latency optimization, Performance tuning, Bottleneck analysis
Works Best With: claude-opus-4.5, gpt-5.2, gemini-2.0-flash