Latency Breakdown Analyzer

Systematically analyze end-to-end latency in ML pipelines to identify bottlenecks and prioritize optimization efforts by impact and implementation effort.

Analyze this end-to-end latency breakdown and identify optimization opportunities.

## Request Trace
{{trace_data}}

## Latency Breakdown
- Embedding generation: {{embed_latency}}ms
- Vector search: {{search_latency}}ms
- Context assembly: {{context_latency}}ms
- LLM inference: {{llm_latency}}ms
- Post-processing: {{post_latency}}ms

## Target Latency
{{target_latency}}ms

For each component:

1. **Current vs Optimal**: What is the theoretical minimum latency, and how large is the gap to the current value?
2. **Bottleneck Analysis**: What causes the current latency?
3. **Quick Wins**: Optimizations achievable in <1 week
4. **Medium-term**: Optimizations requiring 2-4 weeks
5. **Strategic**: Architectural changes for long-term gains

Prioritize recommendations by impact/effort ratio: estimated latency saved relative to implementation effort.
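
The prioritization step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: impact is taken to mean estimated milliseconds saved, effort is taken to mean estimated days of work (greater than zero), and every number, component key, and optimization name below is hypothetical rather than taken from a real trace.

```python
from dataclasses import dataclass


@dataclass
class Optimization:
    component: str      # pipeline stage the change targets
    name: str           # short label for the change (hypothetical)
    saved_ms: float     # estimated latency reduction, used as "impact"
    effort_days: float  # estimated implementation effort, assumed > 0


def latency_shares(breakdown_ms):
    """Return each component's fraction of total end-to-end latency."""
    total = sum(breakdown_ms.values())
    return {name: ms / total for name, ms in breakdown_ms.items()}


def rank_by_impact_effort(candidates):
    """Sort candidates by milliseconds saved per day of effort, highest first."""
    return sorted(candidates, key=lambda c: c.saved_ms / c.effort_days, reverse=True)


# Hypothetical values standing in for the template placeholders.
breakdown_ms = {
    "embedding": 40.0,
    "vector_search": 60.0,
    "context_assembly": 20.0,
    "llm_inference": 860.0,
    "post_processing": 20.0,
}
target_ms = 600.0

# Hypothetical candidate optimizations with assumed estimates.
candidates = [
    Optimization("llm_inference", "trim context length", 300.0, 10.0),
    Optimization("vector_search", "tune index parameters", 30.0, 2.0),
    Optimization("embedding", "cache repeated queries", 25.0, 1.0),
]

total_ms = sum(breakdown_ms.values())
print(f"total={total_ms:.0f}ms, over target by {total_ms - target_ms:.0f}ms")

for name, share in latency_shares(breakdown_ms).items():
    print(f"{name}: {share:.0%} of total")

for c in rank_by_impact_effort(candidates):
    print(f"{c.component}: {c.name} ({c.saved_ms / c.effort_days:.1f} ms saved per day)")
```

Dividing saved milliseconds by effort in days is one simple way to express the ratio. A team could just as well weight impact by each component's share of total latency or use coarse effort buckets matching the quick-win, medium-term, and strategic tiers.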

Details

Category

Analysis

Use Cases

Latency analysis, Performance optimization, Bottleneck detection

Works Best With

claude-sonnet-4-20250514, gpt-4o