Response Streaming Architecture

Design a complete streaming architecture for LLM applications, covering server-side streaming, transport, client-side handling, and observability.

Design a response streaming architecture for LLM applications.

## Application Requirements
{{application_requirements}}

## Client Types
{{client_types}}

## Infrastructure
{{infrastructure}}

Design the architecture:

**Server-Side Streaming**
- LLM API streaming integration
- Token buffering strategy
- Backpressure handling
- Error recovery mid-stream
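The server-side concerns above can be sketched together. The following is a minimal illustration (not a prescribed implementation) of token buffering and mid-stream error recovery over SSE: tokens from an upstream LLM iterator are coalesced into fewer, larger `data:` frames, and an upstream failure is surfaced as a terminal `error` event rather than a silent truncation. The function name, frame schema, and `flush_every` parameter are all assumptions for illustration.

```python
import json

def sse_frames(tokens, flush_every=3):
    """Coalesce raw tokens into SSE data frames.

    Buffering a few tokens per frame reduces framing and syscall
    overhead; flush_every trades added latency for throughput.
    Frame schema ({"delta": ...} and the [DONE] sentinel) is an
    illustrative convention, not a standard.
    """
    buf = []
    try:
        for tok in tokens:
            buf.append(tok)
            if len(buf) >= flush_every:
                yield "data: " + json.dumps({"delta": "".join(buf)}) + "\n\n"
                buf.clear()
    except Exception as exc:
        # Mid-stream failure: emit a terminal error event so the
        # client can distinguish an aborted stream from a finished one.
        yield "event: error\ndata: " + json.dumps({"message": str(exc)}) + "\n\n"
        return
    if buf:  # flush any remainder before signaling completion
        yield "data: " + json.dumps({"delta": "".join(buf)}) + "\n\n"
    yield "data: [DONE]\n\n"
```

Backpressure falls out of the generator shape: tokens are only pulled from the upstream iterator as fast as the transport consumes frames.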

**Transport Layer**
- SSE vs WebSocket choice
- Connection management
- Reconnection handling
- Load balancer configuration
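For reconnection handling, SSE has a built-in mechanism worth noting: if the server tags each event with an `id:` field, a reconnecting browser automatically sends the last one back in a `Last-Event-ID` header. A rough sketch of server-side resumption under the assumption that the token sequence is replayable (e.g. cached per request id); the function name is illustrative:

```python
def resumable_sse(tokens, last_event_id=None):
    """Emit SSE events with monotonically increasing ids so a
    reconnecting client can resume where the connection dropped.

    Assumes `tokens` is replayable server-side (for example, the
    generated tokens are cached keyed by request id). last_event_id
    is the raw header value sent by the client on reconnect.
    """
    resume_from = int(last_event_id) + 1 if last_event_id is not None else 0
    for i, tok in enumerate(tokens):
        if i < resume_from:
            continue  # already delivered before the disconnect
        yield f"id: {i}\ndata: {tok}\n\n"
```

On the load-balancer side the key point is usually to disable response buffering and raise idle timeouts for the streaming route, since proxies that buffer full responses defeat streaming entirely.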

**Client-Side Handling**
- Incremental rendering
- State management
- Progress indication
- Error display
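Client-side code is normally JavaScript, but the state-management logic is language-agnostic; here it is sketched in Python for consistency with the other examples. It assumes an SSE frame convention of `data: {"delta": ...}` payloads, a `data: [DONE]` completion sentinel, and an `event: error` terminal event — all illustrative, not standard:

```python
import json

class StreamState:
    """Accumulates one streamed response for incremental rendering.

    Mirrors what a browser client would keep: the text so far (render
    it on every feed), whether the stream finished cleanly, and any
    terminal error to display in place of a truncated answer.
    """
    def __init__(self):
        self.text = ""
        self.done = False
        self.error = None

    def feed(self, frame):
        for line in frame.splitlines():
            if line.startswith("event: error"):
                self.error = "stream error"  # refined by the data line below
            elif line.startswith("data: "):
                payload = line[len("data: "):]
                if payload == "[DONE]":
                    self.done = True
                elif self.error is not None:
                    self.error = json.loads(payload).get("message", self.error)
                else:
                    self.text += json.loads(payload)["delta"]
```

A progress indicator can key off the same three fields: spinner while `not done and error is None`, final render on `done`, error banner plus the partial `text` otherwise.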

**Observability**
- Stream metrics
- Token-level timing
- Failure tracking
- User experience metrics
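The two timing metrics that matter most for streaming UX are time-to-first-token (TTFT, what the user perceives as responsiveness) and inter-token gaps (stalls mid-answer). A minimal per-stream recorder, with an injectable clock so it can be tested deterministically; class and field names are illustrative:

```python
import time

class StreamMetrics:
    """Records TTFT and inter-token gaps for a single stream.

    Construct when the request is sent; call on_token() for each
    token received. clock defaults to time.monotonic and is
    injectable for testing.
    """
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self._last = None
        self.ttft = None   # time to first token, seconds
        self.gaps = []     # inter-token latencies, seconds

    def on_token(self):
        now = self._clock()
        if self.ttft is None:
            self.ttft = now - self._start
        else:
            self.gaps.append(now - self._last)
        self._last = now

    def summary(self):
        return {
            "ttft": self.ttft,
            "tokens": len(self.gaps) + (1 if self.ttft is not None else 0),
            "max_gap": max(self.gaps) if self.gaps else 0.0,
        }
```

Exporting `ttft` as a histogram and alerting on `max_gap` catches both slow starts and mid-stream stalls; failure tracking then only needs a counter keyed on how streams terminate (done, error, client disconnect).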

Provide:
- Architecture diagram
- Implementation code
- Configuration examples
- Monitoring setup

## Details

**Category:** Coding

**Use Cases:** Streaming architecture, Real-time responses, User experience

**Works Best With:** claude-sonnet-4-20250514, gpt-4o