LLM Request Retry Strategy

Samira El-Masri

@samira-el-masri

December 31, 2025

Implement a comprehensive LLM retry strategy with error classification, exponential backoff, circuit breakers, and fallback chains.


Implement a comprehensive retry strategy for LLM API requests.

## Error Types
{{error_types}}

## SLA Requirements
{{sla_requirements}}

## Budget Constraints
{{budget_constraints}}

Build the retry system:

```python
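from __future__ import annotations  # lets the skeleton reference types defined later

from enum import Enum


class ErrorCategory(Enum):
    TRANSIENT = "transient"      # rate limit, timeout
    RECOVERABLE = "recoverable"  # malformed response
    PERMANENT = "permanent"      # auth, invalid request


class LLMRetryStrategy:
    # RetryConfig, LLMRequest, and LLMResponse are left for the implementation to define.

    def classify_error(self, error: Exception) -> ErrorCategory:
        """
        Categories:
        - TRANSIENT (rate limit, timeout)
        - RECOVERABLE (malformed response)
        - PERMANENT (auth, invalid request)
        """
        raise NotImplementedError

    def get_retry_config(self, category: ErrorCategory) -> RetryConfig:
        """
        Config includes:
        - Max retries
        - Backoff strategy
        - Jitter settings
        """
        raise NotImplementedError

    async def execute_with_retry(self, request: LLMRequest) -> LLMResponse:
        """
        Retry with:
        - Exponential backoff
        - Circuit breaker
        - Fallback models
        """
        raise NotImplementedError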
```
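
As a point of reference for the backoff and jitter settings named in the skeleton, here is a minimal sketch of exponential backoff with "full jitter". The names `TransientError`, `backoff_delay`, and `call_with_backoff`, and the default values for `base`, `cap`, and `max_retries`, are assumptions made for illustration; they are not part of the skeleton above or of any provider's API.

```python
import asyncio
import random
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (rate limit, timeout)."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a delay in seconds for a 0-indexed retry attempt."""
    rng = rng if rng is not None else random.Random()
    # The ceiling doubles with each attempt but never exceeds `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: draw uniformly from [0, ceiling] so clients do not retry in lockstep.
    return rng.uniform(0.0, ceiling)


async def call_with_backoff(fn: Callable[[], Awaitable[T]],
                            max_retries: int = 3,
                            base: float = 0.5,
                            cap: float = 30.0,
                            sleep: Callable[[float], Awaitable[None]] = asyncio.sleep) -> T:
    """Await `fn`, retrying only on TransientError; other errors propagate at once."""
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last transient error
            await sleep(backoff_delay(attempt, base, cap))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a design choice that keeps the loop testable without real waiting; a production version would likely also honor any retry-after hint the provider returns.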

Include:
- Retry budgets
- Fallback chains
- Observability hooks
- Dead letter handling
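
To make the items in this list more concrete, the sketch below shows one possible shape for a retry budget, a circuit breaker, and a fallback chain. Everything here is an assumption for illustration: the class names `RetryBudget` and `CircuitBreaker`, the helper `pick_model`, the thresholds, and the model labels used in the usage note are hypothetical and would be tuned to the SLA and budget inputs above.

```python
from __future__ import annotations

import time
from typing import Callable, Optional


class RetryBudget:
    """Caps total retries at a fraction of the requests seen so far."""

    def __init__(self, ratio: float = 0.1, min_retries: int = 1) -> None:
        self.ratio = ratio
        self.min_retries = min_retries
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_spend(self) -> bool:
        """Count one retry and return True if the budget allows it."""
        allowed = max(self.min_retries, int(self.requests * self.ratio))
        if self.retries < allowed:
            self.retries += 1
            return True
        return False


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Half-open: once the cool-down has elapsed, let a trial request through.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()  # open (or re-open) the circuit


def pick_model(chain: list[str], breakers: dict[str, CircuitBreaker]) -> Optional[str]:
    """Return the first model in the chain whose breaker allows traffic."""
    for model in chain:
        if breakers[model].allow():
            return model
    return None  # nothing available: the caller would dead-letter the request
```

With a chain such as `["primary", "backup"]` (placeholder labels), a request would go to `"backup"` while the primary breaker is open and return to `"primary"` once its cool-down elapses. A `None` result is the natural point to hand the request to dead letter handling and emit an observability event.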