Production Health Check System

U

@

·

Design comprehensive health checks for AI infrastructure covering embedding services, vector DBs, and LLM APIs with aggregated status.

61 copies0 forks
Design a comprehensive health check system for AI infrastructure.

## Components to Monitor
{{components}}

## Health Criteria
{{health_criteria}}

## Alert Thresholds
{{alert_thresholds}}

Implement health checks:

```python
class AIHealthChecker:
    def check_embedding_service(self) -> HealthStatus:
        """
        Verify:
        - Latency within bounds
        - Model loaded correctly
        - Quality baseline met
        """
        pass
    
    def check_vector_db(self) -> HealthStatus:
        """
        Verify:
        - Connection health
        - Query latency
        - Index status
        """
        pass
    
    def check_llm_api(self) -> HealthStatus:
        """
        Verify:
        - API availability
        - Rate limit headroom
        - Response quality
        """
        pass
    
    def aggregate_health(self) -> SystemHealth:
        """Overall system status"""
        pass
```

Include:
- Kubernetes readiness/liveness probes
- Dependency health aggregation
- Graceful degradation signals

Details

Category

Coding

Use Cases

Health monitoringSystem reliabilityProduction operations

Works Best With

claude-sonnet-4-20250514gpt-4o
Created Shared

Create your own prompt vault and start sharing

Production Health Check System | Promptsy