Design a comprehensive health check system for AI infrastructure. ## Components to Monitor {{components}} ## Health Criteria {{health_criteria}} ## Alert Thresholds {{alert_thresholds}} Implement health checks: ```python class AIHealthChecker: def check_embedding_service(self) -> HealthStatus: """ Verify: - Latency within bounds - Model loaded correctly - Quality baseline met """ pass def check_vector_db(self) -> HealthStatus: """ Verify: - Connection health - Query latency - Index status """ pass def check_llm_api(self) -> HealthStatus: """ Verify: - API availability - Rate limit headroom - Response quality """ pass def aggregate_health(self) -> SystemHealth: """Overall system status""" pass ``` Include: - Kubernetes readiness/liveness probes - Dependency health aggregation - Graceful degradation signals
Production Health Check System
U
@
Design comprehensive health checks for AI infrastructure covering embedding services, vector DBs, and LLM APIs with aggregated status.
61 copies0 forks
Details
Category
CodingUse Cases
Health monitoringSystem reliabilityProduction operations
Works Best With
claude-sonnet-4-20250514gpt-4o
Created Shared