Build a synthetic data generator for testing RAG systems.

## Domain
{{domain_description}}

## Test Scenarios
{{test_scenarios}}

## Data Requirements
{{data_requirements}}

Create a comprehensive generator:

```python
class SyntheticRAGDataGenerator:
    def generate_documents(self, count: int, config: DocConfig) -> List[Document]:
        """Generate realistic documents"""
        pass

    def generate_qa_pairs(self, documents: List[Document], count: int) -> List[QAPair]:
        """Generate question-answer pairs with citations"""
        pass

    def generate_edge_cases(self, scenario: str) -> List[TestCase]:
        """Generate challenging test cases"""
        pass
```

Edge cases to generate:
- Multi-hop reasoning queries
- Ambiguous questions
- Out-of-scope queries
- Adversarial inputs
- Long-tail topics

Include quality validation for generated data.
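
As an illustration of the quality-validation requirement, here is a minimal sketch of what a check on generated QA pairs might look like. The `Document` and `QAPair` names come from the skeleton above, but their fields (`doc_id`, `text`, `question`, `answer`, `citations`) and the helpers `validate_qa_pair` and `filter_valid` are assumptions made for this example, not part of any specified interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    # Hypothetical fields; the real Document type may differ.
    doc_id: str
    text: str


@dataclass
class QAPair:
    # Hypothetical fields; citations holds the doc_ids the answer relies on.
    question: str
    answer: str
    citations: List[str] = field(default_factory=list)


def validate_qa_pair(pair: QAPair, documents: List[Document]) -> List[str]:
    """Return a list of problems found; an empty list means the pair passed."""
    problems = []
    known_ids = {doc.doc_id for doc in documents}
    if not pair.question.strip():
        problems.append("empty question")
    if not pair.answer.strip():
        problems.append("empty answer")
    if not pair.citations:
        problems.append("no citations")
    # Every citation must point at a document that actually exists.
    for doc_id in pair.citations:
        if doc_id not in known_ids:
            problems.append(f"unknown citation: {doc_id}")
    return problems


def filter_valid(pairs: List[QAPair], documents: List[Document]) -> List[QAPair]:
    """Keep only the pairs that pass every check."""
    return [p for p in pairs if not validate_qa_pair(p, documents)]
```

These checks are structural only. Whether a cited document actually supports the answer would need a separate check, for example a string-overlap heuristic or a model-based judge.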
Synthetic Data Generator for RAG Testing
Build a synthetic data generator for RAG testing covering document generation, QA pairs, and edge cases with quality validation.
Details
Category: Coding
Use Cases: Test data generation, RAG testing, Quality assurance
Works Best With: claude-sonnet-4-20250514, gpt-4o