Health Insurance Member Q&A Chatbot
Full-stack conversational AI • React + Node.js + RAG • Healthcare domain • AWS deployment
Overview
A full-stack conversational AI system that answers health insurance members' questions about coverage, claims, benefits, and enrollment. It combines a React frontend, a Node.js backend, and a RAG pipeline to deliver accurate, compliant responses in real time.
Problem Statement
- Support Volume: 50K+ member questions per month; the current support team is overloaded.
- Response Time: The average wait is 24 hours; members need instant answers.
- Cost: A support agent's fully loaded cost is $75K/year, so the per-query cost must come down.
- Compliance: The healthcare domain requires PII handling, regulatory accuracy, and clear disclaimers.
- Consistency: Agents give different answers to the same question; the system must be consistent.
Solution: Conversational AI on AWS
- Frontend: React app with real-time chat UI, typing indicators, message history
- Backend: Node.js + Express, REST API, WebSocket support for real-time updates
- AI Layer: RAG pipeline (LangChain + Pinecone + GPT-4) for member question answering
- Infrastructure: AWS Lambda, API Gateway, DynamoDB for scalability
Architecture
System Design
┌─────────────────┐
│ React Frontend  │  (Chat UI, message history, typing indicators)
└────────┬────────┘
         │ WebSocket
         ▼
┌─────────────────────────────────────┐
│ Node.js/Express Backend API         │
│ (Request validation, auth, logging) │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│ RAG Pipeline (LangChain)            │
│ 1. Vector retrieval (Pinecone)      │
│ 2. Generate response (GPT-4)        │
│ 3. Guardrails (PII detection)       │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│ AWS Infrastructure                  │
│ • Lambda (compute)                  │
│ • DynamoDB (conversation history)   │
│ • CloudWatch (logging/monitoring)   │
└─────────────────────────────────────┘
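The three RAG stages in the diagram can be sketched as a small composition. This is an illustration only, not the repository's code: `RetrieveStage`, `GenerateStage`, `GuardStage`, and `answerQuestion` are hypothetical names, the stages are stubs, and everything is synchronous to keep the sketch short (real calls to Pinecone and GPT-4 would be asynchronous).

```typescript
// Hypothetical stage signatures; the real services would be async.
type RetrieveStage = (question: string) => string[];
type GenerateStage = (question: string, passages: string[]) => string;
type GuardStage = (draft: string) => string;

interface PipelineStages {
  retrieve: RetrieveStage;
  generate: GenerateStage;
  guard: GuardStage;
}

// Runs the stages in the order shown in the diagram:
// 1. vector retrieval, 2. response generation, 3. guardrails.
function answerQuestion(question: string, stages: PipelineStages): string {
  const passages = stages.retrieve(question);
  const draft = stages.generate(question, passages);
  return stages.guard(draft);
}

// Stub stages standing in for Pinecone, GPT-4, and the PII guardrail.
const stubStages: PipelineStages = {
  retrieve: () => ["Plan A covers one annual physical at no cost."],
  generate: (_question, passages) => `Based on your plan documents: ${passages[0]}`,
  guard: (draft) => draft.replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED]"),
};

const pipelineReply = answerQuestion("Is my annual physical covered?", stubStages);
```

Injecting the stages keeps the ordering testable without network access, which is one reasonable way to unit-test a pipeline like this.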
Key Technical Decisions
1. Full-Stack Implementation (React + Node.js)
Why: Building the chatbot UI from scratch saves $100K+ in third-party service fees (Intercom, Drift, etc.). TypeScript on Node.js provides type safety and lets the frontend and backend share code.
2. WebSocket for Real-Time Responses
Why: HTTP polling causes delays and high server load. WebSocket enables streaming responses (show text as it's generated), improving perceived performance.
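A minimal sketch of what streaming over a socket might look like, under stated assumptions: the `ChatFrame` wire format and the function names are hypothetical, `send` stands in for the socket's send method, and tokens arrive as an array here for simplicity, whereas real model output would be an asynchronous stream.

```typescript
// Hypothetical wire format for streamed chat frames.
type ChatFrame =
  | { type: "token"; text: string }
  | { type: "done"; totalTokens: number };

// Server side: emit one frame per token, then a final "done" frame.
// `send` abstracts the socket so the logic can be tested without a network.
function streamResponse(tokens: string[], send: (frame: string) => void): void {
  let count = 0;
  for (const token of tokens) {
    const frame: ChatFrame = { type: "token", text: token };
    send(JSON.stringify(frame));
    count += 1;
  }
  const done: ChatFrame = { type: "done", totalTokens: count };
  send(JSON.stringify(done));
}

// Client side: append each token to the visible message as it arrives.
function applyFrame(current: string, rawFrame: string): string {
  const frame = JSON.parse(rawFrame) as ChatFrame;
  return frame.type === "token" ? current + frame.text : current;
}

// Simulated round trip: collect frames, then replay them on the "client".
const sentFrames: string[] = [];
streamResponse(
  ["Your ", "deductible ", "is ", "listed ", "in ", "your ", "plan."],
  (frame) => { sentFrames.push(frame); }
);

let renderedMessage = "";
for (const raw of sentFrames) {
  renderedMessage = applyFrame(renderedMessage, raw);
}
```

The explicit "done" frame gives the UI a clear signal to hide the typing indicator.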
3. RAG for Domain-Specific Knowledge
Why: GPT-4 alone hallucinates about plan benefits. RAG grounds each answer in retrieved plan documents, which reduced the hallucination rate from 15% to <2%.
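To illustrate the retrieval step, here is a small in-memory sketch of top-k search by cosine similarity followed by a grounded prompt. It is not the project's pipeline: in the real system Pinecone performs the vector search, and the `PlanChunk` shape, the toy three-dimensional embeddings, and the prompt wording are all assumptions made for the example.

```typescript
interface PlanChunk {
  id: string;
  text: string;
  embedding: number[]; // toy vectors; real embeddings have far more dimensions
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query vector.
function topK(query: number[], chunks: PlanChunk[], k: number): PlanChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Build a prompt that restricts the model to the retrieved excerpts.
function buildGroundedPrompt(question: string, context: PlanChunk[]): string {
  const sources = context.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return (
    "Answer using only the plan excerpts below. " +
    "If they do not contain the answer, say you do not know.\n\n" +
    `${sources}\n\nQuestion: ${question}`
  );
}

const planChunks: PlanChunk[] = [
  { id: "dental-01", text: "Two dental cleanings per year are covered.", embedding: [0.9, 0.1, 0.0] },
  { id: "vision-01", text: "One eye exam per year is covered.", embedding: [0.1, 0.9, 0.0] },
  { id: "claims-01", text: "Claims must be filed within 90 days.", embedding: [0.0, 0.1, 0.9] },
];

const retrievedChunks = topK([1, 0, 0], planChunks, 2);
const groundedPrompt = buildGroundedPrompt("How many cleanings are covered?", retrievedChunks);
```

Tagging each excerpt with its chunk ID also makes it possible to log which documents supported an answer.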
4. PII Detection & Redaction
Why: Members ask about personal claims data. The system detects PII (member ID, SSN, dates) and redacts it from responses, in line with HIPAA requirements.
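A regex-based pass is one simple way to sketch the redaction step. The patterns below are assumptions, not the project's rules: the `MBR` plus eight digits member ID format is invented for the example, and dates are matched only in `M/D/YYYY` form. A production detector would likely need broader coverage than pattern matching alone.

```typescript
interface PiiRule {
  label: string;
  pattern: RegExp;
}

// Illustrative patterns only; the member ID format is hypothetical.
const PII_RULES: PiiRule[] = [
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "MEMBER_ID", pattern: /\bMBR\d{8}\b/g },
  { label: "DATE", pattern: /\b\d{1,2}\/\d{1,2}\/\d{4}\b/g },
];

interface RedactionResult {
  text: string;    // input with each match replaced by its label
  found: string[]; // labels of every match, for audit logging
}

// Apply each rule in turn, replacing matches with a bracketed label.
function redactPii(input: string): RedactionResult {
  let text = input;
  const found: string[] = [];
  for (const rule of PII_RULES) {
    text = text.replace(rule.pattern, () => {
      found.push(rule.label);
      return `[${rule.label}]`;
    });
  }
  return { text, found };
}

const redactionExample = redactPii(
  "Member MBR12345678 (SSN 123-45-6789) was seen on 3/14/2024."
);
// redactionExample.text is "Member [MEMBER_ID] (SSN [SSN]) was seen on [DATE]."
```

Returning the list of matched labels alongside the text lets the audit trail record that a redaction happened without storing the PII itself.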
5. Guardrails: Conservative Response Strategy
Why: In healthcare, saying "I don't know" is better than guessing. The system flags uncertain responses and offers escalation to a human agent.
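One way to express this strategy is a confidence gate in front of every reply. The sketch below assumes the confidence signal is the best retrieval similarity score and uses a threshold of 0.75; both are hypothetical choices, as are the type and function names.

```typescript
interface DraftAnswer {
  text: string;
  topScore: number; // assumed confidence signal: best retrieval similarity, 0..1
}

type BotReply =
  | { kind: "answer"; text: string }
  | { kind: "escalate"; text: string; context: { question: string; draft: string } };

const CONFIDENCE_THRESHOLD = 0.75; // hypothetical cutoff, to be tuned on labeled data

// Answer only when confidence clears the threshold; otherwise offer a
// human handoff and carry the question and draft along as context.
function decideReply(
  question: string,
  draft: DraftAnswer,
  threshold: number = CONFIDENCE_THRESHOLD
): BotReply {
  if (draft.topScore >= threshold) {
    return { kind: "answer", text: draft.text };
  }
  return {
    kind: "escalate",
    text: "I'm not certain about this one. Would you like me to connect you with an agent?",
    context: { question, draft: draft.text },
  };
}

const confidentReply = decideReply("Is an eye exam covered?", {
  text: "Yes, one eye exam per year is covered.",
  topScore: 0.91,
});
const uncertainReply = decideReply("Is acupuncture covered abroad?", {
  text: "Possibly, depending on the provider.",
  topScore: 0.42,
});
```

Passing the draft along in the escalation context means the agent sees what the system would have said, which helps with the handoff.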
Key Features
For Members
- 24/7 Availability: Instant answers anytime, no wait times
- Clear Answers: Plain English explanations of coverage, benefits, claims
- Escalation: Uncertain answers are escalated to a human agent with conversation context
- Privacy: PII is handled securely; responses don't leak personal data
- Conversation History: Members can revisit past questions
For Operations
- Reduced Support Load: System handles 60% of routine questions (coverage, claims, enrollment)
- Cost Reduction: $0.04/query vs. $24 for agent handling (~600x cheaper)
- Escalation Insights: Track which topics confuse members; update plan docs
- Compliance Audit Trail: All queries logged for regulatory review
- Monitoring Dashboard: Accuracy, satisfaction, response time metrics
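The cost comparison above works out as follows. The per-query figures come from the list; the helper name and the 1,000-query example are illustrative, and the calculation ignores fixed costs such as infrastructure and development.

```typescript
// Per-query costs from the figures above, held in cents to avoid float rounding.
const AGENT_COST_CENTS = 2400; // $24 per agent-handled query
const AI_COST_CENTS = 4;       // $0.04 per AI-handled query

// How many times cheaper an AI-handled query is.
const costRatio = AGENT_COST_CENTS / AI_COST_CENTS; // 600

// Savings in dollars when `deflectedQueries` move from agents to the AI.
function savingsDollars(deflectedQueries: number): number {
  return (deflectedQueries * (AGENT_COST_CENTS - AI_COST_CENTS)) / 100;
}

const savingsPerThousand = savingsDollars(1000); // 23960
```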
Results & Impact
Quantified Metrics
| Metric | Value |
|---|---|
| Response Accuracy | 92% (on gold standard test set) |
| User Satisfaction | 4.2/5.0 (200+ user ratings) |
| Hallucination Rate | <2% (continuous monitoring) |
| Query Cost | $0.04 (vs. $24 agent cost) |
| Response Time | <2 seconds |
| Questions Handled | 60% of routine questions |
Business Impact
- Support Cost Reduction: AI handles 60% of routine questions at $0.04 per query (vs. $24 per agent-handled query)
- Response Time: 24 hours → under 2 seconds
- Member Satisfaction: 4.2/5.0 rating; 87% prefer AI for quick answers
- Support Team Capacity: Agents focus on complex cases, escalations; handle 3x more nuanced issues
Lessons Learned
What Worked
- Conservative response strategy: Better to say "I'm uncertain" than hallucinate. Builds trust.
- Domain-specific evaluation: Generic chatbot metrics (BLEU, ROUGE) don't capture insurance accuracy. Used domain expert labels.
- Escalation workflow: System hands off to human agent gracefully. Members appreciate the option.
- Monitoring from day 1: Caught hallucinations early with continuous evaluation.
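Evaluation against expert labels can be as simple as counting. The sketch below assumes each response in the gold set receives one of three labels from a domain expert; the label names, types, and sample data are hypothetical and do not reproduce the project's evaluation framework or its results.

```typescript
// Hypothetical label set assigned by a domain expert to each response.
type ExpertLabel = "correct" | "incorrect" | "hallucinated";

interface LabeledResponse {
  questionId: string;
  label: ExpertLabel;
}

interface EvalSummary {
  total: number;
  accuracy: number;          // share of responses labeled "correct"
  hallucinationRate: number; // share of responses labeled "hallucinated"
}

// Summarize expert labels into the two headline metrics.
function summarizeLabels(labels: LabeledResponse[]): EvalSummary {
  const total = labels.length;
  if (total === 0) {
    return { total: 0, accuracy: 0, hallucinationRate: 0 };
  }
  const correct = labels.filter((l) => l.label === "correct").length;
  const hallucinated = labels.filter((l) => l.label === "hallucinated").length;
  return {
    total,
    accuracy: correct / total,
    hallucinationRate: hallucinated / total,
  };
}

// Made-up sample, for illustration only.
const sampleSummary = summarizeLabels([
  { questionId: "q1", label: "correct" },
  { questionId: "q2", label: "correct" },
  { questionId: "q3", label: "hallucinated" },
  { questionId: "q4", label: "correct" },
]);
```

Tracking hallucinations as a separate label, rather than folding them into "incorrect", is what makes a hallucination rate reportable on its own.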
Key Takeaways
- Healthcare chatbots need PII handling + compliance from day 1, not as an afterthought
- Conversational AI + domain expertise = better outcomes than AI alone
- Users prefer honest "I don't know" over confident hallucination
- Continuous human review (1% of queries) catches drift early
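The 1% human-review sample could be drawn deterministically by hashing the query ID, so the same query always gets the same decision and the sample can be audited later. This is an assumed approach, not a description of the project's sampler; the function names are hypothetical, and the hash is an FNV-1a style hash computed over UTF-16 code units.

```typescript
// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// A query is selected when its hash bucket (0..99) falls below the rate.
function selectedForReview(queryId: string, ratePercent: number = 1): boolean {
  return fnv1a(queryId) % 100 < ratePercent;
}

// Count how many of a batch of IDs would be routed to human review.
function countSelected(queryIds: string[], ratePercent: number = 1): number {
  return queryIds.filter((id) => selectedForReview(id, ratePercent)).length;
}
```

Unlike `Math.random()`, a hash-based sampler needs no stored state to answer "was this query reviewed?" after the fact.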
Code & Resources
GitHub Repository: github.com/AntonGlenbovitch/health-insurance-qa
Includes:
- React chat UI component
- Node.js/Express backend
- RAG pipeline setup (LangChain + Pinecone)
- PII detection and redaction
- Evaluation framework integration
- AWS Lambda deployment
- Unit and integration tests
Questions About Healthcare AI?
Want to discuss healthcare chatbots, compliance requirements, or conversational AI architecture? Let's talk.