Health Insurance Member Q&A Chatbot

Full-stack conversational AI • React + Node.js + RAG • Healthcare domain • AWS deployment

Overview

A full-stack conversational AI system that answers health insurance members' questions about coverage, claims, benefits, and enrollment. It combines a React frontend, a Node.js backend, and a RAG pipeline to deliver accurate, compliant responses in real time.

Problem Statement

  • Support Volume: Members submit 50K+ questions per month, more than the current support team can handle.
  • Response Time: The average wait is 24 hours; members need instant answers.
  • Cost: Each support agent has a fully loaded cost of $75K/year, so the per-query cost must come down.
  • Compliance: The healthcare domain requires careful PII handling, regulatory accuracy, and clear disclaimers.
  • Consistency: Different agents give different answers to the same question; the system must answer consistently.

Solution: Conversational AI on AWS

  • Frontend: React app with real-time chat UI, typing indicators, message history
  • Backend: Node.js + Express, REST API, WebSocket support for real-time updates
  • AI Layer: RAG pipeline (LangChain + Pinecone + GPT-4) for member question answering
  • Infrastructure: AWS Lambda, API Gateway, DynamoDB for scalability

Architecture

System Design

┌─────────────────┐
│   React Frontend│ (Chat UI, message history, typing indicators)
└────────┬────────┘
         │ WebSocket
         ▼
┌─────────────────────────────────────┐
│  Node.js/Express Backend API        │
│ (Request validation, auth, logging) │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│  RAG Pipeline (LangChain)           │
│  1. Vector retrieval (Pinecone)     │
│  2. Generate response (GPT-4)       │
│  3. Guardrails (PII detection)      │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│  AWS Infrastructure                 │
│  • Lambda (compute)                 │
│  • DynamoDB (conversation history)  │
│  • CloudWatch (logging/monitoring)  │
└─────────────────────────────────────┘
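
The three RAG stages in the diagram can be sketched as a single function. This is a minimal illustration, not the project's code: the `Retriever`, `AnswerGenerator`, and `Guardrail` interfaces and the stub implementations are hypothetical names introduced here so the sketch runs without Pinecone or GPT-4.

```typescript
// Hypothetical interfaces; a real deployment would back these with
// a vector store (Pinecone) and an LLM (GPT-4).
interface Retriever {
  retrieve(question: string, topK: number): Promise<string[]>;
}
interface AnswerGenerator {
  generate(question: string, context: string[]): Promise<string>;
}
interface Guardrail {
  apply(answer: string): string;
}

async function answerQuestion(
  question: string,
  retriever: Retriever,
  generator: AnswerGenerator,
  guardrails: Guardrail[],
  topK = 3,
): Promise<string> {
  // 1. Vector retrieval: fetch the most relevant plan passages.
  const passages = await retriever.retrieve(question, topK);
  // 2. Generation: draft an answer grounded in those passages.
  const draft = await generator.generate(question, passages);
  // 3. Guardrails: run every check over the draft, in order.
  return guardrails.reduce((text, guardrail) => guardrail.apply(text), draft);
}

// Stubs with placeholder text so the sketch is runnable offline.
const stubRetriever: Retriever = {
  retrieve: async (_question, topK) =>
    ["Placeholder passage A.", "Placeholder passage B."].slice(0, topK),
};
const stubGenerator: AnswerGenerator = {
  generate: async (_question, context) => `Based on your plan documents: ${context[0]}`,
};
// Illustrative guardrail: mask anything shaped like a US SSN.
const maskSsn: Guardrail = {
  apply: (answer) => answer.replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[REDACTED]"),
};
```

Passing the stages in as parameters keeps each one replaceable in tests, which is one reasonable way to structure such a pipeline, though not necessarily how the repository does it.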

Tech Stack

React 18 • TypeScript • Node.js • Express • LangChain • Pinecone • GPT-4 • AWS Lambda • DynamoDB • WebSocket

Key Technical Decisions

1. Full-Stack Implementation (React + Node.js)

Why: Building the chat UI from scratch saves $100K+ compared with third-party services (Intercom, Drift, etc.). Using TypeScript across both React and Node.js provides type safety and lets the frontend and backend share code.

2. WebSocket for Real-Time Responses

Why: HTTP polling causes delays and high server load. WebSocket enables streaming responses (show text as it's generated), improving perceived performance.
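
To illustrate streaming, the sketch below shows one possible wire format and the client-side logic that appends text as frames arrive. The `ChatFrame` shape, `encodeFrame`, and `applyFrame` are assumptions made for this example; the project's actual message format is not described here.

```typescript
// Hypothetical wire format for streamed chat frames.
type ChatFrame =
  | { type: "token"; messageId: string; text: string }
  | { type: "done"; messageId: string };

// Server side: serialize each generated token as soon as it is available.
function encodeFrame(frame: ChatFrame): string {
  return JSON.stringify(frame);
}

// Client side: state of the message currently being rendered in the chat UI.
interface MessageState {
  text: string;
  complete: boolean;
}

// Fold one incoming frame into the message state.
function applyFrame(state: MessageState, raw: string): MessageState {
  const frame = JSON.parse(raw) as ChatFrame;
  if (frame.type === "token") {
    return { text: state.text + frame.text, complete: false };
  }
  return { ...state, complete: true };
}

// Simulate an answer arriving token by token, followed by a "done" marker.
const streamedFrames = ["Your ", "plan ", "covers ", "this."].map((text) =>
  encodeFrame({ type: "token", messageId: "m1", text }),
);
streamedFrames.push(encodeFrame({ type: "done", messageId: "m1" }));

const initialState: MessageState = { text: "", complete: false };
const finalState = streamedFrames.reduce(applyFrame, initialState);
```

In a React component, `applyFrame` could serve as a reducer driven by the WebSocket `message` event, so the UI re-renders after every token instead of waiting for the full answer.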

3. RAG for Domain-Specific Knowledge

Why: GPT-4 alone hallucinates about plan benefits. RAG retrieves the relevant plan documents and grounds each answer in them, which reduced the hallucination rate from 15% to <2%.
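
The retrieval step can be illustrated with an in-memory stand-in for the vector store. This sketch assumes cosine similarity over precomputed embeddings; `PlanChunk`, `topKChunks`, `buildPrompt`, and the toy three-dimensional vectors are hypothetical and only show the idea, not the Pinecone or LangChain APIs.

```typescript
interface PlanChunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors; returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Rank chunks by similarity to the query embedding and keep the best k.
function topKChunks(query: number[], chunks: PlanChunk[], k: number): PlanChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Build a prompt that restricts the model to the retrieved excerpts.
function buildPrompt(question: string, context: PlanChunk[]): string {
  const sources = context.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return (
    "Answer using only the plan excerpts below. " +
    "If they do not contain the answer, say you do not know.\n\n" +
    `${sources}\n\nQuestion: ${question}`
  );
}

// Toy data: placeholder text and hand-made embeddings.
const planChunks: PlanChunk[] = [
  { id: "deductible", text: "Placeholder deductible text.", embedding: [1, 0, 0] },
  { id: "dental", text: "Placeholder dental text.", embedding: [0, 1, 0] },
  { id: "enrollment", text: "Placeholder enrollment text.", embedding: [0, 0, 1] },
];
const retrieved = topKChunks([0.9, 0.1, 0], planChunks, 2);
```

Instructing the model to answer only from the excerpts is what ties generation back to plan documents; the exact prompt wording here is illustrative.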

4. PII Detection & Redaction

Why: Members ask about personal claims data. The system detects PII (member IDs, SSNs, dates) and redacts it from responses, in line with HIPAA requirements.
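
A minimal redaction pass might look like the following. The patterns are assumptions: the `MBR-` member ID format is invented for illustration, and regular expressions alone are unlikely to be sufficient for production PII detection.

```typescript
interface PiiRule {
  label: string;
  pattern: RegExp;
}

// Illustrative patterns only; the member ID format is hypothetical.
const PII_RULES: PiiRule[] = [
  { label: "SSN", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { label: "MEMBER_ID", pattern: /\bMBR-\d{8}\b/g },
  { label: "DATE", pattern: /\b\d{1,2}\/\d{1,2}\/\d{4}\b/g },
];

// Replace each match with its label and record what was found,
// so the audit log can note the redaction without storing the value.
function redactPii(text: string): { redacted: string; found: string[] } {
  const found: string[] = [];
  let redacted = text;
  for (const { label, pattern } of PII_RULES) {
    redacted = redacted.replace(pattern, () => {
      found.push(label);
      return `[${label}]`;
    });
  }
  return { redacted, found };
}

const sample = redactPii("Member MBR-12345678 (SSN 123-45-6789) was seen on 3/14/2024.");
// sample.redacted === "Member [MEMBER_ID] (SSN [SSN]) was seen on [DATE]."
```

Returning the list of detected labels alongside the redacted text makes it possible to log that a redaction happened without persisting the sensitive value itself.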

5. Guardrails: Conservative Response Strategy

Why: In healthcare, saying "I don't know" is better than guessing. The system flags uncertain responses and offers escalation to a human agent.
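
One way to express this strategy is a decision function that refuses to answer without supporting evidence. The threshold value, the `DraftAnswer` fields, and the reason codes below are hypothetical; how the project actually measures uncertainty is not stated here.

```typescript
interface DraftAnswer {
  text: string;
  retrievalScore: number; // similarity of the best retrieved passage, 0..1
  citedSources: number; // number of plan passages the answer relies on
}

type Decision =
  | { action: "answer"; text: string }
  | { action: "escalate"; text: string; reason: string };

// Hypothetical threshold; in practice it would be tuned on labeled data.
const MIN_RETRIEVAL_SCORE = 0.75;

const ESCALATION_MESSAGE =
  "I'm not certain about this one. Would you like me to connect you with an agent?";

function decide(draft: DraftAnswer): Decision {
  // No supporting passage at all: never answer from the model alone.
  if (draft.citedSources === 0) {
    return { action: "escalate", text: ESCALATION_MESSAGE, reason: "no_supporting_source" };
  }
  // Weak retrieval match: the passages may not actually cover the question.
  if (draft.retrievalScore < MIN_RETRIEVAL_SCORE) {
    return { action: "escalate", text: ESCALATION_MESSAGE, reason: "low_retrieval_confidence" };
  }
  return { action: "answer", text: draft.text };
}
```

Recording a reason code with each escalation also gives operations a way to see which topics trigger hand-offs most often.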

Key Features

For Members

  • 24/7 Availability: Instant answers anytime, no wait times
  • Clear Answers: Plain English explanations of coverage, benefits, claims
  • Escalation: Uncertain answers are escalated to a human agent, with the conversation context attached
  • Privacy: PII is handled securely; responses don't leak personal data
  • Conversation History: Members can revisit past questions

For Operations

  • Reduced Support Load: System handles 60% of routine questions (coverage, claims, enrollment)
  • Cost Reduction: $0.04/query vs. $24 for agent handling (~600x cheaper)
  • Escalation Insights: Track which topics confuse members; update plan docs
  • Compliance Audit Trail: All queries logged for regulatory review
  • Monitoring Dashboard: Accuracy, satisfaction, response time metrics
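
The cost comparison above reduces to simple arithmetic, sketched here with the per-query figures quoted in this write-up. The blended-cost function is an illustration under a simplifying assumption (a question is handled either by the bot or by an agent, never both), not a figure reported by the project.

```typescript
// Figures quoted in this write-up.
const AGENT_COST_PER_QUERY = 24.0; // USD
const AI_COST_PER_QUERY = 0.04; // USD
const AI_HANDLED_SHARE = 0.6; // share of routine questions answered by the bot

// How many times cheaper the bot is per query.
function costRatio(agentCost: number, aiCost: number): number {
  return agentCost / aiCost;
}

// Average cost per routine question when a share is answered by the bot
// and the remainder still goes to an agent (simplifying assumption:
// escalated questions incur only the agent cost).
function blendedCost(agentCost: number, aiCost: number, aiShare: number): number {
  return aiShare * aiCost + (1 - aiShare) * agentCost;
}

const ratio = costRatio(AGENT_COST_PER_QUERY, AI_COST_PER_QUERY); // about 600
const blended = blendedCost(AGENT_COST_PER_QUERY, AI_COST_PER_QUERY, AI_HANDLED_SHARE); // about 9.62
```

Under that assumption, the average cost of a routine question falls from $24 to roughly $9.62, since 0.6 × $0.04 + 0.4 × $24 = $9.624.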

Results & Impact

Quantified Metrics

Metric               Value
Response Accuracy    92% (on gold-standard test set)
User Satisfaction    4.2/5.0 (200+ user ratings)
Hallucination Rate   <2% (continuous monitoring)
Query Cost           $0.04 (vs. $24 agent cost)
Response Time        <2 seconds
Questions Handled    60% of routine questions

Business Impact

  • Support Cost Reduction: AI handles 60% of routine questions at $0.04 per query instead of $24 per agent-handled query
  • Response Time: 24 hours → under 2 seconds
  • Member Satisfaction: 4.2/5.0 rating; 87% prefer AI for quick answers
  • Support Team Capacity: Agents now focus on complex cases and escalations, handling 3x more nuanced issues

Lessons Learned

What Worked

  • Conservative response strategy: Better to say "I'm uncertain" than hallucinate. Builds trust.
  • Domain-specific evaluation: Generic chatbot metrics (BLEU, ROUGE) don't capture insurance accuracy. Used domain expert labels.
  • Escalation workflow: The system hands off to a human agent gracefully. Members appreciate the option.
  • Monitoring from day 1: Caught hallucinations early with continuous evaluation.
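
Domain-expert labels can be turned into the accuracy and hallucination metrics with a small summary function. The label set and the `GradedItem` shape are assumptions for illustration; the sample data is made up and does not reproduce the project's test set.

```typescript
// Hypothetical label set assigned by a domain expert to each bot answer.
type ExpertLabel = "correct" | "incorrect" | "hallucinated";

interface GradedItem {
  questionId: string;
  label: ExpertLabel;
}

interface EvalSummary {
  accuracy: number; // share of answers labeled correct
  hallucinationRate: number; // share of answers labeled hallucinated
}

function summarize(items: GradedItem[]): EvalSummary {
  if (items.length === 0) {
    return { accuracy: 0, hallucinationRate: 0 };
  }
  const count = (label: ExpertLabel): number =>
    items.filter((item) => item.label === label).length;
  return {
    accuracy: count("correct") / items.length,
    hallucinationRate: count("hallucinated") / items.length,
  };
}

// Made-up sample: three correct answers and one hallucination.
const graded: GradedItem[] = [
  { questionId: "q1", label: "correct" },
  { questionId: "q2", label: "correct" },
  { questionId: "q3", label: "hallucinated" },
  { questionId: "q4", label: "correct" },
];
const summary = summarize(graded);
```

Keeping "hallucinated" as its own label, separate from "incorrect", is what allows the hallucination rate to be tracked independently of overall accuracy.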

Key Takeaways

  1. Healthcare chatbots need PII handling + compliance from day 1, not as an afterthought
  2. Conversational AI + domain expertise = better outcomes than AI alone
  3. Users prefer honest "I don't know" over confident hallucination
  4. Continuous human review (1% of queries) catches drift early
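
Sampling 1% of queries for human review can be made deterministic by hashing the query ID, so the same query is always either in or out of the sample. This is a sketch under assumptions: the FNV-1a-style hash over UTF-16 code units and the `shouldReview` helper are illustrative choices, not the project's sampling method.

```typescript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map the hash to [0, 1) and compare with the sampling rate.
// The same queryId always yields the same decision.
function shouldReview(queryId: string, rate = 0.01): boolean {
  return fnv1a(queryId) / 0x100000000 < rate;
}

// Select the queries from a batch that go to the human review queue.
function selectForReview(queryIds: string[], rate = 0.01): string[] {
  return queryIds.filter((id) => shouldReview(id, rate));
}
```

Hash-based sampling avoids storing a separate "sampled" flag and lets a reviewer reproduce the selection later from the query log alone.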

Code & Resources

GitHub Repository: github.com/AntonGlenbovitch/health-insurance-qa

Includes:

  • React chat UI component
  • Node.js/Express backend
  • RAG pipeline setup (LangChain + Pinecone)
  • PII detection and redaction
  • Evaluation framework integration
  • AWS Lambda deployment
  • Unit and integration tests

Questions About Healthcare AI?

Want to discuss healthcare chatbots, compliance requirements, or conversational AI architecture? Let's talk.

Email: a.glenbovitch@gmail.com