Anton Glenbovitch
Senior AI Engineer • LLM Systems • RAG • AWS Architecture
I design and deploy production-grade AI systems using LLMs and retrieval architectures, with a focus on reliability, performance, cost optimization, and real-world deployment.
20+ Years Building Enterprise Systems
Backend architecture, data systems, and AI applications. Experience at Yale University, Health Net, and independent consulting.
Core Expertise
- LLM Systems: Production RAG, prompt engineering, agent orchestration
- Retrieval Architecture: Vector databases, hybrid search, evaluation frameworks
- AWS Deployment: Lambda, Bedrock, OpenSearch, cost optimization
- System Design: Reliability, observability, guardrails against hallucination
Engineering Philosophy
In production AI systems, the main challenges lie not in model selection but in:
- Data quality and preparation
- Retrieval accuracy and ranking
- System design and integration
- Evaluation and continuous monitoring
Reliable systems require guardrails, evaluation metrics, and rigorous testing to control hallucinations and maintain consistency at scale.
Featured Projects
Production-grade AI systems showcasing RAG architecture, retrieval optimization, and scalable deployment.
Enterprise Claim AI Platform
Overview: Production AI system for analyzing insurance claims using retrieval-augmented generation (RAG). Processes structured and unstructured claim data, extracts key information, and generates summaries with 94% accuracy.
Key Technical Decisions
- Hybrid Retrieval: Combined semantic search + BM25 ranking for an 8% accuracy improvement over a semantic-only approach
- Evaluation Framework: Automated metrics (ROUGE, BERTScore) + human QA labels for ground truth validation
- Cost Optimization: Prompt caching (30% reduction), batch processing, cheaper embedding models
- Guardrails: Hallucination detection, conservative "I don't know" responses for out-of-domain queries
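As an illustration of the hybrid retrieval decision, here is a minimal sketch of one common way to merge a BM25 ranking with a semantic ranking: reciprocal rank fusion. The fusion method, the function name, the document ids, and the constant k=60 are assumptions made for this example; the platform's actual combination logic is not described here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (an assumption
    here, not a tuned value).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two retrievers for a single query.
bm25_ranking = ["claim_17", "claim_03", "claim_42"]
semantic_ranking = ["claim_03", "claim_99", "claim_17"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores and embedding similarities onto a shared scale, which is one reason it is often used as a first baseline for hybrid search.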
Impact
- Reduced claim analysis time: 2 hours → 10 minutes per claim
- Cost savings: $1.2M annually (processing cost reduction)
- Team capacity: Handle 3× more claims with same headcount
- Quality: 94% accuracy on hold claim classifications
RAG Evaluation Framework
Overview: Comprehensive evaluation framework for assessing RAG pipeline quality. Combines automated metrics with human-in-the-loop validation to measure retrieval accuracy, generation quality, and hallucination rates.
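A rough sketch of the automated side of such a framework, scoring retrieval against human-labeled relevance judgments with recall@k and mean reciprocal rank. The metric choice, function names, and sample data are illustrative assumptions rather than the framework's actual implementation.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the human-labeled relevant ids found in the top k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical labeled set: retriever output plus human relevance labels.
labeled_queries = [
    {"retrieved": ["d1", "d2", "d3"], "relevant": ["d2", "d9"]},
    {"retrieved": ["d4", "d5", "d6"], "relevant": ["d4"]},
]

n = len(labeled_queries)
mean_recall = sum(recall_at_k(q["retrieved"], q["relevant"], 3) for q in labeled_queries) / n
mrr = sum(reciprocal_rank(q["retrieved"], q["relevant"]) for q in labeled_queries) / n
print(f"recall@3={mean_recall:.2f}  MRR={mrr:.2f}")
```

Metrics like these cover retrieval only; generation quality and hallucination rates still need the human-in-the-loop review described above.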
Health Insurance Member Q&A Chatbot
Overview: Full-stack conversational AI system that answers health insurance members' questions about coverage, claims, and benefits. Combines a React frontend, a Node.js backend, and a RAG pipeline for accurate, compliant responses.
Recent Articles
Technical deep dives on RAG, LLM systems, and production AI architecture.
Building Production RAG: Cost Optimization Strategies
How to reduce RAG pipeline costs by 66% without sacrificing quality. Covers prompt caching, embedding model selection, batch processing, and cost-per-query optimization.
Read on Medium →
Evaluating RAG Systems: Beyond Automated Metrics
Why automated metrics alone fail for RAG evaluation. The case for human-in-the-loop validation, building labeled datasets, and continuous monitoring in production.
Read on Dev.to →
Conversational AI in Healthcare: Domain-Specific Challenges
Lessons from building health insurance chatbots. PII handling, regulatory compliance, conservative response strategies, and maintaining accuracy in regulated domains.
Read on Medium →
Architecture Lessons from 20 Years: Systems Thinking at Scale
Lessons from building systems as they evolved from monoliths to microservices to serverless. Why governance matters, cost is architecture, and observability is a first-class concern.
Read on Dev.to →
About
Experience
20+ years building enterprise systems across backend platforms, data systems, and AI applications.
- ITA Consulting (2025–present): Strategic consultant on technology & operations for healthcare, manufacturing, defense
- Yale University (2013–2025): Led architecture decisions and enterprise transformation initiatives serving 15K+ users
- Health Net (1999–2010): Architected enterprise backend systems for claims processing, member/provider workflows
Core Strengths
- Technical: 20+ years systems architecture, Java/Python, AWS, databases, microservices
- AI/LLM: 1+ year production RAG, LangChain, vector DBs, evaluation frameworks
- Domain: 10 years health insurance (claims, compliance, enterprise integration)
- Leadership: Cross-functional team leadership, stakeholder communication, technical strategy
Let's Work Together
Interested in building production AI systems? Have a RAG or LLM architecture question? Let's talk.
Email: a.glenbovitch@gmail.com
Phone: (203) 540-7348