About Me
I’m a senior AI leader and applied researcher with 16+ years building production-grade NLP systems at scale. My focus is information retrieval (IR) and retrieval-augmented generation (RAG). At Capital One, I lead the NLP/IR areas of agent-assist, which include dynamic hybrid retrieval (BM25 + dense + SPLADE), cross-encoder re-ranking (monoT5/monoBERT/CE), ranked fusion, and user-preference re-ranking (semantic-entropy/nucleus thresholds), supporting more than 21K agents across multiple lines of business.
Previously at Philips Research, I directed cross-functional teams delivering healthcare AI from lab to product, transferring 10+ technologies into production. My research portfolio includes 15+ granted patents, 35+ filings, and 30+ peer-reviewed publications spanning robust QA, multimodal learning (VQA-Med), and knowledge-graph–driven inference.
I operate end-to-end: problem framing → data/ETL → modeling → evaluation → rollout. Tooling includes PyTorch, Hugging Face (Transformers/PEFT: LoRA/QLoRA), Sentence-Transformers, Faiss/Milvus/pgvector, OpenSearch/Elasticsearch, PyKEEN/Neo4j, and modern serving stacks (vLLM, TGI, Triton, TensorRT-LLM) with mixed precision, quantization (8/4-bit), FlashAttention, and KV-cache optimizations. On the platform side: MLflow/W&B, Airflow, Docker/Kubernetes, Ray/Dask, and production observability (Prometheus/Grafana/ELK) with quality SLOs, drift monitors, and A/B harnesses.
My current interests center on trustworthy LLMs—meaning-aware uncertainty (e.g., semantic entropy/MARS-style scoring), confidence calibration, hallucination mitigation, and provenance tracing that distinguishes pretraining vs. context-driven knowledge in RAG. The throughline in my work is turning cutting-edge research into reliable, interpretable, cost-efficient systems that create measurable business and user impact.
Education
Ph.D., Computer Science, University of Memphis, TN — 2009–2014
M.S., Computer Science, University of Memphis, TN — 2006–2008
B.Tech., Electronics & Communication Engg., JNTU, India — 2001–2005
Core Strengths
- IR/RAG Architecture: Hybrid retrieval (BM25 + dense + ColBERT), cross‑encoder re‑ranking, late fusion, user‑preference boosting, entropy/nucleus thresholding, dynamic index routing, document chunking & windowing, query rewriting & intent routing.
- Trust & Safety for LLMs: Abstention, refusal sensitivity vs. utility trade‑offs, semantic entropy, MARS‑style meaning‑aware scoring, DetectGPT‑style detectors, confidence calibration, provenance tracing for pretrain vs. context attribution.
- Scalable ML Delivery: Model/product roadmaps, experiment platforms, evaluation harnesses, guardrails, CI/CD for models, observability (drift, latency, quality), on‑call runbooks.
- Leadership: Hiring/mentoring, stakeholder alignment, value stream ownership (~$2M), program management across research → product.
Technical Stack
- Languages: Python, SQL, Java, JavaScript, Bash, C, MATLAB
- DL/ML: PyTorch, TensorFlow, JAX (intro), Hugging Face Transformers/PEFT, Sentence‑Transformers, scikit‑learn, XGBoost
- LLMs/Serving: Llama‑2/3, Mistral, GPT‑4/4o/4.1, T5/Flan, vLLM, Text Generation Inference (TGI), TensorRT‑LLM, Triton Inference Server
- Fine‑Tuning: LoRA/QLoRA, Adapters/IA3, P‑tuning, Prefix/Prompt/Instruction tuning; mixed‑precision, gradient/ZeRO sharding
- Optimization: Quantization (8‑/4‑bit), FlashAttention, speculative decoding, KV‑cache management
- RAG & IR: Faiss, Milvus, pgvector, Elasticsearch/OpenSearch, ColBERT; monoT5/monoBERT/CE‑rerankers; query rewriting, HyDE/self‑ask, long‑context chunking
- Graphs: PyKEEN (RotatE/QuatE/PairRE/HousE), NetworkX, Neo4j
- MLOps/Platforms: MLflow, Weights & Biases, Kubeflow, Airflow, Docker, Kubernetes, Ray/Dask, GitHub Actions/CI
- Data/ETL: Pandas, NumPy, Apache Arrow, Spark, Kafka, Hadoop, RabbitMQ
- Monitoring: Prometheus, Grafana, ELK; human‑in‑the‑loop annotation (Label Studio, Prodigy)
- Cloud/Infra: AWS (EC2, EKS, S3, SageMaker), Azure, GCP; serverless patterns
Selected Impact & Achievements
- Capital One Agent‑Assist (IR/RAG): Led the IR team in delivering a state‑of‑the‑art RAG platform for call‑center agents; deployed cross‑encoder re‑rankers, hybrid retrieval, and ranked fusion with entropy‑gated answerability, improving top‑k precision and reducing handle time (HT) and average‑handle‑time (AHT) variance.
- Trustworthy Responses: Shipped abstention & uncertainty pipelines that reduced unsafe generations while maintaining task utility; introduced user‑preference re‑ranking and dynamic profile routing.
- Tech Transfer (Philips): Drove 10+ research‑to‑product transfers (clinical de‑identification, knowledge‑graph‑assisted diagnosis, DSP assets, ICON semantic search).
- IP & Publications: 15 granted patents, 35+ filed, 64+ invention disclosures; 30+ publications (NAACL, COLING, AAAI, WWW, BHI, MLHC, TREC).
- Awards: Circle of Excellence (2025)—Capital One’s highest honor; CIO Elite (2023); TechX (2023).