RAG Extraction Pipeline for a Power Utility

THE CHALLENGE

A major power utility needed equipment-qualification data extracted from thousands of dense engineering documents at high accuracy. The existing manual, spreadsheet-driven process was slow and let errors through silently, which is unacceptable in a safety-critical engineering context.

OUR APPROACH

We built a retrieval-augmented extraction pipeline on Azure OpenAI, paired it with a deterministic verification tool, and made evaluation a first-class part of delivery. A DeepEval suite combined deterministic scoring with LLM-judged faithfulness and hallucination checks. The suite was gated in CI, so any regression in evaluation scores fails the build before the change can ship. A companion verification pass cross-checks every extracted value against source data.
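To make the verification pass and the CI gate concrete, here is a minimal sketch in plain Python. It is not the project's code and does not use the DeepEval API. The record IDs, field names, normalization rule, and baseline threshold are all hypothetical, chosen only to illustrate a deterministic cross-check followed by a build-failing gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    record_id: str
    field: str
    extracted: str
    expected: str


def normalize(value: object) -> str:
    """Collapse whitespace and case so formatting noise is not flagged (assumed rule)."""
    return " ".join(str(value).split()).casefold()


def verify(extracted: dict, source: dict) -> list:
    """Compare every extracted field with the source record that has the same id."""
    mismatches = []
    for record_id, fields in extracted.items():
        reference = source.get(record_id, {})
        for field, value in fields.items():
            expected = reference.get(field)
            if expected is None or normalize(value) != normalize(expected):
                expected_text = "<missing>" if expected is None else str(expected)
                mismatches.append(Mismatch(record_id, field, str(value), expected_text))
    return mismatches


def accuracy(extracted: dict, source: dict) -> float:
    """Share of extracted fields that match the source after normalization."""
    total = sum(len(fields) for fields in extracted.values())
    if total == 0:
        return 0.0
    return 1 - len(verify(extracted, source)) / total


def gate(current: float, baseline: float) -> None:
    """Exit with a non-zero status, failing the CI job, if accuracy drops below baseline."""
    if current < baseline:
        raise SystemExit(f"accuracy {current:.3f} is below baseline {baseline:.3f}")


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    extracted = {
        "EQ-001": {"rated_voltage": "480 V", "qualified_life": "40 years"},
        "EQ-002": {"rated_voltage": "4160 V"},
    }
    source = {
        "EQ-001": {"rated_voltage": "480 v", "qualified_life": "40  years"},
        "EQ-002": {"rated_voltage": "4.16 kV"},
    }
    for mismatch in verify(extracted, source):
        print(mismatch)
    gate(accuracy(extracted, source), baseline=0.60)
```

Because the comparison is deterministic, the same inputs always produce the same mismatches, which makes the check suitable as a hard gate. LLM-judged metrics such as faithfulness would run alongside it instead of replacing it.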

THE RESULTS

Extraction accuracy rose from roughly 78% to more than 95% across more than 20,000 records. The verification layer caught roughly 40 incorrect records that the previous tooling had missed, and automation removed about 30 hours of manual work per month. The project anchored a $500K AI initiative and became the template for evaluation-gated AI delivery.

~78%→95%+

Extraction accuracy

20,000+

Records processed

$500K

AI initiative led

BUILT WITH

Python, Azure OpenAI, Azure Document Intelligence, DeepEval, MSSQL, Azure DevOps CI
