Diagram of computer-vision extraction of engineering drawings into structured data
Prior work · led by our founder
Energy-sector engineering team · 2025 · Python · FastAPI · Computer Vision

AI + Computer Vision Drawing Extraction

THE CHALLENGE

Engineering teams had thousands of revision-marked PDF drawings and process diagrams that had to be read by hand to pull out change events, valves, equipment, and design conditions - slow, costly, error-prone work that did not scale.

OUR APPROACH

We built a greenfield extraction platform on a 12-stage pipeline combining LLM structured output (strict Pydantic schemas), Azure Document Intelligence OCR with a Tesseract fallback, and OpenCV-based detection of revision bubbles and clouds. A FastAPI backend with an arq/Redis worker queue drives processing, and an SME-in-the-loop evaluation harness scores every run against domain-expert ground truth with markdown disagreement reports.
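To illustrate what "strict Pydantic schemas" can mean for LLM structured output, here is a minimal sketch using Pydantic v2. The `RevisionEvent` model, its field names, and the `parse_event` helper are hypothetical and chosen only for illustration; they are not the platform's actual schema.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class RevisionEvent(BaseModel):
    """Hypothetical schema for one change event read from a drawing."""

    # strict=True turns off type coercion; extra="forbid" rejects unknown keys.
    model_config = ConfigDict(strict=True, extra="forbid")

    drawing_id: str
    revision: str = Field(min_length=1)
    page: int = Field(ge=1)
    description: str


def parse_event(raw_json: str) -> Optional[RevisionEvent]:
    """Return a validated event, or None if the model output breaks the schema."""
    try:
        return RevisionEvent.model_validate_json(raw_json)
    except ValidationError:
        return None


# Well-formed output validates.
good = parse_event(
    '{"drawing_id": "D-001", "revision": "B", "page": 2, "description": "Valve added"}'
)
# "page" arrives as a string, so strict mode rejects it instead of coercing it.
bad = parse_event(
    '{"drawing_id": "D-001", "revision": "B", "page": "2", "description": "Valve added"}'
)
```

Under these assumptions, rejecting malformed output at the schema boundary means a downstream stage only ever sees typed, complete records, and a rejected response can be retried or flagged for review.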

THE RESULTS

The platform shipped with 491 automated tests and processed 3,000+ drawings across two extraction pipelines on shared infrastructure. The eval harness made extraction quality measurable against expert judgment rather than assumed. Together, they turned a manual reading bottleneck into a structured, reviewable data flow.
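As a rough picture of how an evaluation harness can score a run against expert ground truth and emit a markdown disagreement report, consider the sketch below. It assumes extracted items can be compared as plain tag strings. The function names, the tag values, and the precision/recall metrics are illustrative assumptions, not the harness's actual design.

```python
def score_run(extracted: set[str], expected: set[str]) -> dict[str, float]:
    """Compare extracted tags with expert ground truth (set-based, exact match)."""
    true_positives = len(extracted & expected)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}


def disagreement_report(drawing_id: str, extracted: set[str], expected: set[str]) -> str:
    """Render the items where the run and the expert disagree as markdown."""
    missed = sorted(expected - extracted)    # expert listed it, the run did not
    spurious = sorted(extracted - expected)  # the run listed it, the expert did not
    lines = [f"## {drawing_id}", ""]
    lines += [f"- missed: {tag}" for tag in missed]
    lines += [f"- spurious: {tag}" for tag in spurious]
    if not missed and not spurious:
        lines.append("- no disagreements")
    return "\n".join(lines)


# Hypothetical tags for a single drawing.
extracted = {"V-101", "V-102", "P-200"}
expected = {"V-101", "V-102", "V-103"}

scores = score_run(extracted, expected)
report = disagreement_report("D-001", extracted, expected)
print(scores)
print(report)
```

Listing misses and spurious items per drawing gives a domain expert something concrete to review, rather than a single aggregate score.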

491 · Automated tests passing
3,000+ · Drawings processed
BUILT WITH
Python 3.13 · FastAPI · Pydantic v2 · Azure Document Intelligence · OpenCV · PyMuPDF · React