InstantRecon
Context
UVeye's inspection technology was established in fixed ops — but the company's growth strategy required breaking into variable ops, where trade-in appraisals and reconditioning decisions happen. InstantRecon was the entry point: an LLM-powered reconditioning estimator that converted UVeye's machine-detected vehicle damages into cost estimates, giving variable ops teams a reason to adopt UVeye. I owned the product end-to-end, from vision through delivery.
Challenge
Sales managers at dealerships estimated reconditioning costs manually during trade-in appraisals — a process riddled with missed damages and inconsistent pricing that eroded margins. UVeye's inspection system already detected and catalogued damages from its scan imagery, but converting raw detections into dollar estimates that managers would actually trust was the core AI challenge. The LLM had to produce accurate, explainable cost estimates from unstructured damage descriptions — and the workflow had to fit seamlessly into how managers already worked, or they'd ignore it entirely.
Approach
I began by co-defining the initial LLM prompt with engineers to generate per-damage cost estimates alongside explanatory comments. Early outputs were unreliable — I diagnosed two root causes: the wrong model and insufficient domain context. After switching the underlying model and injecting a structured price reference table mapping car parts, damage types, and severity levels to repair costs, accuracy improved dramatically. I also embedded guardrails directly into the prompt — price range constraints for specific repair categories — to keep estimates within realistic bounds.

To validate quality, I ran iterative feedback sessions with real sales managers at dealerships, tightening the prompt and reference data based on their corrections. We rolled out in three phases — internal dogfooding, a closed beta with selected dealerships, then general availability.

In parallel, I designed the full end-to-end workflow: AI-generated estimates, manager review and editing, branded customer-facing quotes, and delivery via email and SMS.
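The prompt-assembly step can be sketched as follows — a minimal illustration of injecting a price reference table and guardrail constraints into the prompt. All parts, prices, guardrail ranges, and field names here are hypothetical placeholders, not UVeye's actual data or prompt.

```python
# Illustrative sketch: structured domain context (price table) plus
# guardrails (hard price bounds per repair category) injected into the prompt.

PRICE_TABLE = [
    # (part, damage_type, severity, low_usd, high_usd) -- placeholder rows
    ("front bumper", "scratch", "minor", 75, 150),
    ("front bumper", "dent", "moderate", 200, 450),
    ("windshield", "crack", "severe", 250, 500),
]

GUARDRAILS = {
    # repair category -> allowed price range, stated as a prompt constraint
    "paint": (50, 800),
    "glass": (100, 1200),
}

def build_prompt(damages: list[str]) -> str:
    table = "\n".join(
        f"- {part} / {dmg} / {sev}: ${lo}-${hi}"
        for part, dmg, sev, lo, hi in PRICE_TABLE
    )
    rails = "\n".join(
        f"- {cat} repairs must stay within ${lo}-${hi}"
        for cat, (lo, hi) in GUARDRAILS.items()
    )
    damage_list = "\n".join(f"- {d}" for d in damages)
    return (
        "You are a vehicle reconditioning cost estimator.\n"
        "Reference prices (part / damage type / severity):\n"
        f"{table}\n"
        "Constraints:\n"
        f"{rails}\n"
        "For each damage below, return a cost estimate and a one-line rationale:\n"
        f"{damage_list}"
    )
```

The point of the structure: the model grounds each estimate in an explicit price row rather than free-recalling repair costs, and the guardrail lines cap outliers by category.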
Key Decisions
Switched the LLM model after poor initial results, prioritizing estimate accuracy.
The first model the engineering team selected produced unreliable outputs. Rather than over-engineering the prompt to compensate, I pushed to swap the model itself — the replacement delivered noticeably better estimates with the same prompt and context.
Designed manager estimate adjustments as an implicit production eval loop.
When managers edited the AI's estimates up or down, the system logged every adjustment as a continuous signal on model accuracy — without requiring explicit labeling or a separate feedback UI. This turned the natural review workflow into a scalable evaluation mechanism at zero additional effort.
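The logging mechanism behind this implicit eval loop might look like the sketch below. The class, field names, and metric are assumptions for illustration; the actual system's schema is not described in this write-up.

```python
# Sketch of the implicit production eval loop: every manager edit to an
# AI estimate is logged as a (predicted, corrected) pair from the normal
# review workflow -- no separate labeling step or feedback UI.

from dataclasses import dataclass, field

@dataclass
class AdjustmentLog:
    records: list[tuple[float, float]] = field(default_factory=list)

    def record(self, ai_estimate: float, manager_estimate: float) -> None:
        # Called whenever a manager saves an estimate, edited or not.
        self.records.append((ai_estimate, manager_estimate))

    def mean_abs_error(self) -> float:
        # Continuous accuracy signal, computable at any time.
        if not self.records:
            return 0.0
        return sum(abs(a, ) if False else abs(a - m) for a, m in self.records) / len(self.records)

log = AdjustmentLog()
log.record(ai_estimate=300.0, manager_estimate=250.0)   # manager lowered it
log.record(ai_estimate=120.0, manager_estimate=120.0)   # unedited = implicit agreement
print(log.mean_abs_error())  # 25.0
```

Because unedited estimates are logged too, agreement counts as signal alongside corrections, which is what makes the review flow itself a usable evaluation set.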
Used a curated generic price table rather than letting dealers upload their own.
Dealers wanted custom price tables, but supporting per-dealer uploads would have added significant complexity to both the product and the data pipeline. A well-researched generic table got us to accurate-enough estimates at launch, without the maintenance burden of dealer-specific data management.
Impact
- $200–$300 average reconditioning savings per trade-in
- $1,200 average holding cost reduction per vehicle
- Full end-to-end workflow from AI damage detection to customer-ready quote delivery
- Implicit production eval loop via manager adjustments, enabling continuous accuracy improvement without manual labeling
Reflection
The price table was the single biggest unlock — injecting structured domain context into the prompt turned unreliable estimates into ones managers could actually work with. It reinforced that for domain-specific LLM products, the quality of your reference data matters more than prompt sophistication. Looking back, I'd invest earlier in building a golden dataset for offline evaluation alongside the production feedback loop, to measure accuracy improvements more systematically between iterations.
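The offline golden-dataset evaluation mentioned above could be as simple as the harness below — a hypothetical sketch in which expert-agreed costs and the tolerance threshold are illustrative, and the LLM call is stubbed out.

```python
# Sketch of offline evaluation against a golden dataset: fraction of cases
# where the estimator lands within a relative tolerance of the expert cost.

GOLDEN_SET = [
    # (damage description, expert-agreed repair cost in USD) -- placeholder rows
    ("minor scratch, front bumper", 100.0),
    ("moderate dent, rear door", 350.0),
]

def evaluate(estimator, tolerance: float = 0.2) -> float:
    """Return the fraction of golden cases within +/- tolerance of expected."""
    hits = 0
    for description, expected in GOLDEN_SET:
        estimate = estimator(description)
        if abs(estimate - expected) <= tolerance * expected:
            hits += 1
    return hits / len(GOLDEN_SET)

# Stub standing in for the real LLM-backed estimator:
print(evaluate(lambda d: 110.0 if "scratch" in d else 330.0))  # 1.0
```

Run between prompt or reference-data iterations, a metric like this would quantify the accuracy gains that the dealer feedback sessions surfaced only anecdotally.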