OCR

4 articles about OCR.

Your Agent Watches Video Wrong - Keyframe Extraction vs Frame-by-Frame

March 18, 2026 · 2 min read

Frame-by-frame video analysis is wasteful. Keyframe extraction with OCR on key moments gives agents 90% of the information at 5% of the cost.

video-analysis, keyframes, ocr, ai-agents, computer-vision

LLM-Based OCR Is Significantly Outperforming Traditional ML-Based OCR

March 18, 2026 · 2 min read

LLM vision models combined with accessibility APIs are beating traditional OCR for screen reading. The combo of structured data plus visual understanding

ocr, llm-vision, accessibility-api, screen-reading, ai

Accessibility APIs vs OCR - Two Approaches to Desktop Agent Vision

March 17, 2026 · 2 min read

Desktop agents need to see and understand what is on screen. Accessibility APIs give you the UI tree directly while OCR reads pixels. Each approach has real

accessibility-api, ocr, desktop-agent, vision, automation, desktop, agents

Choosing Native Accessibility APIs Over OCR - The Decision Everyone Said Was Wrong

March 17, 2026 · 2 min read

When building a desktop automation project, choosing native accessibility APIs over screenshot-plus-OCR seemed wrong to everyone. It turned out to be the

accessibility-api, ocr, desktop-automation, technical-decisions, native-apis
