OCR
4 articles about OCR.
Your Agent Watches Video Wrong - Keyframe Extraction vs Frame-by-Frame
·2 min read
Frame-by-frame video analysis is wasteful. Keyframe extraction with OCR on key moments gives agents 90% of the information at 5% of the cost.
video-analysis, keyframes, ocr, ai-agents, computer-vision
LLM-Based OCR Is Significantly Outperforming Traditional ML-Based OCR
·2 min read
LLM vision models combined with accessibility APIs are beating traditional OCR for screen reading. The combo of structured data plus visual understanding
ocr, llm-vision, accessibility-api, screen-reading, ai
Accessibility APIs vs OCR - Two Approaches to Desktop Agent Vision
·2 min read
Desktop agents need to see and understand what is on screen. Accessibility APIs give you the UI tree directly while OCR reads pixels. Each approach has real
accessibility-api, ocr, desktop-agent, vision, automation, desktop, agents
Choosing Native Accessibility APIs Over OCR - The Decision Everyone Said Was Wrong
·2 min read
When building a desktop automation project, choosing native accessibility APIs over screenshot-plus-OCR seemed wrong to everyone. It turned out to be the
accessibility-api, ocr, desktop-automation, technical-decisions, native-apis