Pioneering automated GUI interaction with native AI agents
State-of-the-art 0.9B-parameter multimodal OCR model for complex document understanding