MinerU is a document parsing tool that converts PDFs, Office documents, and other complex formats into clean Markdown and JSON optimized for LLM consumption. Built for RAG pipelines, AI agents, and pretraining workflows, it handles layout analysis, OCR, and data extraction to make documents AI-ready.