A fast, easy-to-use, and low-cost inference and serving engine for on-device LLMs and edge AI