oMLX is an LLM inference server built for Apple Silicon Macs, with continuous batching and tiered KV caching. It is designed to support AI coding tools such as Claude Code by keeping models resident in memory and managing context efficiently, and it is controlled through a macOS menu bar interface.
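As a rough illustration of how a coding tool might talk to such a server, the sketch below builds a chat-completion request payload. It assumes oMLX exposes an OpenAI-compatible `/v1/chat/completions` endpoint on localhost; the port, path, and model name here are illustrative assumptions, not documented oMLX values.

```python
import json
import urllib.request

# Assumed endpoint: not confirmed by the oMLX docs, shown for illustration only.
OMLX_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for a local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def send_chat_request(payload: dict) -> bytes:
    """POST the payload to the (assumed) oMLX endpoint and return raw bytes."""
    req = urllib.request.Request(
        OMLX_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


if __name__ == "__main__":
    # Hypothetical model identifier; substitute whatever model the server loaded.
    payload = build_chat_request(
        "mlx-community/Llama-3.1-8B-Instruct-4bit", "Explain KV caching briefly."
    )
    print(json.dumps(payload, indent=2))
```

Because the server keeps the model in memory between requests, repeated calls like this avoid reload latency; the tiered KV cache lets overlapping prompts reuse previously computed context.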