Agent: Cursor, GitHub CopilotLLM: Llama, Mistral#local-ai#llm-inference#openai-compatible#rust#developer-tools
Shimmy is a lightweight, single-binary Rust server that provides 100% OpenAI-compatible endpoints for GGUF and SafeTensors models. It supports hot model swapping, auto-discovery, LoRA, and multiple GPU backends — no Python required. A drop-in local replacement for OpenAI API, perfect for AI developers building with local LLMs.
