💩
Shimmy is a lightweight, single-binary Rust server that provides 100% OpenAI-compatible endpoints for GGUF and SafeTensors models. It supports hot model swapping, auto-discovery, LoRA, and multiple GPU backends — no Python required. A drop-in local replacement for OpenAI API, perfect for AI developers building with local LLMs.