A minimalist, standalone C++ runtime for CPU-only inference with Google's Gemma foundation models. Designed for researchers and developers who want to experiment with LLM inference at the systems level, it features mixed-precision support, weight compression, and portable SIMD optimization.