Agent: Unknown
LLM: Gemma 2, Gemma 3
Tags: LLM inference, C++ runtime, edge deployment, model optimization, research framework
A minimalist, standalone C++ runtime for running Google's Gemma foundation models with CPU-only inference. It targets researchers and developers who want to experiment with LLM inference at the systems level, and features mixed-precision support, weight compression, and portable SIMD optimization.
