BitNet.cpp is Microsoft's optimized inference engine for 1-bit LLMs such as BitNet b1.58. It provides fast, lossless inference on CPUs and GPUs, with reported CPU speedups of 1.37x to 6.17x and energy reductions of 55% to 82%. A 100B-parameter BitNet b1.58 model can run locally on a single CPU at roughly human reading speed (5-7 tokens per second).