Microsoft’s “1‑bit” AI model runs on a CPU only, while matching larger systems

arstechnica.com | Published: 4/18/2025

Summary

The quest for simpler, more efficient AI models has led Microsoft researchers to unveil a neural network whose weights take only three values: -1, 0, or 1. This "ternary" architecture cuts memory and compute requirements while maintaining performance comparable to larger, full-precision models. Earlier approaches typically quantized a model after training, reducing the precision of weights that were originally learned at full precision, which tends to cost accuracy. The new BitNet b1.58b is instead natively trained at scale with ternary weights, avoiding that performance loss and setting a new benchmark for efficient AI development.
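To make the three-value idea concrete, here is a minimal Python sketch of what ternary weights look like. It is an illustration under assumptions, not the researchers' method: the helper names `ternarize` and `bits_per_weight`, the sample weights, and the choice to scale by the mean absolute value before rounding are all hypothetical. It also rounds existing weights after the fact, whereas the article stresses that the model is trained natively with ternary weights. The second helper shows why three values correspond to roughly 1.58 bits per weight.

```python
import math


def ternarize(weights, eps=1e-8):
    """Map float weights to values in {-1, 0, 1} plus one shared scale.

    Assumption: scale by the mean absolute value, then round and clip.
    The article does not say how the real model derives its weights.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # round() picks the nearest integer; max/min clip it into [-1, 1].
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def bits_per_weight(num_levels=3):
    """Information needed to encode one of `num_levels` equally likely values."""
    return math.log2(num_levels)


# Hypothetical full-precision weights, for illustration only.
weights = [0.9, -0.05, -1.2, 0.3, 0.02, -0.7]
ternary, scale = ternarize(weights)

print(ternary)                       # every entry is -1, 0, or 1
print(round(bits_per_weight(), 2))   # 1.58
```

With every weight restricted to -1, 0, or 1, multiplying an input by a weight reduces to adding it, subtracting it, or skipping it, which helps explain how such a model can run efficiently on a CPU.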