The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 628
Note The paper introduces BitNet b1.58, a 1-bit LLM variant in which every weight is ternary, taking values in {-1, 0, +1} — hence 1.58 bits per weight, since log2(3) ≈ 1.58. The authors report that, starting from a 3B parameter scale, BitNet b1.58 matches a full-precision (FP16) Transformer LLM of the same size and training tokens in both perplexity and end-task performance, while being significantly more efficient in latency, memory footprint, throughput, and energy consumption. They argue this defines a new scaling law and opens the door to hardware designed specifically for 1-bit LLMs.
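The paper quantizes weights with an "absmean" function: the weight matrix is scaled by its mean absolute value, then each entry is rounded and clipped to {-1, 0, +1}. A minimal NumPy sketch of that scheme (function name and `eps` value are illustrative, not from the paper's code):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    Follows the absmean scheme described in the BitNet b1.58 paper:
    scale by the mean absolute value, then round and clip to [-1, 1].
    """
    gamma = np.abs(w).mean() + eps          # absmean scale of the matrix
    w_q = np.clip(np.round(w / gamma), -1, 1)  # nearest value in {-1, 0, +1}
    return w_q, gamma                        # gamma rescales outputs at inference

# Example: quantize a random weight matrix
w = np.random.randn(4, 4)
w_q, gamma = absmean_ternary_quantize(w)
print(np.unique(w_q))  # only values from {-1, 0, 1}
```

The zero value is what distinguishes b1.58 from earlier binary BitNet: it adds explicit feature filtering, and matrix multiplication against ternary weights reduces to additions and subtractions, avoiding floating-point multiplies.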