Spaces: knoxel/bitnet-b158-cpu-explorer
knoxel committed 4 days ago

Commit 7039448 (verified) · 1 parent: 70d10ec

docs: add cross-link to fast bitnet.cpp version

Files changed (1)

README.md

@@ -22,6 +22,8 @@ short_description: "Chat with Microsoft's 1-bit LLM on CPU — no GPU needed"
 An interactive demo of **Microsoft Research's first open-source native 1-bit Large Language Model**.
+> ⚡ **Want the fast version?** See [knoxel/bitnet-cpp-explorer](https://huggingface.co/spaces/knoxel/bitnet-cpp-explorer) — same model but powered by bitnet.cpp's optimized ternary kernels (4-10× faster).
 ## What makes this special?
 | Feature | Detail |
@@ -44,13 +46,15 @@ Since weights are only -1, 0, or +1, matrix multiplication becomes pure **additi
 - 🏗️ **Architecture** — Visual explainer of how BitNet b1.58 differs from standard Transformers
 - ⚙️ **System** — Live hardware & memory stats
-## ⚠️ Performance note
-This demo uses the `transformers` library, which does **not** include the specialized `bitnet.cpp` kernels. For the paper's reported CPU latency (29ms/token), use [bitnet.cpp](https://github.com/microsoft/BitNet) with the [GGUF weights](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf).
+## Performance note
+This demo uses the `transformers` library (~1.4 tok/s), which does **not** include the specialized bitnet.cpp kernels. For the paper's reported CPU latency (29ms/token = ~34 tok/s), see:
+- ⚡ [Fast version with bitnet.cpp](https://huggingface.co/spaces/knoxel/bitnet-cpp-explorer)
+- 💻 [bitnet.cpp repo](https://github.com/microsoft/BitNet) with the [GGUF weights](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf)
 ## References
-- 📄 [Technical Report](https://arxiv.org/abs/2504.12285)
+- 📄 [Technical Report](https://arxiv.org/abs/2504.12285) — BitNet b1.58 2B4T
+- 📄 [bitnet.cpp Paper](https://arxiv.org/abs/2502.11880) — Optimized inference kernels
 - 🤗 [Model Weights](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T)
 - 💻 [bitnet.cpp](https://github.com/microsoft/BitNet) (38K+ ⭐)
-- 📦 [GGUF Weights](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T-gguf)
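
The performance note in this diff equates 29 ms/token with roughly 34 tok/s and quotes about 1.4 tok/s for the `transformers` path. A minimal sketch of that arithmetic follows; the helper names `tokens_per_second` and `speedup` are hypothetical and not part of the Space's code. The ratio it prints compares the paper's figure against this demo's figure, which may not have been measured on the same hardware, so it should not be read as a like-for-like speedup or as a check on the "4-10×" claim.

```python
def tokens_per_second(latency_ms_per_token: float) -> float:
    """Convert per-token latency (milliseconds) to throughput (tokens/second)."""
    if latency_ms_per_token <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms_per_token


def speedup(fast_tok_s: float, slow_tok_s: float) -> float:
    """Ratio of two throughputs; only meaningful if both share a setup."""
    if slow_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return fast_tok_s / slow_tok_s


# Figures quoted in the README diff above.
paper_tok_s = tokens_per_second(29.0)  # latency reported for bitnet.cpp
demo_tok_s = 1.4                       # throughput quoted for this demo

print(f"paper: {paper_tok_s:.1f} tok/s")
print(f"ratio vs. demo: {speedup(paper_tok_s, demo_tok_s):.1f}x")
```

With the quoted numbers this gives about 34.5 tok/s, consistent with the "~34 tok/s" in the note.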