microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • 17.1k • 1.43k
Read a detailed overview of the FineWeb web‑scale text dataset
The ultimate guide to training LLMs on large GPU clusters
Calculate and visualize memory usage for model training