The Ultra-Scale Playbook 🌌 • Running • 3.79k • The ultimate guide to training LLMs on large GPU clusters
HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5 Sentence Similarity • Updated Mar 13, 2025 • 807 • 65
Transformer Calculator 📊 • Paused • Agents • 36 • Calculate memory, parameters, and FLOPs for transformer models