openai/whisper-large-v3-turbo Automatic Speech Recognition • 0.8B • Updated Oct 4, 2024 • 6.73M • 2.96k
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Text Generation • 71B • Updated Apr 13, 2025 • 8.98k • 2.06k
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 264