Hugging Face
Inference Optimization
community
AI & ML interests
None defined yet.
Recent Activity
ChibuUkachi updated a model about 9 hours ago: inference-optimization/MiniMax-M2.5.w4a16
ChibuUkachi published a model about 9 hours ago: inference-optimization/MiniMax-M2.5.w4a16
dsikka updated a model about 17 hours ago: inference-optimization/llama3-8b-sharegpt-5k-gen
Team members: 15
inference-optimization's models: 230
Sort: Recently updated
inference-optimization/Llama-3.1-8B-Instruct_5.5_bits_mode_noise • 6B • Updated Mar 12 • 12
inference-optimization/Llama-3.1-8B-Instruct_5.5_bits_mode_hybrid • 6B • Updated Mar 12 • 19
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_heuristic • 6B • Updated Mar 12 • 9
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_noise • 6B • Updated Mar 12 • 15
inference-optimization/Llama-3.1-8B-Instruct_5_bits_mode_hybrid • 6B • Updated Mar 12 • 10
inference-optimization/sarvam-105b-FP8-Dynamic • Text Generation • 106B • Updated Mar 9 • 4
inference-optimization/sarvam-30b-FP8-Dynamic • Text Generation • 32B • Updated Mar 9 • 39 • 1
inference-optimization/sarvam-30b-NVFP4 • Text Generation • 19B • Updated Mar 9 • 23 • 1
inference-optimization/sarvam-105b-NVFP4 • 61B • Updated Mar 9 • 3 • 1
inference-optimization/Qwen3.5-35B-A3B-FP8-Dynamic • 35B • Updated Mar 6 • 16
inference-optimization/Kimi-K2-Instruct-0905-BF16-FP8-BLOCK • Text Generation • 1T • Updated Mar 6 • 2
inference-optimization/MiniMax-M2.5-BF16 • Text Generation • 229B • Updated Mar 6 • 108
inference-optimization/gpt-oss-20b-FP8-Dynamic • 21B • Updated Mar 5 • 8
inference-optimization/test_qwen3_next_mtp • Updated Mar 4 • 2
inference-optimization/test_tencentbac_fastmtp • Updated Mar 4 • 1
inference-optimization/Qwen3-30B-A3B-Instruct-2507-NVFP4 • 17B • Updated Mar 4 • 30
inference-optimization/Qwen3-30B-A3B-Instruct-2507-FP8-Dynamic • 31B • Updated Mar 4 • 26
inference-optimization/Qwen3-30B-A3B-Instruct-2507-FP8-Block • 31B • Updated Mar 4 • 3
inference-optimization/Qwen3-Coder-Next.w4a16-old • Text Generation • 12B • Updated Feb 26 • 6
inference-optimization/Kimi-K2-Instruct-0905-BF16-NVFP4 • Updated Feb 24 • 1
inference-optimization/Ministral-3-14B-Instruct-2512-NVFP4 • Updated Feb 4 • 13
inference-optimization/Ministral-3-14B-Instruct-2512.w8a8 • Updated Feb 4
inference-optimization/Ministral-3-14B-Instruct-2512.w4a16 • Updated Feb 3 • 5
inference-optimization/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ-Quant • 5B • Updated Jan 29 • 1
inference-optimization/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ-MSE • 5B • Updated Jan 29
inference-optimization/DeepSeek-V3-debug-multiply-FP8_DYNAMIC • 1B • Updated Jan 24 • 1
inference-optimization/DeepSeek-V3-debug-add-FP8_DYNAMIC • 1B • Updated Jan 24 • 1
inference-optimization/DeepSeek-V3-debug-empty-FP8_DYNAMIC • 1B • Updated Jan 23 • 2.95k
inference-optimization/DeepSeek-V3-debug-multiply-NVFP4A16 • 0.9B • Updated Jan 23
inference-optimization/DeepSeek-V3-debug-add-NVFP4A16 • 0.9B • Updated Jan 23 • 5