- Do NOT use CUDA 13.2 (pinned, #4, opened 3 days ago by danielhanchen)
- Highest performance inference on <8 RTX 6000 Pros setups (#6, opened 2 days ago by curiouspp8)
- Are these the final GGUFs, or are you working on revisions? (1 reply, #5, opened 3 days ago by spanspek)
- IQ4_NL Gibberish in llama.cpp (12 replies, #3, opened 4 days ago by jpsequeira)
- Speed inference UD-IQ2_M (🤯 1, 1 reply, #2, opened 4 days ago by Ukro)
- Any possibility to Re-Quantize GLM-5 quants? (👀 1, 2 replies, #1, opened 4 days ago by elpirater312)