fix: set `clean_up_tokenization_spaces` to `false`
#7 opened 28 days ago by maxsloef
Not able to use it with TGI
1
#5 opened over 1 year ago by Alokgupta96
Does this model only work on CUDA devices with compute capability >= 9.0 or 8.9/ROCm MI300+?
1
#4 opened over 1 year ago by jcfasi
How to run fast inference with FP8
1
#2 opened over 1 year ago by CCRss