宋小猫
SongXiaoMao
AI & ML interests: None yet
Recent Activity
new activity 3 days ago on z-lab/Qwen3.5-27B-DFlash: "FP8 work for base model or is 16-bit of 27B required?"
liked a model 4 days ago: Qwen/Qwen3.5-27B-FP8
liked a model 4 days ago: Qwen/Qwen3.5-122B-A10B-GPTQ-Int4
Organizations: None yet
FP8 work for base model or is 16-bit of 27B required?
14
#2 opened 17 days ago by unoid
Is there anyone who can tell me how to run this model with vLLM correctly?
😔 3
7
#8 opened 14 days ago by beginor
Could you quantize this model to MXFP4? Thank you!!
#3 opened 15 days ago by SongXiaoMao
How do I start this model with vLLM?
2
#4 opened about 1 month ago by SongXiaoMao
This quantized model is amazing
❤️👍 2
5
#1 opened about 1 month ago by hyunw55
Why is the file size of 4bit similar to FP8?
3
#2 opened 18 days ago by SongXiaoMao
Sensitive information is not a question
2
#3 opened 20 days ago by SongXiaoMao
vLLM 0.18.0 runs with an error
#2 opened 20 days ago by SongXiaoMao
I get an error using vLLM 0.18.0
1
#1 opened 23 days ago by SongXiaoMao
Launching with vLLM reports an error
#3 opened 28 days ago by SongXiaoMao
Tokenizer class TokenizersBackend does not exist in vllm v0.17.1
12
#26 opened about 1 month ago by putcn
Can you make a quantized model? Qwen3.5-122B-A10B-GPTQ-Int4
#2 opened about 1 month ago by SongXiaoMao
When will you fix the model replies missing </think>\n start tags?
18
#19 opened about 1 year ago by xldistance
When answering questions in Chinese, the model frequently terminates prematurely (outputs the end token). Is this a common problem?
1
#40 opened about 1 year ago by zhangw355
missing opening <think>
20
#4 opened about 1 year ago by chriswritescode
I tested dynamic 1.58-bit and 2.22-bit; all thoughts are empty?
9
#24 opened about 1 year ago by SongXiaoMao
No think tokens visible
6
#15 opened about 1 year ago by sudkamath
How to Pair with Larger Models
4
#7 opened over 1 year ago by windkkk
multi GPU inferencing
2
#18 opened over 1 year ago by cjj2003