Request for FP8 Quantization of Qworus 3.5 27B
The Qworus 3.5 27B model is very useful, but it doesn't run comfortably within the 64GB of VRAM on my NVIDIA 40-series GPUs. Could you please provide an FP8 version?
Thank you!
Thank you for your support!
I wasn’t able to successfully produce an FP8 version earlier due to some environment conflicts, and I’m really sorry about that. I’ll give it another try soon and aim to release the FP8 version as soon as possible.
Really hoping to see an FP8 version, which would be a great fit for 32GB and 48GB GPUs.
I fixed the original files today and quantized the model to FP8. I haven’t tested it yet, so I’m not sure how stable it is.
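For anyone curious what the FP8 step does: the common recipe is per-tensor symmetric scaling of the weights into the E4M3 range (finite values up to ±448), then rounding each scaled value to the nearest representable FP8 number. This is a minimal pure-Python sketch of that round-trip for illustration only; the actual conversion is done by a quantization toolkit on GPU, and the function names here are made up:

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the E4M3 format

def to_e4m3(x):
    """Round x to the nearest representable FP8 E4M3 value, saturating at the max."""
    if x == 0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), FP8_E4M3_MAX)            # saturate instead of overflowing
    exp = max(math.floor(math.log2(a)), -6)  # -6 is the E4M3 subnormal exponent floor
    step = 2.0 ** (exp - 3)                  # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)

def fp8_round_trip(weights):
    """Per-tensor symmetric quantize-dequantize, as FP8 weight quantization does."""
    scale = max(abs(w) for w in weights) / FP8_E4M3_MAX
    return [to_e4m3(w / scale) * scale for w in weights]
```

The error introduced by the `to_e4m3` rounding is exactly the precision loss people may notice in an untested FP8 quant, which is why stability testing after conversion matters.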