Request for FP8 Quantization of Qworus 3.5 27B
The Qworus 3.5 27B model is very useful, but it doesn't run comfortably within the 64GB of VRAM on my NVIDIA 40-series GPUs. Could you please provide an FP8 version?
Thank you!
Thank you for your support!
I wasn’t able to successfully produce an FP8 version earlier due to some environment conflicts, and I’m really sorry about that. I’ll give it another try soon and aim to release the FP8 version as soon as possible.
Really hoping to see an FP8 version, which would be a great fit for 32GB and 48GB GPUs.
I fixed the original files today and quantized the model to FP8. I haven’t tested it yet, so I’m not sure how stable it is.
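For anyone curious what the FP8 step does: the common recipe is per-tensor symmetric scaling of the weights into the E4M3 range (finite values up to ±448), then rounding each scaled value to the nearest representable FP8 number. This is a minimal pure-Python sketch of that round-trip for illustration only; the actual conversion is done by a quantization toolkit on GPU, and the function names here are made up:

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the E4M3 format

def to_e4m3(x):
    """Round x to the nearest representable FP8 E4M3 value, saturating at the max."""
    if x == 0:
        return 0.0
    sign = math.copysign(1.0, x)
    a = min(abs(x), FP8_E4M3_MAX)            # saturate instead of overflowing
    exp = max(math.floor(math.log2(a)), -6)  # -6 is the E4M3 subnormal exponent floor
    step = 2.0 ** (exp - 3)                  # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)

def fp8_round_trip(weights):
    """Per-tensor symmetric quantize-dequantize, as FP8 weight quantization does."""
    scale = max(abs(w) for w in weights) / FP8_E4M3_MAX
    return [to_e4m3(w / scale) * scale for w in weights]
```

The error introduced by the `to_e4m3` rounding is exactly the precision loss people may notice in an untested FP8 quant, which is why stability testing after conversion matters.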