Eval request: HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive

#590
by mildsarcasm - opened

Bumping this to suggest the entire series

I currently get vllm/transformers errors like "ValueError: GGUF model with architecture qwen35 is not supported yet." when trying to run this. Since HauhauCS only uploads ggufs, I'll have to wait to test it.

Would your life be easier using llama.cpp in circumstances like that, or do you mostly deal with models that don't fit in your GPU memory? (I suppose the potential for workflow changes could be non-zero, depending on how you're calling vLLM.)

Yeah, I'd have to change some code to get ggufs running with llama.cpp. It'd probably also take longer without vLLM's batching, but I could try it sometime.
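For reference, ggufs run directly with llama.cpp's bundled binaries, so one low-effort path is to point the eval harness at llama.cpp's OpenAI-compatible server instead of vLLM. A rough sketch of the invocations (the model filename here is illustrative, not an actual file from the repo):

```shell
# Serve the GGUF with llama.cpp's OpenAI-compatible HTTP server;
# -ngl 99 offloads all layers to the GPU, --port picks the listen port.
llama-server -m Qwen3.5-9B-Uncensored.Q8_0.gguf -ngl 99 --port 8080

# Or a quick one-off smoke test from the command line:
# -p is the prompt, -n the number of tokens to generate.
llama-cli -m Qwen3.5-9B-Uncensored.Q8_0.gguf -p "Hello" -n 32
```

llama-server does support continuous batching of parallel requests, though throughput will still generally trail vLLM's.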

If it helps, I have some safetensors here, along with my own benchmarks:

https://huggingface.co/collections/DreamFast/hauhaucs-safetensor-benchmarks

The Qwen 3 and 3.5 series of models were released with BF16/FP16 GGUFs, and I wrote my own tool to reverse them into safetensors format, which should be about as close as we can get to the proper safetensors.
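For anyone curious what such a conversion involves: since BF16/FP16 GGUFs aren't quantized, the lossy part is avoided and the main work is reading the tensors and renaming them from llama.cpp's conventions back to Hugging Face's. A minimal sketch of that renaming step, assuming the standard GGUF/HF naming schemes for Llama-style architectures (the function and table names are mine, and a real tool also needs the full tensor table plus any Q/K weight permutation llama.cpp applied at export time):

```python
import re

# Hypothetical subset of the llama.cpp GGUF -> HF safetensors name map.
GLOBAL_NAMES = {
    "token_embd.weight": "model.embed_tokens.weight",
    "output_norm.weight": "model.norm.weight",
    "output.weight": "lm_head.weight",
}
PER_LAYER_NAMES = {
    "attn_norm": "input_layernorm",
    "attn_q": "self_attn.q_proj",
    "attn_k": "self_attn.k_proj",
    "attn_v": "self_attn.v_proj",
    "attn_output": "self_attn.o_proj",
    "ffn_norm": "post_attention_layernorm",
    "ffn_gate": "mlp.gate_proj",
    "ffn_up": "mlp.up_proj",
    "ffn_down": "mlp.down_proj",
}

def gguf_to_hf_name(name: str) -> str:
    """Translate one GGUF tensor name to its HF safetensors equivalent."""
    if name in GLOBAL_NAMES:
        return GLOBAL_NAMES[name]
    # Per-layer tensors look like "blk.<layer>.<stem>.weight" in GGUF.
    m = re.fullmatch(r"blk\.(\d+)\.(\w+)\.(weight|bias)", name)
    if m:
        layer, stem, kind = m.groups()
        return f"model.layers.{layer}.{PER_LAYER_NAMES[stem]}.{kind}"
    raise KeyError(f"unmapped tensor name: {name}")
```

The actual tensor bytes can then be read with the `gguf` Python package's `GGUFReader` and written out with `safetensors`, keyed by the renamed entries.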

@DontPlanToEnd if you can use these to verify my benchmarks, that'd be cool! So far, the claims hauhaucs makes about his models don't seem to hold up: there is clear degradation compared to the base models, and also some refusals.

Also, I hadn't realised how difficult benchmarking is. It took me a week of constant GPU crunching and many re-runs to get these results.
