Qwen3-Coder-30B-A3B-Instruct-Q4_K_M-llamafile
This model was quantized to Q4_K_M and converted to GGUF format from Qwen/Qwen3-Coder-30B-A3B-Instruct using llama.cpp via ggml.ai's GGUF-my-repo space. The resulting GGUF was then packaged into a llamafile using llamafile v0.9.3 and the .args file present in this repository.
Refer to the original model card for more details on the model.
Refer to GGUF-my-repo for more details on the quantization (Q4_K_M) and conversion to GGUF.
Refer to llamafile for more details on the portable llamafile format, the sandboxed and performant server it provides, and how to use it.
Quickstart
Download llamafile and ensure it is executable
wget https://huggingface.co/chrisolof/Qwen3-Coder-30B-A3B-Instruct-Q4_K_M-llamafile/resolve/main/qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile
chmod +x qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile
Run command line chat interface
./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile
Run server
./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile --server --v2 --listen 127.0.0.1:8011
List the available v2 (LLaMAfiler) server flags with ./qwen3-coder-30b-a3b-instruct-q4_k_m.llamafile --server --v2 --help, and see the LLaMAfiler Documentation for details.
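Once the server is running, you can send it a chat request over HTTP. The sketch below assumes the server was started with --listen 127.0.0.1:8011 as shown above and that this llamafiler build exposes an OpenAI-compatible /v1/chat/completions endpoint (verify the endpoint and the "model" field it expects against the LLaMAfiler documentation for your version):

```shell
# Hedged example: query the local LLaMAfiler server's
# OpenAI-compatible chat endpoint. The endpoint path and the
# "model" value are assumptions -- check --help and the docs.
curl -s http://127.0.0.1:8011/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder-30b-a3b-instruct-q4_k_m",
    "messages": [
      {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    "temperature": 0.2
  }'
```

The response, if the endpoint matches, is a JSON object whose choices[0].message.content holds the model's reply; pipe through a JSON tool such as jq to extract it.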