Quantized version of black-forest-labs/FLUX.2-klein-9B made using sd.cpp.
Brain-dead mixed quantization: each tensor's quantization type is selected purely from its number of elements.
Finding the distribution of tensor sizes:

```python
from safetensors.torch import load_file

k = load_file("flux-2-klein-9b.safetensors")
keys = list(k.keys())
distrib = []
for key in keys:
    n = k[key].nelement()  # number of elements in this tensor
    if n not in distrib:
        distrib.append(n)
distrib.sort()
print(distrib)
```
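The same pass can also tally how many tensors share each size, which makes the distribution easier to read. A minimal sketch on a stand-in dict of element counts (the names and values below are illustrative, not the full checkpoint):

```python
from collections import Counter

# Stand-in for {tensor_name: nelement} read from the checkpoint
# (names and sizes are illustrative, not the real key list).
sizes = {
    "double_blocks.0.img_attn.qkv.weight": 33554432,
    "double_blocks.0.img_mlp.0.weight": 100663296,
    "single_blocks.0.linear1.weight": 100663296,
    "final_layer.linear.weight": 1048576,
}

# Counter gives both the distinct sizes and how many tensors share each one.
distrib = Counter(sizes.values())
for n, count in sorted(distrib.items()):
    print(f"{n:>12} elements: {count} tensor(s)")
```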
Making the `--tensor-type-rules` parameter for sd-cli.exe:

```python
# Map element count (as string) -> quantization type.
repart = {}
repart["524288"] = "q4_K"
repart["1048576"] = "q4_K"
repart["16777216"] = "q3_K"
repart["33554432"] = "q3_K"
repart["50331648"] = "q3_K"
repart["100663296"] = "q2_K"
repart["150994944"] = "q2_K"

tensor_type_rules = []
for key in keys:
    n = str(k[key].nelement())
    if n in repart:
        tensor_type_rules.append(f"{key}={repart[n]}")
print(",".join(tensor_type_rules))
```
Resulting command:

```shell
sd-cli.exe --mode convert --model "flux-2-klein-9b.safetensors" --output "flux-2-klein-9b-Q3_K_M.gguf" --tensor-type-rules "double_blocks.0.img_attn.proj.weight=q3_K,double_blocks.0.img_attn.qkv.weight=q3_K,double_blocks.0.img_mlp.0.weight=q2_K,double_blocks.0.img_mlp.2.weight=q3_K,double_blocks.0.txt_attn.proj.weight=q3_K,double_blocks.0.txt_attn.qkv.weight=q3_K,double_blocks.0.txt_mlp.0.weight=q2_K,double_blocks.0.txt_mlp.2.weight=q3_K,double_blocks.1.img_attn.proj.weight=q3_K,double_blocks.1.img_attn.qkv.weight=q3_K,double_blocks.1.img_mlp.0.weight=q2_K,double_blocks.1.img_mlp.2.weight=q3_K,double_blocks.1.txt_attn.proj.weight=q3_K,double_blocks.1.txt_attn.qkv.weight=q3_K,double_blocks.1.txt_mlp.0.weight=q2_K,double_blocks.1.txt_mlp.2.weight=q3_K,double_blocks.2.img_attn.proj.weight=q3_K,double_blocks.2.img_attn.qkv.weight=q3_K,double_blocks.2.img_mlp.0.weight=q2_K,double_blocks.2.img_mlp.2.weight=q3_K,double_blocks.2.txt_attn.proj.weight=q3_K,double_blocks.2.txt_attn.qkv.weight=q3_K,double_blocks.2.txt_mlp.0.weight=q2_K,double_blocks.2.txt_mlp.2.weight=q3_K,double_blocks.3.img_attn.proj.weight=q3_K,double_blocks.3.img_attn.qkv.weight=q3_K,double_blocks.3.img_mlp.0.weight=q2_K,double_blocks.3.img_mlp.2.weight=q3_K,double_blocks.3.txt_attn.proj.weight=q3_K,double_blocks.3.txt_attn.qkv.weight=q3_K,double_blocks.3.txt_mlp.0.weight=q2_K,double_blocks.3.txt_mlp.2.weight=q3_K,double_blocks.4.img_attn.proj.weight=q3_K,double_blocks.4.img_attn.qkv.weight=q3_K,double_blocks.4.img_mlp.0.weight=q2_K,double_blocks.4.img_mlp.2.weight=q3_K,double_blocks.4.txt_attn.proj.weight=q3_K,double_blocks.4.txt_attn.qkv.weight=q3_K,double_blocks.4.txt_mlp.0.weight=q2_K,double_blocks.4.txt_mlp.2.weight=q3_K,double_blocks.5.img_attn.proj.weight=q3_K,double_blocks.5.img_attn.qkv.weight=q3_K,double_blocks.5.img_mlp.0.weight=q2_K,double_blocks.5.img_mlp.2.weight=q3_K,double_blocks.5.txt_attn.proj.weight=q3_K,double_blocks.5.txt_attn.qkv.weight=q3_K,double_blocks.5.txt_mlp.0.weight=q2_K,double_blocks.5.txt_mlp.2.weight=q3_K,double_blocks.6.img_attn.proj.weight=q3_K,double_blocks.6.img_attn.qkv.weight=q3_K,double_blocks.6.img_mlp.0.weight=q2_K,double_blocks.6.img_mlp.2.weight=q3_K,double_blocks.6.txt_attn.proj.weight=q3_K,double_blocks.6.txt_attn.qkv.weight=q3_K,double_blocks.6.txt_mlp.0.weight=q2_K,double_blocks.6.txt_mlp.2.weight=q3_K,double_blocks.7.img_attn.proj.weight=q3_K,double_blocks.7.img_attn.qkv.weight=q3_K,double_blocks.7.img_mlp.0.weight=q2_K,double_blocks.7.img_mlp.2.weight=q3_K,double_blocks.7.txt_attn.proj.weight=q3_K,double_blocks.7.txt_attn.qkv.weight=q3_K,double_blocks.7.txt_mlp.0.weight=q2_K,double_blocks.7.txt_mlp.2.weight=q3_K,double_stream_modulation_img.lin.weight=q2_K,double_stream_modulation_txt.lin.weight=q2_K,final_layer.adaLN_modulation.1.weight=q3_K,final_layer.linear.weight=q4_K,img_in.weight=q4_K,single_blocks.0.linear1.weight=q2_K,single_blocks.1.linear1.weight=q2_K,single_blocks.10.linear1.weight=q2_K,single_blocks.11.linear1.weight=q2_K,single_blocks.12.linear1.weight=q2_K,single_blocks.13.linear1.weight=q2_K,single_blocks.14.linear1.weight=q2_K,single_blocks.15.linear1.weight=q2_K,single_blocks.16.linear1.weight=q2_K,single_blocks.17.linear1.weight=q2_K,single_blocks.18.linear1.weight=q2_K,single_blocks.19.linear1.weight=q2_K,single_blocks.2.linear1.weight=q2_K,single_blocks.20.linear1.weight=q2_K,single_blocks.21.linear1.weight=q2_K,single_blocks.22.linear1.weight=q2_K,single_blocks.23.linear1.weight=q2_K,single_blocks.3.linear1.weight=q2_K,single_blocks.4.linear1.weight=q2_K,single_blocks.5.linear1.weight=q2_K,single_blocks.6.linear1.weight=q2_K,single_blocks.7.linear1.weight=q2_K,single_blocks.8.linear1.weight=q2_K,single_blocks.9.linear1.weight=q2_K,single_stream_modulation.lin.weight=q3_K,time_in.in_layer.weight=q4_K,time_in.out_layer.weight=q3_K,txt_in.weight=q3_K"
```
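To sanity-check what this buys, the approximate bits-per-weight of llama.cpp-style K-quants (roughly 2.625 bpw for q2_K, 3.4375 for q3_K, 4.5 for q4_K, assuming 256-element super-blocks) can be used to estimate per-tensor storage. These figures are approximations, not exact GGUF file sizes:

```python
# Approximate bits per weight for llama.cpp-style K-quants
# (256-element super-blocks; rough estimates, not exact file sizes).
BPW = {"q2_K": 2.625, "q3_K": 3.4375, "q4_K": 4.5, "f16": 16.0}

def estimate_bytes(nelement: int, qtype: str) -> float:
    """Rough storage estimate for one tensor at the given quant type."""
    return nelement * BPW[qtype] / 8

# Illustrative: one large MLP tensor at the types used above.
n = 100663296
for qtype in ("f16", "q4_K", "q3_K", "q2_K"):
    mb = estimate_bytes(n, qtype) / (1024 * 1024)
    print(f"{qtype:>5}: ~{mb:.1f} MiB")
```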
Model tree for n-Arno/FLUX.2-klein-base-9B-GGUF:
- Base model: black-forest-labs/FLUX.2-klein-9B