Please move the metal/ and original/ folders to a separate branch

#160
by mratsim - opened

They triple the size that needs to be downloaded.

I don't think there is a single use case where you would want to run inference on the main MXFP4 weights plus metal plus original.
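For anyone hitting this in the meantime, the extra folders can be skipped at download time: `huggingface_hub`'s `snapshot_download` accepts an `ignore_patterns` argument with glob patterns. The sketch below demonstrates the same glob filtering offline with `fnmatch` (the file names are hypothetical stand-ins for the repo layout, not the actual listing):

```python
from fnmatch import fnmatch

# Hypothetical file listing standing in for the repo layout described above.
repo_files = [
    "model-00001-of-00007.safetensors",
    "model-00002-of-00007.safetensors",
    "metal/model.bin",
    "original/model-00001-of-00007.safetensors",
    "config.json",
]

# Same glob syntax that snapshot_download's ignore_patterns argument uses,
# e.g. snapshot_download(repo_id=..., ignore_patterns=["metal/*", "original/*"])
ignore_patterns = ["metal/*", "original/*"]

def keep(path, patterns):
    """Return True if path matches none of the ignore patterns."""
    return not any(fnmatch(path, p) for p in patterns)

wanted = [f for f in repo_files if keep(f, ignore_patterns)]
print(wanted)
# -> only the top-level safetensors and config survive
```

This avoids pulling the metal/ and original/ copies entirely, at the cost of having to pass the patterns yourself until the folders move to a separate branch.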

literally this!!

Man, I thought I had to download ~65 GB to run this model. I followed NVIDIA's playbook guides,
and it downloads the mess and then some:

All the safetensors files -- 65 GB total
The Apple Metal bin file -- 65 GB total
Some more safetensors -- 6+ files, ~10 GB... I don't know what is going on, and these were NVIDIA's official TensorRT-LLM instructions. So stupid.. should have used llama.cpp
