Exllama v2 Quantizations of WeirdCompound-v1.2-24b

Using turboderp's ExLlamaV2 v0.3.2 for quantization.

Each branch contains the model quantized at a different bits-per-weight setting. The "main" branch holds only the measurement.json (useful for further conversions), so download one of the other branches to get the model (see below).

Original model: FlareRebellion/WeirdCompound-v1.2-24b

Available Quantization sizes

*All size and VRAM requirements are estimated, theoretical values; your results may vary. Estimates were generated with 71cj34's LLMCalculator v2.0.2.*

| Branch | Bits | lm_head bits | Model size (GB) | Description |
| --- | --- | --- | --- | --- |
| 8.0_H8 | 8.0 | 8 | 22.6 | Max quality that ExLlamaV2 can produce; recommended. |
| 8.0_H6 | 8.0 | 6 | 22.4 | |
| 6.5_H8 | 6.5 | 8 | 18.7 | |
| 6.5_H6 | 6.5 | 6 | 18.5 | |
| 5.0_H6 | 5.0 | 6 | 14.7 | |
| 4.25_H6 | 4.25 | 6 | 12.8 | GPTQ-equivalent bits per weight. |
| 3.5_H6 | 3.5 | 6 | 10.8 | |
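For a quick sanity check of the sizes above, file size scales roughly as parameters × bpw / 8. This sketch (which ignores lm_head bits and format overhead, so it lands a little above the listed figures) is an assumption-laden ballpark, not how the estimates in the table were produced:

```python
def est_size_gb(n_params: float, bpw: float) -> float:
    # bytes = parameters * bits-per-weight / 8; divide by 1e9 for GB
    return n_params * bpw / 8 / 1e9

# e.g. a 24B-parameter model at 6.5 bpw:
print(round(est_size_gb(24e9, 6.5), 1))  # ~19.5 GB, vs. 18.7 GB in the table
```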

Download instructions

With git (and git-lfs) installed:

```shell
git clone --single-branch --branch 6.5_H6 https://huggingface.co/DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 FlareRebellion-WeirdCompound-v1.2-24b-exl2
```

With the Hugging Face Hub CLI, first install it:

```shell
pip install -U "huggingface-hub[cli]"
```

To download a specific branch, use the --revision parameter. For example, to download the 6.5 bpw H6 branch:

Linux:

```shell
huggingface-cli download DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 --revision 6.5_H6 --local-dir FlareRebellion-WeirdCompound-v1.2-24b-exl2_6.5_H6
```

Windows (which sometimes has trouble with `_` in folder names):

```shell
huggingface-cli download DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 --revision 6.5_H6 --local-dir FlareRebellion-WeirdCompound-v1.2-24b-exl2-6.5_H6
```
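The same download can be scripted from Python via `huggingface_hub`. A minimal sketch (the local folder name here is an arbitrary choice, not part of the repo):

```python
REPO_ID = "DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2"
BRANCH = "6.5_H6"  # any branch name from the table above

def download(repo_id: str = REPO_ID, revision: str = BRANCH) -> str:
    # Import inside the function so huggingface_hub is only required
    # when a download is actually triggered.
    from huggingface_hub import snapshot_download
    # snapshot_download fetches every file of the given branch (revision)
    # into local_dir and returns the local path.
    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        local_dir=f"{repo_id.split('/')[-1]}_{revision}",
    )

if __name__ == "__main__":
    print(download())
```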

