# Exllama v2 Quantizations of WeirdCompound-v1.2-24b

Quantized using turboderp's ExLlamaV2 v0.3.2.

Each branch contains a quantization at an individual bits-per-weight setting; the `main` branch holds only the `measurement.json` needed for further conversions, so download one of the other branches for the model weights (see below).

Original model: FlareRebellion/WeirdCompound-v1.2-24b
## Available quantization sizes

*All size and VRAM requirements are estimated, theoretical values; your results may vary. Estimates were generated using 71cj34's LLMCalculator v2.0.2.*
| Branch | Bits | lm_head bits | Model Size (GB) | Description |
|---|---|---|---|---|
| 8.0_H8 | 8.0 | 8 | 22.6 GB | Max quality that ExLlamaV2 can produce, recommended. |
| 8.0_H6 | 8.0 | 6 | 22.4 GB | --- |
| 6.5_H8 | 6.5 | 8 | 18.7 GB | --- |
| 6.5_H6 | 6.5 | 6 | 18.5 GB | --- |
| 5.0_H6 | 5.0 | 6 | 14.7 GB | --- |
| 4.25_H6 | 4.25 | 6 | 12.8 GB | GPTQ equivalent bits per weight. |
| 3.5_H6 | 3.5 | 6 | 10.8 GB | --- |
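As a rough sanity check on the sizes above, the file size of a quantized model scales approximately linearly with bits per weight. A minimal sketch, assuming a round ~24 billion parameters for this 24b model (the separate lm_head precision and format overhead make the real figures in the table differ slightly):

```python
def estimate_size_gib(n_params: float, bpw: float) -> float:
    """Rough model file size in GiB: parameters * bits-per-weight / 8 bits per byte."""
    return n_params * bpw / 8 / 1024**3

# ~24e9 parameters is an assumed round figure for a "24b" model.
print(round(estimate_size_gib(24e9, 6.5), 1))  # ~18.2 GiB, near the 18.7 GB listed above
```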
## Download instructions

With git (and git-lfs installed):

```shell
git clone --single-branch --branch 6.5_H6 https://huggingface.co/DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 FlareRebellion-WeirdCompound-v1.2-24b-exl2
```
With the Hugging Face Hub CLI:

```shell
pip install -U "huggingface-hub[cli]"
```

To download a specific branch, use the `--revision` parameter. For example, to download the 6.5 bpw H6 branch:

Linux:

```shell
huggingface-cli download DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 --revision 6.5_H6 --local-dir FlareRebellion-WeirdCompound-v1.2-24b-exl2_6.5_H6
```

Windows (which sometimes rejects `_` in folder names):

```shell
huggingface-cli download DakksNessik/FlareRebellion-WeirdCompound-v1.2-24b-exl2 --revision 6.5_H6 --local-dir FlareRebellion-WeirdCompound-v1.2-24b-exl2-6.5_H6
```