Works well on RTX 3080 Ti (12 GB VRAM)
I have just tested the model on my RTX 3080 Ti with 12 GB VRAM and it works well. It is really impressive compared to the usual voice assistants, which are not full duplex and don't give the same natural-language feeling.
The only thing that did not work was loading the pre-quantized weights (model_bnb_4bit.pt).
I even tried downloading the quantization file via "huggingface-cli download", without success.
Appreciate the feedback, I'll look into it and update the repo with a fix soon.
I tried to use the .pt file but got this error:
RuntimeError: Error(s) in loading state_dict for LMModel:
size mismatch for transformer.layers.0.self_attn.in_proj.weight: copying a param with shape torch.Size([25165824, 1]) from checkpoint, the shape in current model is torch.Size([12288, 4096]).
size mismatch for transformer.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([8388608, 1]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for transformer.layers.0.gating.linear_in.weight: copying a param with shape torch.Size([46137344, 1]) from checkpoint, the shape in current model is torch.Size([22528, 4096]).
size mismatch for transformer.layers.0.gating.linear_out.weight: copying a param with shape torch.Size([23068672, 1]) from checkpoint, the shape in current model is torch.Size([4096, 11264]).
[... the same four size mismatches repeat for transformer.layers.1 through transformer.layers.31 ...]
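For context on why the shapes in the log look this way: bitsandbytes packs two 4-bit values per byte and stores each quantized weight as a flat uint8 column vector, so every packed tensor in the checkpoint has exactly half as many entries as the bf16 weight has elements. A quick sanity check of the shapes from the error log (pure arithmetic, independent of the model code):

```python
# Each bf16 weight of shape (out, in) is packed by bitsandbytes 4-bit
# quantization into a uint8 tensor of shape (out * in // 2, 1):
# two 4-bit values per byte.
shapes = {
    "self_attn.in_proj":  ((12288, 4096), 25165824),
    "self_attn.out_proj": ((4096, 4096),  8388608),
    "gating.linear_in":   ((22528, 4096), 46137344),
    "gating.linear_out":  ((4096, 11264), 23068672),
}

for name, ((rows, cols), packed) in shapes.items():
    # packed byte count is exactly half the bf16 element count
    assert rows * cols // 2 == packed, name
    print(f"{name}: {rows}x{cols} = {rows * cols} elements -> {packed} packed bytes")
```

Every mismatch in the log fits this pattern, which is how you can tell the checkpoint is a packed 4-bit export rather than a corrupted download.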
Hey @gozzima and @moeinsadeghi, this should be fixed now.
The issue was that the loader didn't have a code path for pre-quantized bitsandbytes checkpoints. When you passed the .pt file, it tried to load the packed 4-bit weights (shape [25165824, 1]) directly into a model expecting bf16 shapes ([12288, 4096]), hence the size-mismatch errors.
The moshi/ source in this repo has been updated with a fix. The loader now auto-detects pre-quantized checkpoints and reconstructs the 4-bit weights properly without re-quantizing.
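This kind of auto-detection can be done by inspecting the tensor layout in the state dict before calling load_state_dict. A minimal sketch of the idea; the probe key and the exact heuristic here are illustrative assumptions, not the actual repo code:

```python
import torch

def is_prequantized(state_dict):
    """Heuristic check for a bitsandbytes 4-bit checkpoint.

    Assumption (illustrative, not the repo's real logic): packed 4-bit
    weights are stored as uint8 column vectors of shape [n_elements // 2, 1],
    while an unquantized checkpoint stores full 2-D float tensors.
    """
    # Probe one known weight; any layer would do.
    w = state_dict.get("transformer.layers.0.self_attn.in_proj.weight")
    if w is None:
        return False
    return w.dtype == torch.uint8 and w.dim() == 2 and w.shape[1] == 1
```

A loader using a check like this can branch between "reconstruct the 4-bit layers from the packed data" and "load bf16 then quantize", which is why no extra format flag is needed in the checkpoint itself.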
To use pre-quantized weights:
Re-clone or pull the latest moshi/ source from this repo:
git clone https://huggingface.co/brianmatzelle/personaplex-7b-v1-bnb-4bit
cd personaplex-7b-v1-bnb-4bit
pip install moshi/.
Run with the pre-quantized .pt file:
python -m moshi.offline \
--moshi-weight model_bnb_4bit.pt \
--quantize-4bit \
--voice-prompt "NATF2.pt" \
--input-wav "assets/test/input_assistant.wav" \
--output-wav "output.wav" \
--output-text "output.json"
Or for the live server:
SSL_DIR=$(mktemp -d)
python -m moshi.server --ssl "$SSL_DIR" --quantize-4bit --moshi-weight model_bnb_4bit.pt
The key is passing both --moshi-weight model_bnb_4bit.pt and --quantize-4bit together. The loader detects the bitsandbytes metadata in the checkpoint and skips re-quantization automatically.
Let me know if you run into any other issues!