This is not quantized; actually can't use nv4 on this model

by trohrbaugh - opened Mar 1

Mar 1

Bitsandbytes only quantizes nn.Linear modules. The fused expert tensors are raw nn.Parameter objects, so they get ignored and stay unquantized.
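A rough way to see how much of a model falls outside nn.Linear is to walk the module tree and count parameter elements on each side. The sketch below assumes PyTorch only; `ToyFusedExperts` and `split_param_counts` are hypothetical names made up for illustration, not part of bitsandbytes or of this model's code, and the toy shapes are arbitrary.

```python
import torch
from torch import nn


class ToyFusedExperts(nn.Module):
    """Hypothetical MoE block with expert weights fused into one nn.Parameter."""

    def __init__(self, num_experts: int = 4, hidden: int = 8, inner: int = 16):
        super().__init__()
        # Raw parameter: not wrapped in nn.Linear, so a quantizer that
        # only swaps nn.Linear modules would never visit it.
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, inner))
        # Ordinary linear layer: this is the kind of module that gets swapped.
        self.router = nn.Linear(hidden, num_experts, bias=False)


def split_param_counts(model: nn.Module) -> tuple[int, int]:
    """Return (elements in nn.Linear weights, elements in all other parameters)."""
    linear_weight_ids = set()
    linear_elems = 0
    for module in model.modules():
        if isinstance(module, nn.Linear):
            linear_weight_ids.add(id(module.weight))
            linear_elems += module.weight.numel()
    other_elems = sum(
        p.numel() for p in model.parameters() if id(p) not in linear_weight_ids
    )
    return linear_elems, other_elems


model = ToyFusedExperts()
linear_elems, other_elems = split_param_counts(model)
print(f"inside nn.Linear: {linear_elems}, outside nn.Linear: {other_elems}")
```

Running the same function on the real checkpoint (loaded however you normally load it) should show whether the bulk of the weights sit in raw parameters, which would explain why the file size barely changes after "quantizing".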
