this is not quantized... actually can't use nv4 on this model

#1
by trohrbaugh - opened

Bitsandbytes only quantizes nn.Linear modules. The fused expert tensors are raw nn.Parameter objects, which get ignored.
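A rough sketch of why this happens, using a toy model rather than the actual checkpoint. The class `FusedExperts`, its attribute names, and both helper functions are hypothetical and only for illustration. The code does not call bitsandbytes at all; it just mimics a scan that selects modules by type, which is assumed here to be how the quantizer picks its targets.

```python
import torch
import torch.nn as nn


class FusedExperts(nn.Module):
    """Hypothetical MoE block (not the real model's architecture)."""

    def __init__(self, num_experts=4, hidden=8, inner=16):
        super().__init__()
        # Fused expert weights stored as a bare nn.Parameter, not an nn.Linear.
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, inner))
        # An ordinary router layer, which is an nn.Linear.
        self.router = nn.Linear(hidden, num_experts)


def linear_module_names(model):
    # Names of the submodules a type-based scan for nn.Linear would find.
    return [name for name, m in model.named_modules() if isinstance(m, nn.Linear)]


def params_outside_linear(model):
    # Parameters that are not owned directly by any nn.Linear module.
    linear_param_ids = {
        id(p)
        for m in model.modules()
        if isinstance(m, nn.Linear)
        for p in m.parameters(recurse=False)
    }
    return [n for n, p in model.named_parameters() if id(p) not in linear_param_ids]


model = FusedExperts()
linear_names = linear_module_names(model)
skipped = params_outside_linear(model)
print("found by an nn.Linear scan:", linear_names)
print("never visited by that scan:", skipped)
```

In this toy case only the router would be picked up, while the fused expert tensor is left in its original dtype. Running the same two helpers on the loaded model should show which weights were actually left unquantized.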
