This is not quantized: nv4 can't actually be used on this model
#1
by trohrbaugh - opened
Bitsandbytes only quantizes nn.Linear modules. The fused expert tensors are raw nn.Parameter objects, so they get ignored and stay unquantized.
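A rough way to see the effect, as a sketch only: the snippet below does not call bitsandbytes at all. It imitates a module-type check (`isinstance(module, nn.Linear)`) on a toy model and lists which parameters such a check would reach. `FusedExperts`, `gate_up_proj`, `router`, and `split_by_quantizability` are hypothetical names made up for illustration, not taken from this model's code.

```python
import torch
from torch import nn


class FusedExperts(nn.Module):
    """Hypothetical MoE block: all expert weights stacked in one raw parameter."""

    def __init__(self, num_experts: int = 4, hidden: int = 8, inner: int = 16):
        super().__init__()
        # A fused 3-D tensor held as a bare nn.Parameter, not inside an nn.Linear.
        self.gate_up_proj = nn.Parameter(torch.randn(num_experts, hidden, inner))
        # An ordinary linear layer, for contrast.
        self.router = nn.Linear(hidden, num_experts, bias=False)


def split_by_quantizability(model: nn.Module):
    """Return (parameters owned by nn.Linear modules, all other parameters)."""
    linear_owned = set()
    for mod_name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            # Only parameters held directly by a Linear module are reachable.
            for p_name, _ in module.named_parameters(recurse=False):
                linear_owned.add(f"{mod_name}.{p_name}" if mod_name else p_name)
    all_names = [name for name, _ in model.named_parameters()]
    covered = [n for n in all_names if n in linear_owned]
    skipped = [n for n in all_names if n not in linear_owned]
    return covered, skipped


model = FusedExperts()
covered, skipped = split_by_quantizability(model)
print("covered:", covered)
print("skipped:", skipped)
```

Running the same kind of check on the loaded checkpoint (and looking at the dtype of each skipped parameter) should show whether the expert weights were left in their original precision.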