RuntimeError: The size of tensor a (3072) must match the size of tensor b (6144) at non-singleton dimension 1
#5
by lianyouzao - opened
You must launch sglang with --disable-shared-experts-fusion. Otherwise it incorrectly attempts to fuse the BF16 shared expert, which leads to the size mismatch in the error above.
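
For anyone unsure where the flag goes, here is a minimal sketch of assembling the launch command. It assumes the server is started through the usual `python -m sglang.launch_server` entry point with `--model-path`; the model path and the `build_launch_command` helper are hypothetical placeholders, so substitute your own checkpoint and any other arguments you already pass.

```python
import shlex

# Hypothetical placeholder: replace with the local path or repo id of your checkpoint.
MODEL_PATH = "/path/to/model"


def build_launch_command(model_path, extra_args=()):
    """Return the argv list for launching sglang with shared-experts fusion disabled.

    Illustrative helper, not part of sglang itself.
    """
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        # The flag from the answer above: skip fusing the BF16 shared expert.
        "--disable-shared-experts-fusion",
    ]
    cmd.extend(extra_args)  # any further flags you already use, passed through unchanged
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so you can inspect it first.
    print(shlex.join(build_launch_command(MODEL_PATH)))
```

Running the script only prints the command; paste the result into your shell (or pass the list to `subprocess.run`) once the model path is correct.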