Commit History

docs: rename q8 file refs to model_quantized.* (matches transformers.js v4 _quantized suffix for dtype=q8)
3931df6
verified

shreyask committed on

rename q8 -> quantized to match transformers.js v4 dtype="q8" filename suffix _quantized
e93e30a
verified

shreyask committed on

docs: add q8 sharded variant + correct prompt format + temp guidance
8c95a57
verified

shreyask committed on

Reshard q8 external data into <2GB chunks (matches onnx-community pattern)
b8b6b42
verified

shreyask committed on

Add q8 (MatMulNBits 8-bit, block=128) variant alongside q4f16
f8ed441
verified

shreyask committed on

Re-quantize with block_size=128 + quantize Gather; reduces external data 2.83 GB -> 1.95 GB to fit browser 2 GB ceiling
a0bbe7b
verified

shreyask committed on

Initial q4f16 ONNX export of kenpath/svara-tts-v1
c97aa31
verified

shreyask committed on

initial commit
cfb0dce
verified

shreyask committed on