shreyask/svara-tts-v1-ONNX

@@ -115,10 +115,10 @@ A complete worked example (React + Vite + WebGPU) lives at [`shreyaskarnik/svara
 └── onnx/
     ├── model_q4f16.onnx         (graph, ~1.3 MB)
     ├── model_q4f16.onnx_data    (weights, ~1.95 GB single file)
-    ├── model_q8.onnx            (graph, ~1.3 MB)
-    ├── model_q8.onnx_data       (weights chunk 1, ~1.99 GB)
-    ├── model_q8.onnx_data_1     (weights chunk 2, ~1.84 GB)
-    └── model_q8.onnx_data_2     (weights chunk 3, ~0.49 GB)
+    ├── model_quantized.onnx            (graph, ~1.3 MB)
+    ├── model_quantized.onnx_data       (weights chunk 1, ~1.99 GB)
+    ├── model_quantized.onnx_data_1     (weights chunk 2, ~1.84 GB)
+    └── model_quantized.onnx_data_2     (weights chunk 3, ~0.49 GB)
 ```
 q8 is sharded into <2 GB chunks to fit browser ArrayBuffer ceilings (matches the [`onnx-community/gpt-oss-20b-ONNX`](https://huggingface.co/onnx-community/gpt-oss-20b-ONNX) layout convention).
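
The naming and sizing convention in the listing above can be sketched as follows. This is a minimal illustration, not the exporter's actual logic: the helper names `external_data_files` and `min_chunks` are hypothetical, the sizes are the approximate figures from the listing, and the 2 GB ceiling is taken from the note above. Real exporters split on tensor boundaries, so `min_chunks` is only a lower bound.

```python
import math


def external_data_files(graph_name: str, num_chunks: int) -> list[str]:
    """List shard filenames for a graph, following the listing's convention:
    the first shard is `<graph>_data`, later shards are `<graph>_data_<i>`."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    base = f"{graph_name}_data"
    return [base] + [f"{base}_{i}" for i in range(1, num_chunks)]


def min_chunks(total_gb: float, ceiling_gb: float = 2.0) -> int:
    """Smallest number of shards such that each can stay strictly below
    ceiling_gb (assumed ceiling; actual splits follow tensor boundaries)."""
    return math.floor(total_gb / ceiling_gb) + 1


# Approximate q8 shard sizes from the listing, in GB.
q8_total_gb = 1.99 + 1.84 + 0.49

print(external_data_files("model_quantized.onnx", min_chunks(q8_total_gb)))
print(external_data_files("model_q4f16.onnx", min_chunks(1.95)))
```

With the listed sizes, the q8 weights need at least three shards while the q4f16 weights fit in a single file, which agrees with the layout shown in the diff.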