shreyask committed on
Commit 3931df6 · verified · 1 Parent(s): e93e30a

docs: rename q8 file refs to model_quantized.* (matches transformers.js v4 _quantized suffix for dtype=q8)

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -115,10 +115,10 @@ A complete worked example (React + Vite + WebGPU) lives at [`shreyaskarnik/svara
 └── onnx/
 ├── model_q4f16.onnx (graph, ~1.3 MB)
 ├── model_q4f16.onnx_data (weights, ~1.95 GB single file)
-├── model_q8.onnx (graph, ~1.3 MB)
-├── model_q8.onnx_data (weights chunk 1, ~1.99 GB)
-├── model_q8.onnx_data_1 (weights chunk 2, ~1.84 GB)
-└── model_q8.onnx_data_2 (weights chunk 3, ~0.49 GB)
+├── model_quantized.onnx (graph, ~1.3 MB)
+├── model_quantized.onnx_data (weights chunk 1, ~1.99 GB)
+├── model_quantized.onnx_data_1 (weights chunk 2, ~1.84 GB)
+└── model_quantized.onnx_data_2 (weights chunk 3, ~0.49 GB)
 ```
 
 q8 is sharded into <2 GB chunks to fit browser ArrayBuffer ceilings (matches the [`onnx-community/gpt-oss-20b-ONNX`](https://huggingface.co/onnx-community/gpt-oss-20b-ONNX) layout convention).
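
The naming convention in this change can be sketched as a small lookup. This is a minimal illustration, not code from the repository or from transformers.js: `expectedOnnxFiles` and `DTYPE_SUFFIX` are hypothetical names, and the two suffixes are taken only from the file listing in this diff (`q4f16` keeps its own name, `q8` uses `_quantized`).

```typescript
// Hypothetical helper (not part of transformers.js): lists the files the
// README expects under onnx/ for a given dtype and number of weight chunks.
type Dtype = "q4f16" | "q8";

// Suffixes as they appear in the file listing of this diff.
// Assumption: only these two dtypes are covered here.
const DTYPE_SUFFIX: Record<Dtype, string> = {
  q4f16: "_q4f16",
  q8: "_quantized",
};

function expectedOnnxFiles(dtype: Dtype, dataChunks: number): string[] {
  const graph = `model${DTYPE_SUFFIX[dtype]}.onnx`;
  const files: string[] = [graph];
  for (let i = 0; i < dataChunks; i++) {
    // The first weights chunk has no numeric suffix; later ones are _1, _2, ...
    files.push(i === 0 ? `${graph}_data` : `${graph}_data_${i}`);
  }
  return files.map((name) => `onnx/${name}`);
}

// q8 is split into three weight chunks in this repository.
console.log(expectedOnnxFiles("q8", 3));
```

Calling `expectedOnnxFiles("q4f16", 1)` in the same way yields the graph file plus the single `model_q4f16.onnx_data` file listed above.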