Clean up debug diagnostics now that WebGPU works end-to-end 2cf4acc verified Reza2kn committed on about 10 hours ago
Use INT4 encoder (MatMulNBits, WebGPU-supported); ORT 1.23 keeps webgpu fp16 fix for decoders b8c2d24 verified Reza2kn committed on about 10 hours ago
Per-step diagnostics: pinpoint which ORT call crashes 7be4ffd verified Reza2kn committed on about 10 hours ago
Surface full error info on transcribe failure 50bb779 verified Reza2kn committed on about 10 hours ago
Upgrade onnxruntime-web 1.20 → 1.23 (webgpu bundle); revert WASM forcing; 1.23 fixes fp16 transformer NaNs e6b3306 verified Reza2kn committed on about 10 hours ago
Force decoders to WASM (WebGPU fp16 in ort-web 1.20 returns all-NaN on this transformer) d62ebc2 verified Reza2kn committed on about 10 hours ago
Diagnostics: logit type/len/health/first-8/top-5 with Number() coercion 29d0bb3 verified Reza2kn committed on about 10 hours ago
Await tensor.getData() for WebGPU outputs (audio_embeds + logits) so data is actually copied back to CPU 1320aec verified Reza2kn committed on about 10 hours ago
fp16: use canonical u16 bit-pattern viewed as Float16Array; diagnostic top-5 dump ed37d13 verified Reza2kn committed on about 10 hours ago
Encoder: prefer static INT8 (QLinearConv); INT4 fallback. Recovers 92.7%-class quality in browser 9e54b79 verified Reza2kn committed on about 13 hours ago
Float16Array for fp16 tensors; per-session WebGPU fallback (decoders stay on webgpu) af482b4 verified Reza2kn committed on about 13 hours ago
Encoder: try INT8 first, auto-fallback to INT4 if ConvInteger unsupported in browser a821180 verified Reza2kn committed on about 13 hours ago
Use ort.webgpu bundle + auto-fallback to wasm if webgpu init fails 96f08e1 verified Reza2kn committed on about 13 hours ago
Bump cache key to invalidate RTN weights (GPTQ ship) 36417e8 verified Reza2kn committed on about 15 hours ago
Use INT8 encoder + INT4 decoder (91.9% accuracy); force-English prompt default 61dfe9b verified Reza2kn committed on about 18 hours ago
Switch Space to static SDK: pure browser inference via onnxruntime-web a4c397e verified Reza2kn committed on about 18 hours ago
Remove requirements.txt (switching to static SDK) 0dfc823 verified Reza2kn committed on about 18 hours ago
Switch backend to INT4 ONNX models from Reza2kn/mega-asr-onnx 888006f verified Reza2kn committed on about 18 hours ago
Initial: Gradio demo + 8 VITW examples + WER scoring 0c137e3 verified Reza2kn committed on about 19 hours ago
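
Note on ed37d13 ("canonical u16 bit-pattern"): the sketch below illustrates what encoding fp16 values as 16-bit patterns can look like. It is not the code from that commit. The helper names float32ToFloat16Bits and toFloat16Bits are hypothetical, and the rounding mode (round to nearest, ties to even) is an assumption. It packs each float32 into its IEEE 754 binary16 bit pattern and stores the results in a Uint16Array, which is the usual carrier for fp16 data where a native Float16Array is unavailable.

```typescript
// Hypothetical helper: convert one number to its IEEE 754 binary16 bit pattern.
// Assumes round-to-nearest, ties-to-even; the commit itself may do this differently.
function float32ToFloat16Bits(value: number): number {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value);            // round the JS double to float32 first
  const x = view.getUint32(0);          // reinterpret the same 4 bytes as an integer

  const sign = (x >>> 16) & 0x8000;     // sign bit moved to bit 15
  const exp = (x >>> 23) & 0xff;        // 8-bit biased float32 exponent
  const mant = x & 0x7fffff;            // 23-bit float32 mantissa

  if (exp === 0xff) {
    // Infinity keeps a zero mantissa; NaN becomes a quiet NaN.
    return sign | 0x7c00 | (mant !== 0 ? 0x200 : 0);
  }

  const e = exp - 127 + 15;             // re-bias the exponent for binary16
  if (e >= 0x1f) return sign | 0x7c00;  // too large: overflow to Infinity

  if (e <= 0) {
    if (e < -10) return sign;           // too small even for a subnormal: signed zero
    const m = mant | 0x800000;          // restore the implicit leading 1
    const shift = 14 - e;               // 14..24 bits are dropped
    let sub = m >>> shift;
    const rem = m & ((1 << shift) - 1);
    const halfway = 1 << (shift - 1);
    if (rem > halfway || (rem === halfway && (sub & 1) === 1)) sub++;
    return sign | sub;                  // a carry here correctly yields the smallest normal
  }

  let half = (e << 10) | (mant >>> 13); // keep the top 10 mantissa bits
  const rem = mant & 0x1fff;            // the 13 bits being dropped
  if (rem > 0x1000 || (rem === 0x1000 && (half & 1) === 1)) half++;
  return sign | half;                   // a carry may bump the exponent, up to Infinity
}

// Hypothetical helper: pack a whole float32 buffer into fp16 bit patterns.
function toFloat16Bits(values: Float32Array): Uint16Array {
  const out = new Uint16Array(values.length);
  for (let i = 0; i < values.length; i++) {
    out[i] = float32ToFloat16Bits(values[i]);
  }
  return out;
}

const packed = toFloat16Bits(new Float32Array([1, -2, 0.5, 65504]));
console.log(Array.from(packed, (b) => b.toString(16)));
```

The resulting Uint16Array holds the raw fp16 bits. Whether a runtime accepts that buffer directly or needs a Float16Array view over it, as the commit message describes, depends on the runtime version and is not shown here.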