Reza2kn/mega-asr-bench

Commit History

Clean up debug diagnostics now that WebGPU works end-to-end

2cf4acc
verified

Reza2kn committed about 10 hours ago

Use INT4 encoder (MatMulNBits — WebGPU-supported); ORT 1.23 keeps webgpu fp16 fix for decoders

b8c2d24
verified

Reza2kn committed about 10 hours ago

Per-step diagnostics: pinpoint which ORT call crashes

7be4ffd
verified

Reza2kn committed about 10 hours ago

Surface full error info on transcribe failure

50bb779
verified

Reza2kn committed about 10 hours ago

Upgrade onnxruntime-web 1.20 → 1.23 (webgpu bundle); revert WASM forcing — 1.23 fixes fp16 transformer NaNs

e6b3306
verified

Reza2kn committed about 10 hours ago

Force decoders to WASM (WebGPU fp16 in ort-web 1.20 returns all-NaN on this transformer)

d62ebc2
verified

Reza2kn committed about 10 hours ago

Diagnostics: logit type/len/health/first-8/top-5 with Number() coercion

29d0bb3
verified

Reza2kn committed about 10 hours ago

Await tensor.getData() for WebGPU outputs (audio_embeds + logits) so data is actually copied back to CPU

1320aec
verified

Reza2kn committed about 10 hours ago

fp16: use canonical u16 bit-pattern viewed as Float16Array; diagnostic top-5 dump

ed37d13
verified

Reza2kn committed about 10 hours ago

Encoder: prefer static INT8 (QLinearConv); INT4 fallback. Recovers 92.7%-class quality in browser

9e54b79
verified

Reza2kn committed about 13 hours ago

Float16Array for fp16 tensors; per-session WebGPU fallback (decoders stay on webgpu)

af482b4
verified

Reza2kn committed about 13 hours ago

Encoder: try INT8 first, auto-fallback to INT4 if ConvInteger unsupported in browser

a821180
verified

Reza2kn committed about 13 hours ago

Use ort.webgpu bundle + auto-fallback to wasm if webgpu init fails

96f08e1
verified

Reza2kn committed about 13 hours ago

Bump cache key to invalidate RTN weights (GPTQ ship)

36417e8
verified

Reza2kn committed about 15 hours ago

Use INT8 encoder + INT4 decoder (91.9% accuracy); force-English prompt default

61dfe9b
verified

Reza2kn committed about 18 hours ago

Switch Space to static SDK: pure browser inference via onnxruntime-web

a4c397e
verified

Reza2kn committed about 18 hours ago

Remove vendor/ (switching to static SDK)

ffd2ab1
verified

Reza2kn committed about 18 hours ago

Remove requirements.txt (switching to static SDK)

0dfc823
verified

Reza2kn committed about 18 hours ago

Remove app.py (switching to static SDK)

da91cf8
verified

Reza2kn committed about 18 hours ago

Switch backend to INT4 ONNX models from Reza2kn/mega-asr-onnx

888006f
verified

Reza2kn committed about 18 hours ago

Initial: Gradio demo + 8 VITW examples + WER scoring

0c137e3
verified

Reza2kn committed about 19 hours ago

initial commit

7feced8
verified

Reza2kn committed about 19 hours ago
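
Several of the commits above deal with reading fp16 outputs as raw 16-bit patterns (ed37d13, af482b4) and with checking logits for NaN (29d0bb3). As a rough illustration only, and not the Space's actual code, the sketch below decodes IEEE 754 half-precision bit patterns held in a Uint16Array into ordinary numbers, which may help in environments where Float16Array is unavailable. The names halfBitsToNumber, decodeHalfArray and countNaN are hypothetical.

```typescript
// Hypothetical helper: decode one IEEE 754 binary16 value given as a
// 16-bit unsigned integer (1 sign bit, 5 exponent bits, 10 fraction bits).
function halfBitsToNumber(bits: number): number {
  const sign = (bits & 0x8000) !== 0 ? -1 : 1;
  const exponent = (bits >> 10) & 0x1f;
  const fraction = bits & 0x03ff;

  if (exponent === 0) {
    // Zero or subnormal: no implicit leading 1, fixed scale of 2^-24.
    return sign * fraction * 2 ** -24;
  }
  if (exponent === 0x1f) {
    // All-ones exponent encodes infinity (fraction 0) or NaN.
    return fraction === 0 ? sign * Infinity : NaN;
  }
  // Normal number: implicit leading 1, exponent bias of 15.
  return sign * (1 + fraction / 1024) * 2 ** (exponent - 15);
}

// Hypothetical helper: decode a whole buffer of fp16 bit patterns.
function decodeHalfArray(raw: Uint16Array): Float32Array {
  const out = new Float32Array(raw.length);
  for (let i = 0; i < raw.length; i++) {
    out[i] = halfBitsToNumber(raw[i]);
  }
  return out;
}

// Hypothetical health check: count NaN entries in decoded values.
function countNaN(values: Float32Array): number {
  let n = 0;
  for (let i = 0; i < values.length; i++) {
    if (Number.isNaN(values[i])) n++;
  }
  return n;
}
```

A check such as countNaN(decodeHalfArray(raw)) === raw.length would flag the all-NaN symptom that commit d62ebc2 describes, assuming the logits arrive as a Uint16Array of fp16 bit patterns.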
