Self-Supervised Speech Models Encode Phonetic Context via Position-dependent Orthogonal Subspaces • Paper • 2603.12642 • Published Mar 13 • 1
[b]=[d]-[t]+[p]: Self-supervised Speech Models Discover Phonological Vector Arithmetic • Paper • 2602.18899 • Published Mar 12 • 1
BioME: A Resource-Efficient Bioacoustic Foundational Model for IoT Applications • Paper • 2602.09970 • Published Feb 10 • 1
POWSM: A Phonetic Open Whisper-Style Speech Foundation Model • Paper • 2510.24992 • Published Oct 28, 2025 • 4
Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects • Paper • 2601.07274 • Published Jan 12 • 1
OpenBEATs • Collection • Checkpoints for the WASPAA 2025 paper "OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder" • 87 items • Updated Mar 2 • 5
OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder • Paper • 2507.14129 • Published Jul 18, 2025 • 11
IndicGenBench • Collection • Datasets released in "IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs" (https://arxiv.org/abs/2404.16816) • 4 items • Updated Mar 12 • 13
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages • Paper • 2404.16816 • Published Apr 25, 2024 • 3
LLaMA-Omni: Seamless Speech Interaction with Large Language Models • Paper • 2409.06666 • Published Sep 10, 2024 • 60