
Ghosh

Sreyan88

AI & ML interests

None yet

Recent Activity



upvoted a paper 1 day ago

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published 2 days ago • 20

submitted a paper to Daily Papers 1 day ago

Audio Flamingo Next: Next-Generation Open Audio-Language Models for Speech, Sound, and Music

Paper • 2604.10905 • Published 2 days ago • 20

liked a dataset 28 days ago

nvidia/MMOU

Viewer • Updated 18 days ago • 15k • 2.16k • 15

authored a paper 29 days ago

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

Paper • 2603.14145 • Published Mar 14 • 14

submitted a paper to Daily Papers 29 days ago

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

Paper • 2603.14145 • Published Mar 14 • 14

liked a Space 3 months ago

Music Flamingo

🎵

168

Analyze music and answer questions from audio or YouTube links

liked a model 3 months ago

nvidia/music-flamingo-2601-hf

Audio-Text-to-Text • 8B • Updated 6 days ago • 55.9k • 93

authored a paper 5 months ago

Music Flamingo: Scaling Music Understanding in Audio Language Models

Paper • 2511.10289 • Published Nov 13, 2025 • 19

commented a paper 5 months ago

Music Flamingo: Scaling Music Understanding in Audio Language Models

Paper • 2511.10289 • Published Nov 13, 2025 • 19

authored 3 papers 6 months ago

Audio Flamingo Sound-CoT Technical Report: Improving Chain-of-Thought Reasoning in Sound Understanding

Paper • 2508.11818 • Published Aug 15, 2025

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 92

upvoted a paper 6 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 92

liked a Space 7 months ago

GPT-OSS-120B on AMD MI300X

💻

334

gpt-oss-120b on AMD MI300X GPUs

updated a collection 8 months ago

Audio

Collection

liked a dataset 8 months ago

gamma-lab-umd/MMAU-Pro

Viewer • Updated Aug 28, 2025 • 5.31k • 8k • 17

authored a paper 8 months ago

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence

Paper • 2508.13992 • Published Aug 19, 2025 • 7

commented a paper 8 months ago

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence

Paper • 2508.13992 • Published Aug 19, 2025 • 7

liked a model 8 months ago

nvidia/audio-flamingo-2-SoundCoT

Audio-Text-to-Text • Updated Aug 28, 2025 • 10

New activity in nvidia/AudioSkills 8 months ago

BBC-Sound-Effect duration doesn't match.

#5 opened 8 months ago by WhaleDolphin
