Add pipeline tag, library name, and paper link to model card
Hi! I'm Niels from the Hugging Face community science team.
This PR improves the metadata and content of your model card:
- Adds the `audio-text-to-text` pipeline tag to ensure the model is correctly categorized.
- Adds `library_name: transformers` metadata since the model uses the Transformers library.
- Links the model card to its research paper page on Hugging Face: [EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning](https://huggingface.co/papers/2601.15668).
These changes help users find and use your work more effectively.
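
As a rough illustration of what the metadata change amounts to, the sketch below reads the front matter proposed in this PR and checks the two new fields. `read_front_matter` and `README_HEAD` are hypothetical names introduced only for this example, and the parser assumes the simple key/value and list subset of YAML that appears in this card; a real tool would use a full YAML parser.

```python
# Hypothetical helper: handles only the "key: value" and "- item" subset of
# YAML used in this model card's front matter. Not a general YAML parser.
README_HEAD = """---
base_model:
- Qwen/Qwen2.5-Omni-7B
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: audio-text-to-text
---

# EmotionThinker
"""


def read_front_matter(text):
    """Return the front matter between the first two '---' lines as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the front matter block
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value  # scalar entry such as "license: apache-2.0"
            current_key = None
        else:
            meta[key] = []  # list entry such as "base_model:"
            current_key = key
    return meta


meta = read_front_matter(README_HEAD)
print(meta["pipeline_tag"])
print(meta["library_name"])
```

Running the same check against the old front matter would raise a `KeyError` for both fields, which is the gap this PR closes.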
README.md
CHANGED
@@ -1,13 +1,16 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-Omni-7B
+language:
+- en
+license: apache-2.0
+library_name: transformers
+pipeline_tag: audio-text-to-text
 ---
 
 # EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning
 
+This repository contains the model presented in the paper [EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning](https://huggingface.co/papers/2601.15668).
 
 [](https://arxiv.org/pdf/2601.15668) [](https://github.com/dingdongwang/EmotionThinker)
 
@@ -16,7 +19,7 @@ base_model:
 </p>
 
 ## Introduction
-EmotionThinker is the first RL–enhanced SpeechLLM framework for interpretable speech emotion reasoning. For details, please refer to the [paper](https://
+EmotionThinker is the first RL–enhanced SpeechLLM framework for interpretable speech emotion reasoning. For details, please refer to the [paper](https://huggingface.co/papers/2601.15668).
 
 Unlike conventional speech emotion recognition (SER) systems that treat emotion as a flat classification problem, EmotionThinker reframes SER as a deep reasoning problem, enabling models to jointly produce accurate emotion labels and structured, human-aligned explanations.
 
@@ -29,7 +32,7 @@ EmotionThinker offers the following advantages:
 
 ## Quickstart
 
-```
+```python
 import torch
 from transformers import Qwen2_5OmniForConditionalGeneration, Qwen2_5OmniProcessor
 from qwen_omni_utils import process_mm_info
@@ -69,12 +72,11 @@ with torch.no_grad():
 
 text = processor.batch_decode(text_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
 print(text)
-
 ```
 
 ## Citation
 If you find this model useful in your research, please kindly cite:
-```
+```bibtex
 @inproceedings{wang2026emotionthinker,
 title={EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning},
 author={Wang, Dingdong and Liu, Shujie and Zhang, Tianhua and Chen, Youjun and Li, Jinyu and Meng, Helen},