Add metadata and improve model card
Hi there! I'm Niels, part of the community science team at Hugging Face. I've opened this PR to improve the model card for AROMA.
This PR adds:
- Metadata to the YAML header: `pipeline_tag: text-generation`, `library_name: transformers`, and `base_model: Qwen/Qwen3-8B`.
- A link to the paper on the Hugging Face Hub for better discoverability.
- A BibTeX citation section from your GitHub repository.
These additions help users find your model more easily and provide a standard way for them to cite your work.
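For reference, the three metadata keys are plain YAML front matter at the top of README.md, which the Hub reads to populate the model's tags and widget. A minimal stdlib-only sketch of how such a block can be parsed (the Hub itself uses a full YAML parser; the `parse_front_matter` helper below is hypothetical and for illustration only):

```python
# Illustrative only: extract the key/value metadata this PR adds to the
# model card's YAML front matter, using just the standard library.
README = """---
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen3-8B
---

<p align="center">
  <img src="figures/logo.jpg" alt="AROMA Logo" width="120">
</p>
"""

def parse_front_matter(text):
    """Return the metadata between the leading '---' fences as a dict."""
    lines = text.splitlines()
    assert lines[0].strip() == "---", "model card has no YAML front matter"
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing fence: end of metadata block
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

meta = parse_front_matter(README)
print(meta["pipeline_tag"])  # text-generation
```

In practice you would edit these fields directly in the README rather than programmatically; the sketch just shows what the Hub sees.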
README.md CHANGED

````diff
@@ -1,3 +1,9 @@
+---
+library_name: transformers
+pipeline_tag: text-generation
+base_model: Qwen/Qwen3-8B
+---
+
 <p align="center">
   <img src="figures/logo.jpg" alt="AROMA Logo" width="120">
 </p>
@@ -5,11 +11,11 @@
 <h2 align="center"> 🧬 AROMA: Augmented Reasoning Over a Multimodal Architecture for Virtual Cell Genetic Perturbation Modeling<br>(ACL 2026 Findings)</h2>
 
 <p align="center">
-<a href="https://
+<a href="https://huggingface.co/papers/2604.20263" target="_blank">Paper</a> • <a href="https://github.com/blazerye/AROMA" target="_blank">Code</a> • <a href="https://huggingface.co/datasets/blazerye/PerturbReason" target="_blank">Datasets</a><br>
 </p>
 </p>
 
-> Please refer to our [repository](https://github.com/blazerye/AROMA) and [paper](https://
+> Please refer to our [repository](https://github.com/blazerye/AROMA) and [paper](https://huggingface.co/papers/2604.20263) for more details.
 
 ## Overview
 
@@ -25,4 +31,16 @@ The overall AROMA pipeline is illustrated in the figure above and is divided int
 
 - **Modeling stage.** AROMA adopts a retrieval-augmented strategy to incorporate query-relevant information, thereby providing explicit evidence cues for prediction. In addition, it jointly leverages topological representations learned from graph neural networks (GNN) and protein sequence representations encoded by ESM-2, and applies a cross-attention module to explicitly model perturbation-target gene dependencies across modalities.
 
-- **Training stage.** AROMA first performs multimodal supervised fine-tuning (SFT), and is then further optimized with Group Relative Policy Optimization (GRPO) reinforcement learning to enhance predictive performance while generating biologically meaningful explanations.
+- **Training stage.** AROMA first performs multimodal supervised fine-tuning (SFT), and is then further optimized with Group Relative Policy Optimization (GRPO) reinforcement learning to enhance predictive performance while generating biologically meaningful explanations.
+
+## Citation
+If you find AROMA useful for your research and applications, please cite using this BibTeX:
+```bibtex
+@inproceedings{wang2026aroma,
+  title = "{AROMA}: Augmented Reasoning Over a Multimodal Architecture for Virtual Cell Genetic Perturbation Modeling",
+  author = "Wang, Zhenyu and Ye, Geyan and Liu, Wei and Ng, Man Tat Alexander",
+  booktitle = "Findings of the Association for Computational Linguistics: ACL 2026",
+  year = "2026",
+  publisher = "Association for Computational Linguistics"
+}
+```
````