pipeline_tag: image-to-image

# M2Retinexformer

This repository contains the official weights for **M2Retinexformer** (Multi-Modal Retinexformer), introduced in the paper [M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement](https://huggingface.co/papers/2605.12556).

- **Paper:** [M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement](https://huggingface.co/papers/2605.12556)
- **Code:** [GitHub Repository](https://github.com/YoussefAboelwafa/M2Retinexformer)

## Introduction

Low-light image enhancement is challenging due to complex degradations, including amplified noise, artifacts, and color distortion. M2Retinexformer is a novel framework that extends [Retinexformer](https://arxiv.org/abs/2303.06705) by incorporating **depth cues**, **luminance priors**, and **semantic features** within a progressive refinement pipeline.

Depth provides geometric context invariant to lighting variations, while luminance and semantic features offer explicit guidance on brightness distribution and scene understanding. These modalities are fused through cross-attention with adaptive gating to dynamically balance illumination-guided self-attention and cross-attention based on the reliability of auxiliary cues.
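The gated fusion described above can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function names (`attention`, `gated_fusion`), the single-head token layout, the sigmoid gate over concatenated features, and the elementwise scaling of values by an illumination map are all assumptions chosen for illustration, and the learned query/key/value projections are omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Plain scaled dot-product attention for token matrices of shape (N, C).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def gated_fusion(image_tokens, aux_tokens, illum, w_gate, b_gate):
    """Blend self-attention and cross-attention with a per-token gate.

    image_tokens: (N, C) features of the low-light image.
    aux_tokens:   (M, C) features from an auxiliary cue (hypothetical
                  stand-in for depth, luminance, or semantic features).
    illum:        (N, C) illumination features that scale the values
                  (a simplified stand-in for illumination guidance).
    w_gate:       (2C, 1) gate weights (assumed parameterization).
    b_gate:       (1,) gate bias.
    """
    # Self-attention over image tokens, with illumination-scaled values.
    self_out = attention(image_tokens, image_tokens, image_tokens * illum)
    # Cross-attention: image tokens query the auxiliary tokens.
    cross_out = attention(image_tokens, aux_tokens, aux_tokens)
    # The gate sees both branches and outputs one weight in (0, 1) per token.
    gate_in = np.concatenate([self_out, cross_out], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate + b_gate)))
    # Convex combination: gate near 1 trusts the auxiliary cue,
    # gate near 0 falls back to the self-attention branch.
    fused = gate * cross_out + (1.0 - gate) * self_out
    return fused, gate
```

In this sketch, a gate value close to 0 for a token means the auxiliary cue is effectively ignored there, which is one simple way to express "balancing based on the reliability of auxiliary cues".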
## Citation

If you find this work useful, please cite:

```bibtex
@misc{aboelwafa2026m2retinexformermultimodalretinexformerlowlight,
      title={M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement},
      author={Youssef Aboelwafa and Hicham G. Elmongui and Marwan Torki},
      year={2026},
      eprint={2605.12556},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.12556},
}
```

## Acknowledgements

This project is built on the baseline architecture of [Retinexformer](https://github.com/caiyuanhao1998/Retinexformer).