---
library_name: pytorch
base_model: microsoft/deberta-v3-large
tags:
- retrieval-augmented-generation
- reranking
- robust-retrieval
- evidence-critic
- corm-rag
- arxiv:2605.01302
---

# CoRM-RAG Evidence Critic

This repository hosts the released Evidence Critic checkpoint for **CoRM-RAG**:

**Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation**
Peiyang Liu, Qiang Yan, Ziqiang Cui, Di Liang, Xi Wang, Wei Ye
arXiv: <https://arxiv.org/abs/2605.01302>

Code: <https://github.com/PeiYangLiu/CoRM-RAG>

## Model Description

CoRM-RAG aligns retrieval with decision safety rather than semantic similarity alone. The Evidence Critic is a lightweight reranking model trained to score whether a document remains useful under cognitively biased query perturbations, such as false premises, confirmation bias, and distracting assumptions.

The released checkpoint uses a `microsoft/deberta-v3-large` backbone and outputs a robustness score for a `(query, document)` pair. It is intended to be used inside the CoRM-RAG pipeline for evidence reranking and risk-aware retrieval.

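
As a rough illustration of how a per-pair robustness score could drive reranking, the sketch below sorts candidate documents by a scoring callable. The names `rerank`, `score_fn`, and `toy_score` are hypothetical and not part of the CoRM-RAG codebase. In practice `score_fn` would wrap the Evidence Critic, whose actual inference interface is defined in the GitHub repository; `toy_score` is only a token-overlap stand-in so the example runs on its own.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k documents sorted by descending score.

    score_fn is a placeholder for any callable that maps a
    (query, document) pair to a float, e.g. a wrapper around the critic.
    """
    scored = [(doc, score_fn(query, doc)) for doc in documents]
    # list.sort() is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_score(query: str, document: str) -> float:
    """Stand-in scorer (NOT the critic): fraction of query tokens found in the document."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    return len(query_tokens & doc_tokens) / max(len(query_tokens), 1)
```

Keeping the scorer as a plain callable means the same reranking loop works with any scoring function that accepts a query and a document, so the stand-in can be swapped for a real critic wrapper without touching `rerank`.
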
## Files

```text
critic-v12-mixed/checkpoint-latest/state.pt
```

This file is a PyTorch checkpoint consumed by the CoRM-RAG codebase.

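
To see what the checkpoint contains before wiring it into the pipeline, something like the following sketch may help. It assumes the file can be read with `torch.load` and that its top level is a dictionary; neither the key names nor the nesting are documented here, so the helper only reports what it finds. `load_checkpoint` and `summarize` are hypothetical helper names, not functions from the CoRM-RAG repository.

```python
from typing import Any, Dict, Mapping


def load_checkpoint(path: str) -> Any:
    """Load the checkpoint onto CPU. Requires PyTorch to be installed."""
    import torch  # imported lazily so summarize() works without torch

    # weights_only=True restricts unpickling to tensors and basic containers;
    # if the file stores other Python objects, this call may need adjusting.
    return torch.load(path, map_location="cpu", weights_only=True)


def summarize(obj: Any, prefix: str = "", max_depth: int = 2) -> Dict[str, str]:
    """Map nested dictionary keys to a short description of each value."""
    summary: Dict[str, str] = {}
    if not isinstance(obj, Mapping) or max_depth == 0:
        return summary
    for key, value in obj.items():
        name = f"{prefix}{key}"
        shape = getattr(value, "shape", None)
        if shape is not None:
            # Anything exposing .shape (e.g. a tensor) is reported by shape.
            summary[name] = f"tensor-like, shape={tuple(shape)}"
        elif isinstance(value, Mapping):
            summary[name] = f"mapping with {len(value)} entries"
            summary.update(
                summarize(value, prefix=f"{name}.", max_depth=max_depth - 1)
            )
        else:
            summary[name] = type(value).__name__
    return summary
```

Calling `summarize(load_checkpoint("checkpoints/hf/critic-v12-mixed/checkpoint-latest/state.pt"))` after downloading the file would list the entries down to two levels of nesting.
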
## Usage

Clone the code from GitHub and download the checkpoint from the Hugging Face Hub:

```bash
# Fetch the CoRM-RAG code
git clone https://github.com/PeiYangLiu/CoRM-RAG.git
cd CoRM-RAG

# Download the critic checkpoint into checkpoints/hf
huggingface-cli download PeiyangLiu/CoRM-RAG \
  critic-v12-mixed/checkpoint-latest/state.pt \
  --local-dir checkpoints/hf
```

Run evaluation by pointing `CRITIC_PATH` to the downloaded checkpoint:

```bash
CRITIC_PATH=checkpoints/hf/critic-v12-mixed/checkpoint-latest/state.pt bash src/run_eval.sh
```

For training-data construction, critic training, and end-to-end evaluation details, see the GitHub repository.

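
The evaluation command above can also be launched from Python, which makes it easy to fail early when the checkpoint path is wrong. This is a minimal sketch: `build_eval_env` and `run_eval` are hypothetical helpers, and the only things taken from this card are the `CRITIC_PATH` variable and the `src/run_eval.sh` script path.

```python
import os
import subprocess
from pathlib import Path
from typing import Dict, Mapping, Optional


def build_eval_env(
    critic_path: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with CRITIC_PATH set.

    Raises FileNotFoundError if the checkpoint file does not exist.
    """
    path = Path(critic_path)
    if not path.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {path}")
    env = dict(os.environ if base_env is None else base_env)
    # Use an absolute path so the script can run from another directory.
    env["CRITIC_PATH"] = str(path.resolve())
    return env


def run_eval(critic_path: str, repo_root: str = ".") -> int:
    """Run src/run_eval.sh from repo_root and return its exit code."""
    env = build_eval_env(critic_path)
    completed = subprocess.run(["bash", "src/run_eval.sh"], cwd=repo_root, env=env)
    return completed.returncode
```

From the repository root, `run_eval("checkpoints/hf/critic-v12-mixed/checkpoint-latest/state.pt")` would be the equivalent of the shell one-liner.
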
## Intended Use

This checkpoint is intended for research on robust retrieval-augmented generation, evidence reranking, and risk-aware retrieval under biased or perturbed user queries. It is not a standalone generative model.

## Limitations

The critic score reflects robustness patterns learned from the CoRM-RAG training pipeline and should be interpreted within that retrieval setting. Performance may vary across domains, corpora, retrievers, and perturbation distributions.

## Citation

```bibtex
@misc{liu2026cormrag,
  title={Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation},
  author={Peiyang Liu and Qiang Yan and Ziqiang Cui and Di Liang and Xi Wang and Wei Ye},
  year={2026},
  eprint={2605.01302},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2605.01302}
}
```