PatRe: A Full-Stage Office Action and Rebuttal Generation Benchmark for Patent Examination
Abstract
The PatRe benchmark models the complete patent examination process as a dynamic, multi-turn interaction between examiners and applicants, revealing key performance differences among LLMs in legal reasoning and technical novelty assessment.
Patent examination is a complex, multi-stage process requiring both technical expertise and legal reasoning, and it is increasingly strained by rising application volumes. Prior benchmarks predominantly treat patent examination as discriminative classification or static extraction, failing to capture its inherently interactive and iterative nature, which resembles the peer-review and rebuttal process in academic publishing. In this paper, we introduce PatRe, the first benchmark that models the full patent examination lifecycle, including Office Action generation and applicant rebuttal. PatRe comprises 480 real-world cases and supports both oracle and retrieval-simulated evaluation settings. Our benchmark reframes patent examination as a dynamic, multi-turn process of justification and response. Extensive experiments across a range of LLMs reveal critical insights into model performance, including gaps between proprietary and open-source models and task asymmetries between examiner-side analysis and applicant-side rebuttal. These findings highlight both the potential and the current limitations of LLMs in modeling the complex legal reasoning and technical novelty judgment required in real-world patent examination. We release our code and dataset to facilitate future research on patent examination modeling.
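To make the multi-turn setup concrete, here is a minimal Python sketch of how one examiner/applicant exchange over a PatRe-style case might be driven. The case field names (`claims`, `prior_art`) and the two LLM callables are hypothetical placeholders for exposition, not the benchmark's actual interface.

```python
def run_patre_episode(case, examiner_llm, applicant_llm, max_rounds=3):
    """Drive a hypothetical examiner/applicant exchange over one case.

    `case` is assumed to be a dict holding the application's claims and
    its associated prior art; `examiner_llm` and `applicant_llm` are any
    callables that map a prompt string to a generated response.
    """
    transcript = []
    claims = case["claims"]          # assumed field name
    prior_art = case["prior_art"]    # assumed field name
    for _ in range(max_rounds):
        # Examiner side: draft an Office Action citing the prior art.
        office_action = examiner_llm(
            f"Claims:\n{claims}\n\nPrior art:\n{prior_art}\n\n"
            "Draft an Office Action identifying any novelty objections."
        )
        transcript.append({"role": "examiner", "text": office_action})
        # Applicant side: draft a rebuttal responding to the objections.
        rebuttal = applicant_llm(
            f"Claims:\n{claims}\n\nOffice Action:\n{office_action}\n\n"
            "Draft a rebuttal defending the claims."
        )
        transcript.append({"role": "applicant", "text": rebuttal})
    return transcript
```

Each round alternates justification (the Office Action) and response (the rebuttal), which is the interaction pattern the abstract describes; real evaluation would additionally score each turn against the recorded examination history.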
Community
PatRe is the first benchmark to model the full patent examination lifecycle as an interactive, multi-turn process between examiner and applicant.
It captures real-world dynamics such as Office Action generation and rebuttal, supporting both oracle and retrieval-based evaluation settings to assess iterative legal and technical reasoning.
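As an illustration of the two settings, the sketch below contrasts how prior art might be supplied to the examiner model: gold citations in the oracle setting versus retrieved candidates in the retrieval-simulated setting. The field name `gold_prior_art` and the `retriever.search` call are assumed names for this sketch, not PatRe's real API.

```python
def get_prior_art(case, setting, retriever=None, k=5):
    """Select prior art under the two evaluation settings.

    Oracle: the examiner model sees the gold prior-art documents recorded
    in the case. Retrieval-simulated: a retriever must surface candidate
    documents from a corpus, as a real examiner's search would.
    """
    if setting == "oracle":
        return case["gold_prior_art"]                     # assumed field
    if setting == "retrieval":
        return retriever.search(case["claims"], top_k=k)  # assumed method
    raise ValueError(f"unknown setting: {setting}")
```

Separating the settings this way isolates two failure modes: reasoning over known prior art versus finding the relevant prior art in the first place.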
If you find our work interesting, we would really appreciate your support and an upvote! 🌿🚀
PatRe is open-sourced at https://github.com/AIforIP/PatRe
Project page: https://patre.wangqiyao.me/
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Adaptive Cost-Efficient Evaluation for Reliable Patent Claim Validation (2026)
- MetaGAI: A Large-Scale and High-Quality Benchmark for Generative AI Model and Data Card Generation (2026)
- Beyond Rating: A Comprehensive Evaluation and Benchmark for AI Reviews (2026)
- NoveltyAgent: Autonomous Novelty Reporting Agent with Point-wise Novelty Analysis and Self-Validation (2026)
- NyayaMind- A Framework for Transparent Legal Reasoning and Judgment Prediction in the Indian Legal System (2026)
- TriBench-Ko: Evaluating LLM Risks in Judicial Workflows (2026)
- VERDICT: Verifiable Evolving Reasoning with Directive-Informed Collegial Teams for Legal Judgment Prediction (2026)