MOOSE-Star-IR-R1D-7B Model Card

Overview

MOOSE-Star-IR-R1D-7B (referred to as MS-IR-7B in the paper) is a 7B-parameter model fine-tuned to select the correct cross-paper inspiration from 15 candidates, given a research background. It is designed for scientific hypothesis generation within the MOOSE-Star framework.

Paper: MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier (arXiv:2603.03756)
Base Model: DeepSeek-R1-Distill-Qwen-7B
License: Apache 2.0
Code: ZonglinY/MOOSE-Star
Multi-task variant: MOOSE-Star-R1D-7B (IR + HC in one model)

Model Description

Parameter	Value
Base Model	DeepSeek-R1-Distill-Qwen-7B
Training Method	Full-parameter SFT (ZeRO-3)
Training Data	TOMATO-Star-SFT-Data-R1D-32B IR split (150,218 train + 2,377 eval)
Teacher Model	DeepSeek-R1-Distill-Qwen-32B
Learning Rate	1e-5
Epochs	1
Batch Size	128
Chat Template	deepseekr1
Cutoff Length	16384
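
As a rough sanity check on the table above, the number of optimizer steps in the single training epoch can be estimated from the train split size and the batch size. This is a minimal sketch: it assumes that 128 is the effective (global) batch size, that no sequence packing is used, and that the final partial batch is kept. None of these details are stated in this card.

```python
import math

TRAIN_EXAMPLES = 150_218   # IR train split size, from the table above
GLOBAL_BATCH_SIZE = 128    # assumption: this is the effective global batch size
EPOCHS = 1


def optimizer_steps(n_examples: int, batch_size: int, epochs: int = 1) -> int:
    """Optimizer steps per run, keeping the final partial batch of each epoch."""
    return math.ceil(n_examples / batch_size) * epochs


steps = optimizer_steps(TRAIN_EXAMPLES, GLOBAL_BATCH_SIZE, EPOCHS)
print(f"Estimated optimizer steps: {steps}")
```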

Task Description

The model selects the most relevant cross-paper inspiration from 15 candidates (labeled A-O), which comprise:

1 correct inspiration (ground truth)
14 hard negatives (keyword-similar, embedding-similar, and random papers)

The model outputs chain-of-thought reasoning and is designed for a hierarchical search pipeline with O(log N) complexity.
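
To illustrate why a hierarchical search can scale as O(log N), the sketch below counts how many 1-of-15 selection rounds are needed to narrow N papers down to one. It assumes the papers are arranged in a tree with branching factor 15 and that one model call picks one branch per level. The tree layout and the function name `selection_rounds` are assumptions made for illustration, not details taken from this card.

```python
def selection_rounds(num_papers: int, branching: int = 15) -> int:
    """Rounds of 1-of-`branching` selection needed to reach a single paper.

    Assumes a balanced tree over the papers (hypothetical layout).
    """
    if num_papers < 1:
        raise ValueError("num_papers must be at least 1")
    if branching < 2:
        raise ValueError("branching must be at least 2")
    rounds = 0
    capacity = 1  # number of papers reachable with `rounds` selections
    while capacity < num_papers:
        capacity *= branching
        rounds += 1
    return rounds


for n in (15, 225, 1_000_000):
    print(n, selection_rounds(n))
```

Under these assumptions the number of rounds grows with the logarithm of the corpus size, so a corpus 15 times larger costs only one extra selection.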

Prompt Format (Simplified Overview)

The full prompt template is constructed via instruction_prompts() in the code examples below. The general structure is:

[Task instruction preamble]

## Context

**Research Question:**
{research_question}

**Background Survey (existing methods for THIS task):**
{background_survey}

**Previous Hypothesis (if any):**
{previous_hypothesis_or_none}

## Candidate Inspiration Papers

### Candidate [A]
**Title:** {title_A}
**Abstract:** {abstract_A}

... (15 candidates total, A through O)

## Output Format

<think>
[reasoning process]
</think>

**Selected ID starts:** [X] **Selected ID ends**

**Selection Reason starts:** [reason] **Selection Reason ends**
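
The output format above can be parsed with two regular expressions. The following is a minimal sketch, not the parser used in the MOOSE-Star repo: the names `Selection` and `parse_selection` are hypothetical, and the sketch assumes the model follows the marker strings exactly as shown.

```python
import re
from typing import NamedTuple, Optional

# Marker strings copied from the output format above; labels are limited to A-O.
_ID_RE = re.compile(
    r"\*\*Selected ID starts:\*\*\s*\[([A-O])\]\s*\*\*Selected ID ends\*\*"
)
_REASON_RE = re.compile(
    r"\*\*Selection Reason starts:\*\*\s*(.*?)\s*\*\*Selection Reason ends\*\*",
    re.DOTALL,
)


class Selection(NamedTuple):
    label: str
    reason: Optional[str]


def parse_selection(response: str) -> Optional[Selection]:
    """Return the selected label and reason, or None if no label is found."""
    # Only look after the reasoning block so markers quoted inside
    # <think>...</think> are not mistaken for the final answer.
    answer = response.split("</think>")[-1]
    id_match = _ID_RE.search(answer)
    if id_match is None:
        return None
    reason_match = _REASON_RE.search(answer)
    reason = reason_match.group(1) if reason_match else None
    return Selection(id_match.group(1), reason)


demo = (
    "<think>\nB looks close, but C matches the mechanism.\n</think>\n\n"
    "**Selected ID starts:** [C] **Selected ID ends**\n\n"
    "**Selection Reason starts:** Closest mechanism. **Selection Reason ends**"
)
print(parse_selection(demo))
```

Returning `None` on a malformed response lets the caller decide whether to retry generation or skip the example.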

Usage

Prerequisites: Clone the MOOSE-Star repo for prompt templates and inference utilities:

git clone https://github.com/ZonglinY/MOOSE-Star.git && cd MOOSE-Star
# See requirements.txt for full dependencies; at minimum: pip install transformers torch

Option A: SGLang Deployment (Recommended)

# SGLang requires a separate environment; see https://github.com/sgl-project/sglang for installation
# Start the server
python -m sglang.launch_server --model-path ZonglinY/MOOSE-Star-IR-R1D-7B --port 1235

import sys
sys.path.insert(0, "./Inference")
from ir_probability_extractor import IRProbabilityExtractor

extractor = IRProbabilityExtractor(base_urls=["http://localhost:1235/v1"])
result = extractor.get_selection_probabilities(
    research_question="Your research question",
    background_survey="Your background survey",
    candidates=[
        {"title": "Candidate A title", "abstract": "Candidate A abstract"},
        {"title": "Candidate B title", "abstract": "Candidate B abstract"},
        # ... up to 15 candidates (labeled A-O)
    ],
)
print(f"Selected: [{result.selected_label}]")
print(f"Probabilities: {result.probabilities}")
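
If more than the top choice is needed, the returned probabilities can be ranked. The sketch below assumes `result.probabilities` behaves like a mapping from candidate label to probability; this card does not document its exact type, so check `ir_probability_extractor` before relying on it. The helper name `rank_candidates` and the example values are made up for illustration.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    probabilities: Dict[str, float], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the `top_k` (label, probability) pairs, highest probability first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Made-up values standing in for result.probabilities (assumed label -> float).
example_probabilities = {"A": 0.05, "B": 0.70, "C": 0.25}
top_two = rank_candidates(example_probabilities, top_k=2)
print(top_two)
```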

Option B: Direct HuggingFace Inference

import sys
sys.path.insert(0, "./utils")
from prompt_store import instruction_prompts
from transformers import AutoModelForCausalLM, AutoTokenizer
import re

model_name = "ZonglinY/MOOSE-Star-IR-R1D-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, dtype="auto", device_map="auto")

p = instruction_prompts("inspiration_retrieval_with_reasoning_with_alphabetical_candidates")

candidates = [{"title": "...", "abstract": "..."}]  # add up to 15 entries (labeled A-O)
candidates_text = "".join(
    f"### Candidate [{chr(ord('A') + i)}]\n**Title:** {c['title']}\n**Abstract:** {c['abstract']}\n\n"
    for i, c in enumerate(candidates)
)

research_question = "Your research question"
background_survey = "Your background survey"
prompt = (p[0] + research_question
        + p[1] + background_survey
        + p[2] + "No previous hypothesis."
        + p[3] + candidates_text
        + p[4])

messages = [{"role": "user", "content": prompt}]
formatted = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
formatted += "<｜Assistant｜>"

inputs = tokenizer(formatted, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.6, top_p=0.9, do_sample=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

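# Parse the selected candidate label (A-O) from the final answer
match = re.search(r"\*\*Selected ID starts:\*\*\s*\[([A-O])\]\s*\*\*Selected ID ends\*\*", response)
if match:
    selected = match.group(1)
    print(f"Selected: [{selected}]")
else:
    # The model did not follow the output format (or hit max_new_tokens first)
    print("No selection found in the response")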
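
Once a label has been parsed, it usually needs to be mapped back to the candidate it refers to. The sketch below reverses the `chr(ord('A') + i)` labeling used when the prompt was built; `label_to_index` is a hypothetical helper, not part of the MOOSE-Star repo.

```python
from typing import Optional


def label_to_index(label: str, num_candidates: int) -> Optional[int]:
    """Map a letter label ('A', 'B', ...) to a 0-based candidate index.

    Returns None when the label is not a single letter within range, which
    guards against the model naming a candidate that was never offered.
    """
    if len(label) != 1:
        return None
    index = ord(label.upper()) - ord("A")
    if 0 <= index < num_candidates:
        return index
    return None


demo_candidates = [
    {"title": "Paper one", "abstract": "..."},
    {"title": "Paper two", "abstract": "..."},
    {"title": "Paper three", "abstract": "..."},
]
chosen = label_to_index("C", len(demo_candidates))
if chosen is not None:
    print(demo_candidates[chosen]["title"])
```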

Evaluation Results

Model	Accuracy
Random Selection	6.70%
R1-Distilled-Qwen-7B (base)	28.42%
MS-IR-7B (this model)	54.37%
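
For context on these numbers, the short calculation below derives the analytic chance level for a uniform pick among 15 candidates and the absolute gain of the fine-tuned model over the base model. The accuracies are copied from the table. The analytic chance level comes out close to the reported random-selection figure, and this card does not say how that figure was obtained.

```python
NUM_CANDIDATES = 15

# Accuracies in percent, copied from the table above.
base_accuracy = 28.42
tuned_accuracy = 54.37

# Probability of picking the single correct candidate uniformly at random.
uniform_chance = 100 / NUM_CANDIDATES

# Absolute improvement of the fine-tuned model, in percentage points.
gain_points = tuned_accuracy - base_accuracy

print(f"Uniform chance: {uniform_chance:.2f}%")
print(f"Gain over base model: {gain_points:.2f} percentage points")
```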

Citation

@article{yang2025moosestar,
  title={MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier},
  author={Yang, Zonglin and Bing, Lidong},
  journal={arXiv preprint arXiv:2603.03756},
  year={2026}
}