Instructions to use shanyangmie/physics-r1-seed17-v4-step60-fsdp with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use shanyangmie/physics-r1-seed17-v4-step60-fsdp with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="shanyangmie/physics-r1-seed17-v4-step60-fsdp")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("shanyangmie/physics-r1-seed17-v4-step60-fsdp", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use shanyangmie/physics-r1-seed17-v4-step60-fsdp with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "shanyangmie/physics-r1-seed17-v4-step60-fsdp" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step60-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/shanyangmie/physics-r1-seed17-v4-step60-fsdp
- SGLang
How to use shanyangmie/physics-r1-seed17-v4-step60-fsdp with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "shanyangmie/physics-r1-seed17-v4-step60-fsdp" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step60-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "shanyangmie/physics-r1-seed17-v4-step60-fsdp" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step60-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use shanyangmie/physics-r1-seed17-v4-step60-fsdp with Docker Model Runner:
docker model run hf.co/shanyangmie/physics-r1-seed17-v4-step60-fsdp
Physics-R1 β Seed 17, v4 step-60 (FSDP-sharded)
Project Page | Paper | Code | Training corpus
Physics-R1 fine-tune of Qwen3-VL-8B-Thinking on the audited PhysR1Corp (2,268 closed-form physics problems) via full-parameter FSDP1 GRPO with binary correctness reward. This is the seed-17 v4 (audited-data) re-validation checkpoint at step 60.
Released alongside Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning.
Which checkpoint should you use?
| Checkpoint | Use for | Notes |
|---|---|---|
physics-r1-seed17-canonical-step63-fsdp |
Exact reproduction of paper Table 2 seed-17 row | Canonical paper checkpoint (step 63) |
physics-r1-seed17-v4-step60-fsdp |
Re-validation on audited corpus | This card β v4 audited re-run, tracks canonical closely |
physics-r1-seed17-v4-step50-fsdp |
Step ablation | Same run, earlier step |
physics-r1-seed17-v4-step40-fsdp |
Step ablation | Same run, earlier step |
physics-r1-seed42-v4-step60-fsdp |
Paper Table 2 seed-42 row | Step-60 binary, seed 42 |
physics-r1-seed23-canonical-step60-fsdp |
Paper Table 2 seed-23 row | Canonical step-60, seed 23 |
On the relationship to Table 2: the paper's seed-17 row (PhysReason 43.1, PhysOlym-A 25.0, PhyX-3k 77.2, ...) is from the canonical step-63 checkpoint. This v4 step-60 checkpoint is a re-validation on the audited 2,268-record PhysR1Corp; its step-60 mean tracks the canonical mean within statistical noise. For exact paper-reproduction numbers, use the canonical checkpoint.
Training recipe
- Base model:
Qwen/Qwen3-VL-8B-Thinking - Algorithm: GRPO (verl 0.6.1, full-parameter FSDP1 β
actor.strategy=fsdp, notfsdp2; FSDP2 fails on Qwen3-VL visual encoder device placement) - Reward: binary correctness, per-subpart Sonnet judge with problem-level AND aggregation (see paper Β§3.2)
- Data:
shanyangmie/physr1corpβ 2,268 audited closed-form problems - Seed / step: 17 / 60
- Hardware: 4ΓH200 (FSDP1 4-way sharded)
Full hyperparameters are in the paper appendix.
Format: verl FSDP-sharded checkpoint (conversion required)
This checkpoint is saved in verl's FSDP-sharded format, not safetensors. It is not directly loadable via AutoModelForImageTextToText.from_pretrained without a merge step.
File layout
actor/
βββ huggingface/ # HF-style config + tokenizer
β βββ config.json
β βββ tokenizer.json, merges.txt, vocab.json
β βββ preprocessor_config.json
β βββ ...
βββ model_world_size_4_rank_{0,1,2,3}.pt # 4-way FSDP weight shards (~8.7 GB each, ~35 GB total)
βββ optim_world_size_4_rank_{0,1,2,3}.pt # optimizer state (~17.5 GB each, not needed for inference)
βββ extra_state_world_size_4_rank_{0..3}.pt
βββ fsdp_config.json
data.pt # verl bookkeeping (not needed for inference)
Convert to HF safetensors
Use verl's model_merger.py:
git clone https://github.com/volcengine/verl
cd verl
# Download only the inference-required files (skips ~70 GB of optimizer state)
huggingface-cli download shanyangmie/physics-r1-seed17-v4-step60-fsdp \
--include "actor/model_world_size_4_rank_*.pt" \
--include "actor/huggingface/*" \
--include "actor/fsdp_config.json" \
--include "actor/extra_state_world_size_4_rank_*.pt" \
--local-dir ./ckpt
# Merge FSDP shards into HF safetensors
python scripts/model_merger.py merge \
--backend fsdp \
--hf_model_path Qwen/Qwen3-VL-8B-Thinking \
--local_dir ./ckpt/actor \
--target_dir ./physics-r1-seed17-v4-step60-hf
Then load with standard HF:
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained(
"./physics-r1-seed17-v4-step60-hf",
torch_dtype="bfloat16",
device_map="auto",
)
processor = AutoProcessor.from_pretrained("./physics-r1-seed17-v4-step60-hf")
License
Apache 2.0, inheriting from the base model Qwen3-VL-8B-Thinking. Training data (physr1corp) is CC BY-NC 4.0, so this derivative checkpoint is intended for non-commercial research use.
Citation
@misc{yang2026physicsr1,
title = {Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning},
author = {Yang, Shan},
year = {2026},
url = {https://huggingface.co/papers/2605.14040}
}
Model tree for shanyangmie/physics-r1-seed17-v4-step60-fsdp
Base model
Qwen/Qwen3-VL-8B-Thinking