Instructions to use shanyangmie/physics-r1-seed17-v4-step40-fsdp with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use shanyangmie/physics-r1-seed17-v4-step40-fsdp with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="shanyangmie/physics-r1-seed17-v4-step40-fsdp")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("shanyangmie/physics-r1-seed17-v4-step40-fsdp", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use shanyangmie/physics-r1-seed17-v4-step40-fsdp with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "shanyangmie/physics-r1-seed17-v4-step40-fsdp" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step40-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/shanyangmie/physics-r1-seed17-v4-step40-fsdp
- SGLang
How to use shanyangmie/physics-r1-seed17-v4-step40-fsdp with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "shanyangmie/physics-r1-seed17-v4-step40-fsdp" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step40-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "shanyangmie/physics-r1-seed17-v4-step40-fsdp" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "shanyangmie/physics-r1-seed17-v4-step40-fsdp", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use shanyangmie/physics-r1-seed17-v4-step40-fsdp with Docker Model Runner:
docker model run hf.co/shanyangmie/physics-r1-seed17-v4-step40-fsdp
Physics-R1 β Seed 17, v4 step-40 (FSDP-sharded)
Project Page | Paper | Code | Training corpus
Step-40 intermediate checkpoint from the seed-17 v4 (audited-data) Physics-R1 training run. Released for training-curve ablations and reproducibility. Fine-tune of Qwen3-VL-8B-Thinking on the audited PhysR1Corp (2,268 closed-form physics problems) via full-parameter FSDP1 GRPO with binary correctness reward.
Released alongside Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning.
Performance
Intermediate-step checkpoint released for training-curve ablations. Not the paper-headline checkpoint β for exact Table 2 numbers see:
- Seed-42 row:
physics-r1-seed42-v4-step60-fsdp - Seed-17 row:
physics-r1-seed17-canonical-step63-fsdp - Seed-23 row:
physics-r1-seed23-canonical-step60-fsdp
Variants
| Checkpoint | Used for | Notes |
|---|---|---|
physics-r1-seed42-v4-step60-fsdp |
Paper Table 2 seed-42 row | Binary, step 60 |
physics-r1-seed17-canonical-step63-fsdp |
Paper Table 2 seed-17 row | Binary, step 63 |
physics-r1-seed23-canonical-step60-fsdp |
Paper Table 2 seed-23 row | Binary, step 60 |
physics-r1-seed17-v4-step60-fsdp |
Seed-17 v4 re-validation, tracks canonical | Binary, step 60 |
physics-r1-seed42-v4-step{40,50}-fsdp |
Step ablation (seed 42) | Earlier steps |
physics-r1-seed17-v4-step{40,50}-fsdp |
Step ablation (seed 17) | Earlier steps |
Training recipe
- Base model:
Qwen/Qwen3-VL-8B-Thinking - Algorithm: GRPO (verl 0.6.1, full-parameter FSDP1 β
actor.strategy=fsdp, notfsdp2; FSDP2 fails on Qwen3-VL visual encoder device placement) - Reward: binary correctness, per-subpart Sonnet judge with problem-level AND aggregation (see paper Β§3.2)
- Data:
shanyangmie/physr1corpβ 2,268 audited closed-form problems - Hardware: 4ΓH200 (FSDP1 4-way sharded)
Full hyperparameters in the paper appendix.
- Seed / step: 17 / 40
Format: verl FSDP-sharded checkpoint (conversion required)
This checkpoint is saved in verl's FSDP-sharded format, not safetensors. It is not directly loadable via AutoModelForImageTextToText.from_pretrained without a merge step.
File layout
actor/
βββ huggingface/ # HF-style config + tokenizer
βββ model_world_size_4_rank_{0,1,2,3}.pt # 4-way FSDP weight shards (~8.7 GB each, ~35 GB total)
βββ optim_world_size_4_rank_{0,1,2,3}.pt # optimizer state (~17.5 GB each, not needed for inference)
βββ extra_state_world_size_4_rank_{0..3}.pt
βββ fsdp_config.json
data.pt # verl bookkeeping (not needed for inference)
Convert to HF safetensors
Use verl's model_merger.py:
git clone https://github.com/volcengine/verl
cd verl
# Download only the inference-required files (skips ~70 GB of optimizer state)
huggingface-cli download shanyangmie/physics-r1-seed17-v4-step40-fsdp \\
--include "actor/model_world_size_4_rank_*.pt" \\
--include "actor/huggingface/*" \\
--include "actor/fsdp_config.json" \\
--include "actor/extra_state_world_size_4_rank_*.pt" \\
--local-dir ./ckpt
# Merge FSDP shards into HF safetensors
python scripts/model_merger.py merge \\
--backend fsdp \\
--hf_model_path Qwen/Qwen3-VL-8B-Thinking \\
--local_dir ./ckpt/actor \\
--target_dir ./physics-r1-seed17-v4-step40-fsdp-hf
Then load with standard HF:
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained(
"./physics-r1-seed17-v4-step40-fsdp-hf",
torch_dtype="bfloat16",
device_map="auto",
)
processor = AutoProcessor.from_pretrained("./physics-r1-seed17-v4-step40-fsdp-hf")
License
Apache 2.0, inheriting from the base model Qwen3-VL-8B-Thinking. Training data (physr1corp) is CC BY-NC 4.0, so this derivative checkpoint is intended for non-commercial research use.
Citation
@misc{yang2026physicsr1,
title = {Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning},
author = {Yang, Shan},
year = {2026},
url = {https://huggingface.co/papers/2605.14040}
}
Model tree for shanyangmie/physics-r1-seed17-v4-step40-fsdp
Base model
Qwen/Qwen3-VL-8B-Thinking