
RoadMa

RoadQAQ

AI & ML interests

None yet

Organizations

RoadQAQ's collections (1)

ReLIFT is a training method that interleaves RL with online fine-tuning (FT), achieving superior performance and efficiency compared to using RL or SFT alone.

RoadQAQ/ReLIFT-Qwen2.5-7B-Zero

Question Answering • 8B • Updated Jun 18, 2025 • 4 • 2
RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero

Question Answering • 2B • Updated Jun 12, 2025 • 115
RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

Question Answering • 8B • Updated Aug 27, 2025 • 13
Elliott/Openr1-Math-46k-8192

Viewer • Updated Apr 23, 2025 • 45.8k • 279 • 9
