Ricardo-H's Collections
rlvr-f1
Updated 27 days ago
RLVR-World-style token F1 reward ablation models for the BehR-WM rebuttal experiments
Ricardo-H/WorldModel-Textworld-F1Reward-Qwen2.5-7B-step100 • 8B • Updated Mar 22 • 24
Ricardo-H/ws-wm-llama-webshop-f1-step-50 • 8B • Updated about 1 month ago • 3
Ricardo-H/ws-wm-llama-webshop-f1-step-92 • 8B • Updated about 1 month ago • 21
Ricardo-H/ws-wm-llama-textworld-f1-step-100 • 8B • Updated 27 days ago • 26
Ricardo-H/ws-wm-0314-step-100 • 8B • Updated Mar 15 • 30
Ricardo-H/ws-wm-llama-textworld-f1-step-150 • 8B • Updated 27 days ago • 7
Ricardo-H/ws-wm-llama-textworld-f1-step-171 • 8B • Updated 27 days ago • 23