ws-wm-llama-0227

Updated Mar 4

WebShop World Model - LLaMA3.1-8B BehR-Only GRPO checkpoints (2026-02-27)


  • Ricardo-H/ws-wm-llama-0227-step-20

    8B • Updated Mar 4 • 2

  • Ricardo-H/ws-wm-llama-0227-step-40

    8B • Updated Mar 4 • 1

  • Ricardo-H/ws-wm-llama-0227-step-60

    8B • Updated Mar 4 • 2

  • Ricardo-H/ws-wm-llama-0227-step-80

    8B • Updated Mar 4 • 2

  • Ricardo-H/ws-wm-llama-0227-step-120

    8B • Updated Mar 2 • 1

  • Ricardo-H/ws-wm-llama-0227-step-200

    8B • Updated Mar 4 • 1

  • Ricardo-H/ws-wm-llama-0227-step-280

    8B • Updated Mar 4 • 4

  • Ricardo-H/ws-wm-llama-0227-step-300

    8B • Updated Mar 4 • 1

  • Ricardo-H/ws-wm-llama-0227-step-360

    8B • Updated Mar 4 • 1
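
The checkpoint names above differ only in the training step, so the repo IDs can be built programmatically. The following is a minimal sketch, not something taken from the collection itself: the helper names `repo_id` and `load_checkpoint` are hypothetical, and it assumes each repo is a standard causal-LM checkpoint that ships a tokenizer and loads with the `transformers` Auto classes, which the listing does not confirm.

```python
# Step numbers copied from the checkpoint list above.
STEPS = [20, 40, 60, 80, 120, 200, 280, 300, 360]


def repo_id(step: int) -> str:
    """Build the Hub repo ID for one of the listed training steps."""
    if step not in STEPS:
        raise ValueError(f"no checkpoint listed for step {step}")
    return f"Ricardo-H/ws-wm-llama-0227-step-{step}"


def load_checkpoint(step: int):
    """Download and load one checkpoint (hypothetical helper).

    Assumption: the repo holds a standard causal-LM checkpoint plus
    tokenizer files; adjust the classes if the repos are laid out differently.
    """
    # Imported lazily so repo_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    rid = repo_id(step)
    tokenizer = AutoTokenizer.from_pretrained(rid)
    model = AutoModelForCausalLM.from_pretrained(rid)
    return tokenizer, model
```

Calling `load_checkpoint(360)` would fetch the last listed checkpoint. Each one is an 8B-parameter model, so the download and memory cost are substantial.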