Text Generation
PEFT
Safetensors
English
reinforcement-learning
grpo
lora
openenv
multi-agent
scalable-oversight
chaosops
conversational
Instructions to use helloAK96/chaosops-grpo-lora-p2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use helloAK96/chaosops-grpo-lora-p2 with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct") model = PeftModel.from_pretrained(base_model, "helloAK96/chaosops-grpo-lora-p2") - Notebooks
- Google Colab
- Kaggle