---
base_model: atomwalk12/LinalgZero-SFT
library_name: peft
pipeline_tag: text-generation
tags:
  - base_model:adapter:atomwalk12/LinalgZero-SFT
  - grpo
  - lora
  - transformers
  - trl
  - unsloth
  - step1000
---

Model Card for LinalgZero-GSPO

The information and code used to train this model are available on GitHub.

This model is a LoRA adapter fine-tuned from atomwalk12/LinalgZero-SFT on the atomwalk12/linalgzero-grpo dataset using the GSPO algorithm. It was trained using ART.
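
Since the metadata lists `peft` as the library and tags the model as a LoRA adapter on atomwalk12/LinalgZero-SFT, loading it might look roughly like the sketch below. This is a minimal sketch, not the author's documented usage: the adapter repository id `atomwalk12/LinAlgZero-GRPO`, the helper names `load_adapter` and `generate`, and the plain-text prompt are all assumptions. The base model may expect a chat template or a tool-calling format, so check the training code before relying on this.

```python
# Hypothetical loading sketch; identifiers marked "assumed" are not confirmed by the model card.
BASE_MODEL_ID = "atomwalk12/LinalgZero-SFT"   # from the card's base_model field
ADAPTER_ID = "atomwalk12/LinAlgZero-GRPO"     # assumed adapter repository id


def load_adapter(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the LoRA adapter on top of it."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=256):
    """Generate a completion and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example (downloads weights, so it is left commented out):
# model, tokenizer = load_adapter()
# print(generate(model, tokenizer, "Compute the determinant of [[1, 2], [3, 4]]."))
```

The imports sit inside `load_adapter` so that nothing is downloaded or loaded until the function is actually called.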