Mid-training Analysis Checkpoints (Llama-3.2-3B)
Collection
What makes a base language model suitable for RL? Through controlled experiments, we identify key factors, then leverage them to scale up mid-training. • 10 items