llama32-1b-gsm-coconut-checkpoint7

Checkpoint 7 of a CoCoNut (Chain of Continuous Thought) training run on GSM8K, based on Llama-3.2-1B.

Training

  • Base model: meta-llama/Llama-3.2-1B
  • Training method: CoCoNut (continuous latent reasoning)
  • Dataset: GSM8K
  • Checkpoint: 7

Usage

import torch

# Raw PyTorch checkpoint: a state dict whose keys carry the 'base_causallm.' prefix
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")

Model Details

This checkpoint contains the model weights saved at an intermediate point in training (checkpoint 7). The state dict keys begin with the prefix base_causallm. (for example, base_causallm.model.*).
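Because of the prefix, the keys will not match a stock Hugging Face Llama model as-is. The following is a minimal sketch of one way to strip the prefix and load the weights into the base model. The helper names strip_prefix and load_into_hf are hypothetical and not part of this repository. The embedding-resize step is an assumption: it only matters if training added extra tokens to the vocabulary, which this card does not state.

```python
def strip_prefix(state_dict, prefix="base_causallm."):
    """Return a new dict with `prefix` removed from every key that starts with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def load_into_hf(path="pytorch_model.bin", base_model="meta-llama/Llama-3.2-1B"):
    """Hypothetical helper: load the checkpoint into a standard causal LM."""
    # Imported lazily so strip_prefix can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM

    state_dict = strip_prefix(torch.load(path, map_location="cpu"))
    model = AutoModelForCausalLM.from_pretrained(base_model)

    # Assumption: if the checkpoint's embedding table has a different number
    # of rows than the base model, resize before loading to avoid a shape error.
    embed = state_dict.get("model.embed_tokens.weight")
    if embed is not None:
        current_rows = model.get_input_embeddings().weight.shape[0]
        if embed.shape[0] != current_rows:
            model.resize_token_embeddings(embed.shape[0])

    # strict=False reports mismatched keys instead of raising on them.
    result = model.load_state_dict(state_dict, strict=False)
    return model, result
```

Inspecting result.missing_keys and result.unexpected_keys after loading shows whether every tensor was matched. Loading the weights this way yields an ordinary causal LM; it does not by itself reproduce the continuous latent reasoning loop used during CoCoNut training.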
