llama32-1b-gsm-coconut-checkpoint15

Checkpoint 15 of a CoCoNut (Chain of Continuous Thought) training run on GSM8K, starting from Llama-3.2-1B.

Training

Base model: meta-llama/Llama-3.2-1B
Training method: CoCoNut (continuous latent reasoning)
Dataset: GSM8K
Checkpoint: 15

Usage

import torch

# pytorch_model.bin is a raw PyTorch checkpoint: a plain state dict
# whose keys start with the 'base_causallm.' prefix.
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")

Model Details

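This checkpoint contains the model weights saved at an intermediate point in training (checkpoint 15), not a final model. It is a plain state dict rather than a model saved with save_pretrained. The state dict keys start with the prefix base_causallm. (for example base_causallm.model.*).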
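
Because of the base_causallm. prefix, the keys will not match a plain Llama model as they are. The following is a minimal sketch of one way to strip the prefix, not the author's own loading code. The helper name strip_prefix, the choice to drop keys that lack the prefix, and the key names in the demo dict are all assumptions made for illustration; the demo uses integers in place of tensors so that it runs without downloading anything.

```python
from typing import Any, Dict

PREFIX = "base_causallm."  # prefix stated in this model card


def strip_prefix(state_dict: Dict[str, Any], prefix: str = PREFIX) -> Dict[str, Any]:
    """Return a new dict with `prefix` removed from every key that starts with it.

    Keys without the prefix are dropped. That is an assumption of this sketch:
    such keys would belong to a wrapper module rather than the base model.
    """
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Stand-in for the real checkpoint. The key names are illustrative and the
# values would be tensors in the actual file.
demo = {
    "base_causallm.model.embed_tokens.weight": 0,
    "base_causallm.model.norm.weight": 1,
    "other.key": 2,
}
stripped = strip_prefix(demo)
print(sorted(stripped))
```

With the real file, the stripped dict could be passed to load_state_dict on a model built from meta-llama/Llama-3.2-1B. Using strict=False and inspecting the returned missing and unexpected keys is a cautious way to confirm that everything lined up. If training added tokens to the vocabulary, the embedding shapes may differ from the base model and would need resizing first.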
