# Julian-600M-40B-Instruct
Julian-600M-40B-Instruct is an instruction-tuned language model fine-tuned from Julian-600M-40B.
| Parameter | Value |
|---|---|
| Base Model | Julian-600M-40B (39B tokens pretraining) |
| Parameters | 600M |
| Architecture | LLaMA-style (RoPE, SwiGLU, RMSNorm) |
| SFT Training | 5,000 steps on 185K instruction examples |
| Final Loss | 1.99 (PPL: 7.34) |
| Context Length | 2048 tokens |
| Languages | English (70%), French (30%) |
| Chat Format | ChatML |
```python
from transformers import AutoModelForCausalLM, LlamaTokenizer
import torch

model_id = "JulianKrgd/julian-600m-40b-instruct-v0.1"

# IMPORTANT: Use LlamaTokenizer, not AutoTokenizer
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Chat format (ChatML)
messages = [
    {"role": "user", "content": "What is the capital of France?"}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Prompts follow the ChatML format:

```
<|im_start|>user
What is the capital of France?<|im_end|>
<|im_start|>assistant
The capital of France is Paris.<|im_end|>
```
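For debugging it can help to see how the ChatML string is assembled before tokenization. The helper below is a hypothetical sketch (`to_chatml` is not part of the model's API) that mirrors the template shown above; in practice `tokenizer.apply_chat_template` should remain the source of truth.

```python
# Hypothetical helper mirroring the ChatML layout above, for inspection only.
def to_chatml(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([{"role": "user", "content": "What is the capital of France?"}])
print(prompt)
```

Comparing this string against the output of `apply_chat_template` is a quick sanity check that the tokenizer's chat template matches the format the model was fine-tuned on.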
| Model | HellaSwag | PIQA | LAMBADA |
|---|---|---|---|
| Julian-600M-10B (Base) | 45.8% | 67.6% | 35.0% |
| Julian-600M-40B (Base) | 53.5% | 66.8% | 37.3% |
| Julian-600M-10B-Instruct v0.1 | 42.7% | 66.2% | 34.6% |
| Julian-600M-40B-Instruct v0.1 | TBD | TBD | TBD |
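The TBD scores could be filled in with EleutherAI's lm-evaluation-harness. Assuming the standard `lm_eval` CLI and its `hellaswag`, `piqa`, and `lambada_openai` task names (the exact tasks used for the table above are not stated here), a run might look like:

```shell
pip install lm-eval

lm_eval \
  --model hf \
  --model_args pretrained=JulianKrgd/julian-600m-40b-instruct-v0.1,dtype=bfloat16 \
  --tasks hellaswag,piqa,lambada_openai \
  --batch_size 8
```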
Trained with support from the Google TPU Research Cloud (TRC) program.