tomg-group-umd's Collections: Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384
smcleish/Recurrent-Llama-3.2-train-recurrence-32
Text Generation
• 1B • Updated • 527
• 1
smcleish/Recurrent-Llama-3.2-train-recurrence-16
Text Generation
• 1B • Updated • 36
smcleish/Recurrent-Llama-3.2-train-recurrence-8
Text Generation
• 1B • Updated • 469
smcleish/Recurrent-Llama-3.2-train-recurrence-4
Text Generation
• 1B • Updated • 69
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32
Text Generation
• 0.8B • Updated • 169
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16
Text Generation
• 0.8B • Updated • 3
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8
Text Generation
• 0.8B • Updated • 5
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4
Text Generation
• 0.8B • Updated • 4
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32
Text Generation
• 1B • Updated • 215
• 2
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16
Text Generation
• 1B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8
Text Generation
• 1B • Updated • 33
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4
Text Generation
• 1B • Updated • 5
• 1
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase
Text Generation
• 0.8B • Updated • 3
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-Llama-3.2-untrained
Text Generation
• 1B • Updated • 70
smcleish/Recurrent-TinyLlama-3T-untrained
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-untrained
Text Generation
• 1B • Updated • 4
smcleish/Recurrent-Llama-3.2-2-4-2-untrained
Text Generation
• 1B • Updated • 8
• 1
smcleish/retrofitting-llama-fineweb-edu-tokenized
Viewer
• Updated • 332M • 384