tomg-group-umd's Collections: Retrofitting Recurrence
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384
smcleish/Recurrent-Llama-3.2-train-recurrence-32
Text Generation
• 1B • Updated • 571
smcleish/Recurrent-Llama-3.2-train-recurrence-16
Text Generation
• 1B • Updated • 29
smcleish/Recurrent-Llama-3.2-train-recurrence-8
Text Generation
• 1B • Updated • 413
smcleish/Recurrent-Llama-3.2-train-recurrence-4
Text Generation
• 1B • Updated • 74
smcleish/Recurrent-TinyLlama-3T-train-recurrence-32
Text Generation
• 0.8B • Updated • 196
smcleish/Recurrent-TinyLlama-3T-train-recurrence-16
Text Generation
• 0.8B • Updated • 5
smcleish/Recurrent-TinyLlama-3T-train-recurrence-8
Text Generation
• 0.8B • Updated • 5
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4
Text Generation
• 0.8B • Updated • 6
smcleish/Recurrent-OLMo-2-0425-train-recurrence-32
Text Generation
• 1B • Updated • 242
smcleish/Recurrent-OLMo-2-0425-train-recurrence-16
Text Generation
• 1B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-train-recurrence-8
Text Generation
• 1B • Updated • 33
smcleish/Recurrent-OLMo-2-0425-train-recurrence-4
Text Generation
• 1B • Updated • 6
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-single-phase
Text Generation
• 0.8B • Updated • 3
smcleish/Recurrent-TinyLlama-3T-train-recurrence-4-two-phase
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-Llama-3.2-untrained
Text Generation
• 1B • Updated • 77
smcleish/Recurrent-TinyLlama-3T-untrained
Text Generation
• 0.8B • Updated • 7
smcleish/Recurrent-OLMo-2-0425-untrained
Text Generation
• 1B • Updated • 4
smcleish/Recurrent-Llama-3.2-2-4-2-untrained
Text Generation
• 1B • Updated • 8
smcleish/retrofitting-llama-fineweb-edu-tokenized
Viewer
• Updated • 332M • 390
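
The twelve train-recurrence checkpoints and the three plain untrained checkpoints listed above follow a regular naming pattern, which may be convenient when scripting over the collection. The following is a minimal sketch that only rebuilds those repository names as strings; the constant and function names are hypothetical, and it does not cover the single-phase, two-phase, or 2-4-2 variants or the dataset entry.

```python
from itertools import product

# Values copied from the repository names in the listing above.
OWNER = "smcleish"
BASES = ["Llama-3.2", "TinyLlama-3T", "OLMo-2-0425"]
RECURRENCES = [32, 16, 8, 4]  # trailing number in each "train-recurrence-N" name


def train_recurrence_repo_ids():
    """Return the 12 '<owner>/Recurrent-<base>-train-recurrence-<N>' names."""
    return [
        f"{OWNER}/Recurrent-{base}-train-recurrence-{n}"
        for base, n in product(BASES, RECURRENCES)
    ]


def untrained_repo_ids():
    """Return the 3 '<owner>/Recurrent-<base>-untrained' names."""
    return [f"{OWNER}/Recurrent-{base}-untrained" for base in BASES]


if __name__ == "__main__":
    for repo_id in train_recurrence_repo_ids() + untrained_repo_ids():
        print(repo_id)
```

Each resulting string is a repository name as shown in the collection; how a given checkpoint should be loaded is described on its own model card rather than assumed here.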