Scaling Low-Resource MT via Synthetic Data Generation with LLMs Paper • 2505.14423 • Published May 20, 2025 • 2
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024 Paper • 2406.16777 • Published Jun 24, 2024 • 1
A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't) Paper • 2602.14696 • Published Feb 16
SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing Paper • 2512.11192 • Published Dec 12, 2025
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers Paper • 2601.04890 • Published Jan 8 • 44
Revisiting Generalization Across Difficulty Levels: It's Not So Easy Paper • 2511.21692 • Published Nov 26, 2025 • 15
Post • PatchDNA, a DNA foundation model based on Meta's BLT tokenization strategy • https://www.biorxiv.org/content/10.1101/2025.11.28.691095v1
Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures Paper • 2510.24081 • Published Oct 28, 2025 • 21
Boomerang Distillation Enables Zero-Shot Model Size Interpolation Paper • 2510.05064 • Published Oct 6, 2025 • 1