Watch Before You Answer: Learning from Visually Grounded Post-Training Paper • 2604.05117 • Published 12 days ago • 35
APEX Quants (GGUF) Collection MoE models quantized with the APEX Quantization technique (https://github.com/mudler/apex-quant) • 24 items • Updated about 18 hours ago • 50
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs • 10 days ago • 55
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published 22 days ago • 353
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate Paper • 2504.19874 • Published Apr 28, 2025 • 32
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation Paper • 2603.29029 • Published 18 days ago • 13
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 16
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels Paper • 2603.19312 • Published Mar 13 • 28
DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models Paper • 2603.23499 • Published 24 days ago • 51
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published 22 days ago • 154
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 94
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Paper • 2603.22117 • Published 25 days ago • 29
WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation Paper • 2603.15132 • Published Mar 16 • 35
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 153
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published Mar 12 • 91
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards Paper • 2603.09117 • Published Mar 10 • 10
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs Paper • 2603.09095 • Published Mar 10 • 29