euclaise

https://euclaise.xyz

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

Rta-AILabs/Nandi-Mini-150M

liked a model 13 days ago

google/gemma-4-31B

liked a model 13 days ago

google/gemma-4-E4B-it

upvoted 6 papers 28 days ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 191

Attention Residuals

Paper • 2603.15031 • Published about 1 month ago • 180

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates

Paper • 2502.06772 • Published Feb 10, 2025 • 22

upvoted 4 papers about 1 month ago

RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling

Paper • 2507.04416 • Published Jul 6, 2025 • 1

RAT+: Train Dense, Infer Sparse -- Recurrence Augmented Attention for Dilated Inference

Paper • 2602.18196 • Published Feb 20 • 1

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 58

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

Paper • 2603.10145 • Published Mar 10 • 13

upvoted 6 papers about 2 months ago

Online Vector Quantized Attention

Paper • 2602.03922 • Published Feb 3 • 1

Softmax Linear Attention: Reclaiming Global Competition

Paper • 2602.01744 • Published Feb 2 • 1

Test-Time Training with KV Binding Is Secretly Linear Attention

Paper • 2602.21204 • Published Feb 24 • 31

On the "Induction Bias" in Sequence Models

Paper • 2602.18333 • Published Feb 20 • 4

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Paper • 2602.21196 • Published Feb 24 • 7

One-step Language Modeling via Continuous Denoising

Paper • 2602.16813 • Published Feb 18 • 4

upvoted an article about 2 months ago

Differential Transformer V2

Article • Jan 20

upvoted 3 papers about 2 months ago

2Mamba2Furious: Linear in Complexity, Competitive in Accuracy

Paper • 2602.17363 • Published Feb 19 • 8

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published Feb 13 • 35

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Paper • 2602.15322 • Published Feb 17 • 10
