Brian Lin
lzhbrian
AI & ML interests
None yet
Organizations
None yet
NN Arch Components
- A Unified View of Attention and Residual Sinks: Outlier-Driven Rescaling is Essential for Transformer Training
  Paper • 2601.22966 • Published
- STEM: Scaling Transformers with Embedding Modules
  Paper • 2601.10639 • Published • 2
- Deep Delta Learning
  Paper • 2601.00417 • Published • 34
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 321
NN Arch
- Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
  Paper • 2512.24617 • Published • 66
- Recursive Language Models
  Paper • 2512.24601 • Published • 94
- Nested Learning: The Illusion of Deep Learning Architectures
  Paper • 2512.24695 • Published • 45
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
  Paper • 2512.02556 • Published • 265
models 0
None public yet
datasets 0
None public yet