- DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
  Paper • 2511.22570 • Published • 93
- DeepSeek-OCR: Contexts Optical Compression
  Paper • 2510.18234 • Published • 93
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 447
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
  Paper • 2505.09343 • Published • 76

Collections
Collections including paper arxiv:2406.11931
- Phi-4 Technical Report
  Paper • 2412.08905 • Published • 123
- Evaluating and Aligning CodeLLMs on Human Preference
  Paper • 2412.05210 • Published • 48
- Evaluating Language Models as Synthetic Data Generators
  Paper • 2412.03679 • Published • 47
- Yi-Lightning Technical Report
  Paper • 2412.01253 • Published • 28

- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
  Paper • 2310.16818 • Published • 33
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 55
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 61
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 72

- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
  Paper • 2310.16818 • Published • 33
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 55
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 61
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 72

- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 55
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 61
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 72
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 145

- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
  Paper • 2405.14333 • Published • 45
- DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
  Paper • 2408.08152 • Published • 62
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 447
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 145

- infly/OpenCoder-8B-Instruct
  Text Generation • Updated • 1.56k • 202
- infly/OpenCoder-8B-Base
  Text Generation • 8B • Updated • 2.07k • 31
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
  Paper • 2406.11931 • Published • 69
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 72