-
Attention Is All You Need
Paper • 1706.03762 • Published • 121 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 10 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Paper • 2210.04186 • Published
Collections
Collections including paper arxiv:2508.02324
-
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Paper • 2309.06497 • Published • 7 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 628 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 251
-
Qwen Image 2512
👀361Generate high‑quality images from detailed text prompts
-
Qwen Image Edit 2511
🏆383Edit images with AI based on your text instructions
-
Qwen Image Layered
🚀502Decompose images into editable layers and download them
-
Qwen Image Edit 2509
👀321Edit images based on natural language instructions
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 447 -
Qwen2.5-VL Technical Report
Paper • 2502.13923 • Published • 217 -
Qwen3 Technical Report
Paper • 2505.09388 • Published • 339 -
Qwen-Image Technical Report
Paper • 2508.02324 • Published • 274