Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2501.17195

ShowAndTell-2025-01-30

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 82
Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28, 2025 • 36
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 145

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Paper • 2504.21117 • Published Apr 29, 2025 • 26

SLM e Moe structure PHD tesis: SOTA e valutazione parametri

collezione di paper utili per redazione tesi 1-2-3- capitolo da valutare cambio di rotta e gestione PHD

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1, 2025 • 110
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2, 2025 • 51
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44
REDUCIO! Generating 1024times1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

Paper • 2412.10704 • Published Dec 14, 2024 • 16
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 53
Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35

AtlaAI/Selene-1-Mini-Llama-3.1-8B

Text Generation • 8B • Updated Jul 25, 2025 • 952 • • 103
Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
Running

8

Selene 1 Mini Tech Report

🧠

8

Selene 1 Mini: Technical Report
AtlaAI/Selene-1-Mini-Llama-3.1-8B-GPTQ-W8A8

Text Generation • 8B • Updated Jul 25, 2025 • 4 • 2

Reasoning, Thinking, RL and Test-Time Scaling

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 39
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 36
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 47

Roleplay Related

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

Paper • 2412.11919 • Published Dec 16, 2024 • 36
Runtime error

Agents

205

Flux Outpainting

👈

205

Extend your image with advanced outpainting
Running on Zero

Agents

Featured

3.29k

Kokoro TTS

❤

3.29k

Upgraded to v1.0!
fancyfeast/joytag

Image Classification • Updated Mar 9, 2024 • 821 • 114

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 60
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 64

ShowAndTell-2025-01-30

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 82
Optimizing Large Language Model Training Using FP4 Quantization

Paper • 2501.17116 • Published Jan 28, 2025 • 36
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 145

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35

Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Paper • 2504.21117 • Published Apr 29, 2025 • 26

AtlaAI/Selene-1-Mini-Llama-3.1-8B

Text Generation • 8B • Updated Jul 25, 2025 • 952 • • 103
Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35
Running

8

Selene 1 Mini Tech Report

🧠

8

Selene 1 Mini: Technical Report
AtlaAI/Selene-1-Mini-Llama-3.1-8B-GPTQ-W8A8

Text Generation • 8B • Updated Jul 25, 2025 • 4 • 2

SLM e Moe structure PHD tesis: SOTA e valutazione parametri

collezione di paper utili per redazione tesi 1-2-3- capitolo da valutare cambio di rotta e gestione PHD

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1, 2025 • 110
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2, 2025 • 51
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2, 2025 • 44
REDUCIO! Generating 1024times1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

Reasoning, Thinking, RL and Test-Time Scaling

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 39
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 36
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 47

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

Paper • 2412.10704 • Published Dec 14, 2024 • 16
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 53
Atla Selene Mini: A General Purpose Evaluation Model

Paper • 2501.17195 • Published Jan 27, 2025 • 35

Roleplay Related

RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

Paper • 2412.11919 • Published Dec 16, 2024 • 36
Runtime error

Agents

205

Flux Outpainting

👈

205

Extend your image with advanced outpainting
Running on Zero

Agents

Featured

3.29k

Kokoro TTS

❤

3.29k

Upgraded to v1.0!
fancyfeast/joytag

Image Classification • Updated Mar 9, 2024 • 821 • 114

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Paper • 2412.11605 • Published Dec 16, 2024 • 18
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval

Paper • 2412.15443 • Published Dec 19, 2024 • 10

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 60
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 64

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs