Collections

Discover the best community collections!

Collections including paper arxiv:2506.09344

DVPS Scientific Watch

Collection of external scientific material relevant to the project

about 1 hour ago

HuggingFaceFW/finetranslations

Viewer • Updated Jan 9 • 3.33B • 132k • 283
LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators

Paper • 2411.00136 • Published Oct 31, 2024
The Illusion of Readiness in Health AI

Paper • 2509.18234 • Published Sep 22, 2025 • 1
The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?

Paper • 2601.07220 • Published Jan 12

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5, 2025 • 81
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Paper • 2506.04207 • Published Jun 4, 2025 • 48
MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4, 2025 • 80
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 58

stuff i never have time to read

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 97
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks

Paper • 2402.11984 • Published Feb 19, 2024
BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling

Paper • 2503.06121 • Published Mar 8, 2025 • 5
Timer: Transformers for Time Series Analysis at Scale

Paper • 2402.02368 • Published Feb 4, 2024 • 2

Image-Video MultiModal Understanding

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Paper • 2501.01245 • Published Jan 2, 2025 • 5
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 46
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14, 2025 • 34

Finegrain Object Cutter ✂

Running on Zero • Agents • 516
Create HD cutouts from any image with just a prompt

Background Removal 🌘

Running on Zero • MCP • 2.81k
Remove image backgrounds and get transparent PNGs

Kolors Virtual Try-On 👕

Running on CPU Upgrade • Agents • 10k
Generate a virtual try-on image of a person wearing a garment
MyTimeMachine: Personalized Facial Age Transformation

Paper • 2411.14521 • Published Nov 21, 2024 • 23

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Paper • 2506.09344 • Published Jun 11, 2025 • 31

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Paper • 2506.09344 • Published Jun 11, 2025 • 31
inclusionAI/Ming-Lite-Omni

Any-to-Any • 19B • Updated Oct 27, 2025 • 31 • 198
inclusionAI/Ming-Lite-Omni-1.5

Any-to-Any • Updated Aug 29, 2025 • 248 • 85
inclusionAI/Ming-UniAudio-16B-A3B

Any-to-Any • 18B • Updated Nov 24, 2025 • 69 • 78

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10, 2025 • 44
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14, 2025 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85

Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline

Paper • 2411.12814 • Published Nov 19, 2024 • 23
SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation

Paper • 2411.14525 • Published Nov 21, 2024 • 19
MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities

Paper • 2412.04106 • Published Dec 4, 2024 • 5
PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

Paper • 2412.17780 • Published Dec 23, 2024 • 5

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 30
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 15
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

