Running on A10G Featured faster-qwen3-tts 🎙 211 Generate speech audio from text with custom or cloned voices
Running Featured Kokoro Text-to-Speech (WebGPU) 🗣 358 High-quality speech synthesis powered by Kokoro TTS
Running Featured Qwen3 TTS Demo 🚀 396 Generate spoken audio from text in many voices and languages
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 217
Persona Vectors: Monitoring and Controlling Character Traits in Language Models Paper • 2507.21509 • Published Jul 29, 2025 • 33
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17, 2025 • 59
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels Paper • 2507.21809 • Published Jul 29, 2025 • 142
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published Jul 30, 2025 • 101
DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation Paper • 2506.06251 • Published Jun 6, 2025 • 2
Running on CPU Upgrade Kolors Virtual Try-On 👕 10k Generate a virtual try‑on image of a person wearing a garment
Running on CPU Upgrade GAIA Leaderboard 🦾 599 Submit your model answers to the GAIA benchmark and view the leaderboard