Susant Achary

Susant-Achary

https://huggingface.co/Susant-Achary

SSusantAchary

AI & ML interests

Tiny to Small Language Models, Building from India. Quantization and MLX

Recent Activity

reacted to Shrijanagain's post with ➕ 27 days ago

Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training. Author: SKT AI LABS. Affiliation: SKT AI Labs / Project Surya. Model architecture: Optimized Dense Transformer. Parameters: 1.1 trillion. Training tokens: 146 trillion. Want to collaborate with us? Friends, let's start the journey: we have collected 146 trillion tokens and completed pre-training, but we need to make the model more powerful. Whitepaper: https://github.com/SHRIJANAGAIN/PROFF

reacted to Shrijanagain's post with 🔥 27 days ago

liked a model 5 months ago

mlx-community/medgemma-27b-it-8bit

Susant-Achary's collections (28)

Vision-LM

meta-llama/Llama-3.2-11B-Vision-Instruct

Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 209k • 1.58k
mlx-community/dolphin-vision-72b-4bit

Image-Text-to-Text • Updated Jul 4, 2024 • 839 • 7
mlx-community/Phi-3.5-vision-instruct-4bit

Text Generation • Updated Apr 19, 2025 • 246 • 6

<7B Best of MoE 🧠

A collection of small-size, big-impact MoE models.

LiquidAI/LFM2-8B-A1B

Text Generation • 8B • Updated 18 days ago • 64.8k • 351
ibm-granite/granite-4.0-h-tiny

Text Generation • 7B • Updated Nov 3, 2025 • 81.9k • 198
microsoft/Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 329k • 1.58k
google/gemma-3n-E4B-it

Image-Text-to-Text • Updated Jul 14, 2025 • 39.5k • 900

Audio Features

laion/clap-htsat-fused

Audio Classification • 0.2B • Updated Jan 12 • 18.1M • 77

Feature Extraction with 🧠 Text Embeddings

Models for turning text, images, audio (and combinations) into useful vectors or feature maps. Ideal for search/RAG, clustering, recommendation, and retrieval.

BAAI/bge-base-en-v1.5

Feature Extraction • 0.1B • Updated Feb 21, 2024 • 6.55M • 411
BAAI/bge-m3

Sentence Similarity • Updated Jul 3, 2024 • 15.6M • 2.91k
facebook/bart-base

Feature Extraction • 0.1B • Updated Nov 16, 2022 • 503k • 204
sentence-transformers/all-MiniLM-L12-v2

Sentence Similarity • 33.4M • Updated 16 days ago • 3.14M • 302
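
To illustrate how embedding vectors like the ones these models produce are typically used for search or retrieval, here is a minimal sketch that ranks documents by cosine similarity to a query. The three-dimensional vectors, document names, and the helper names `cosine` and `rank` are all made up for illustration; real embeddings from the models above have hundreds of dimensions and would come from the model itself.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy embeddings; a real pipeline would compute these with an embedding model.
DOCS = {
    "mlx quantization notes": [0.9, 0.1, 0.0],
    "tts voice cloning": [0.0, 0.2, 0.9],
    "4-bit model card": [0.8, 0.3, 0.1],
}
QUERY = [1.0, 0.2, 0.0]

ranking = rank(QUERY, DOCS)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

The same ranking step sits at the core of a RAG retriever; only the source of the vectors changes.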

🪶 Sept ’25 <Text Generation Language Models> (Top Releases)

Coding models and pipelines released this month that boost repo-level reasoning, GUI automation, and tool use. Focused on practical editing.

deepseek-ai/DeepSeek-V3.1

Text Generation • Updated Sep 5, 2025 • 151k • 819
deepseek-ai/DeepSeek-V3.2-Exp

Text Generation • Updated Nov 18, 2025 • 191k • 983
mistralai/Magistral-Small-2509

24B • Updated Feb 23 • 17.6k • 300
openbmb/MiniCPM4.1-8B

Text Generation • Updated Oct 24, 2025 • 24.3k • 387

🖼️ Text2Image, i2i: September ’25 (Top Releases)

Cutting-edge image generation & VLM updates from September ’25. This collection spotlights models that improved text rendering, layout control & more.

tencent/HunyuanImage-3.0

Text-to-Image • Updated Jan 28 • 18.9k • 663
Qwen/Qwen-Image-Edit-2509

Image-to-Image • Updated Sep 22, 2025 • 243k • 1.1k
Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 322k • 388

📄➡️🔊 Text-to-Speech (TTS)

Speech synthesis models that turn text into natural audio. Includes multilingual TTS, low-latency real-time models, and voice-cloning variants.

coqui/XTTS-v2

Text-to-Speech • Updated Dec 11, 2023 • 6.49M • 3.48k
hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 10.1M • 6.01k
ResembleAI/chatterbox

Text-to-Speech • Updated Sep 23, 2025 • 1.57M • 1.55k
SWivid/F5-TTS

Text-to-Speech • Updated Mar 21, 2025 • 669k • 1.16k

📚➡️🎨Text-to-Image

State-of-the-art diffusion and generative models that turn text prompts into detailed images. Includes lightweight CPU-friendly and photorealistic models.

stable-diffusion-v1-5/stable-diffusion-v1-5

Text-to-Image • Updated Sep 7, 2024 • 1.55M • 1.08k
stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 1.96M • 7.62k
stabilityai/sd-turbo

Text-to-Image • Updated Jul 10, 2024 • 657k • 447
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27, 2025 • 688k • 12.7k

🎨➡️✍️ Image-to-Text

OCR, captioning, and visual QA models that turn pure images into descriptive or structured text.

Salesforce/blip-image-captioning-base

Image-to-Text • Updated Feb 3, 2025 • 2.22M • 847
Salesforce/blip-image-captioning-large

Image-to-Text • 0.5B • Updated Feb 3, 2025 • 1.41M • 1.47k
nlpconnect/vit-gpt2-image-captioning

Image-to-Text • Updated Feb 27, 2023 • 214k • 927
microsoft/trocr-base-handwritten

Image-to-Text • 0.3B • Updated Feb 11, 2025 • 153k • 490

🌀 Any-to-Any Multimodal Models

Models that can flexibly convert across modalities (text, image, audio, video). Ideal for researchers exploring unified multimodal AI.

Qwen/Qwen2.5-Omni-3B

Any-to-Any • Updated Apr 30, 2025 • 460k • 332
Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated Apr 30, 2025 • 474k • 1.89k
deepseek-ai/Janus-Pro-1B

Any-to-Any • Updated Feb 1, 2025 • 15.3k • 474
openbmb/MiniCPM-o-2_6

Any-to-Any • 9B • Updated Oct 5, 2025 • 116k • 1.29k

👨‍💻Mathematical Reasoning 🧮

Datasets tackling AI's toughest challenges.

nvidia/OpenMathInstruct-2

Viewer • Updated Nov 25, 2024 • 22M • 22.6k • 236
AI-MO/NuminaMath-CoT

Viewer • Updated Nov 25, 2024 • 860k • 37.1k • 562
meta-math/MetaMathQA

Viewer • Updated Dec 21, 2023 • 395k • 37.4k • 453

🧩 Long-Context Models (≥128k) CODING

10 CODING models that support ≥128k context (native or via officially documented scaling)

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.55M • 5.7k
google/gemma-3-4b-it

Image-Text-to-Text • Updated Mar 21, 2025 • 1.76M • 1.31k
Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.83M • 1.01k
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

Text Generation • 16B • Updated Jul 3, 2024 • 438k • 581
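
A ≥128k context window has a memory cost beyond the weights, because the key/value cache grows linearly with the number of tokens. The sketch below estimates that cache size. The function name `kv_cache_bytes` is hypothetical, and the shape values (32 layers, 8 key/value heads, head dimension 128, 16-bit values) are assumptions typical of an 8B Llama-style model with grouped-query attention; check the model's own config before relying on them.

```python
def kv_cache_bytes(tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV-cache size: one key and one value vector per layer, per KV head, per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * tokens

# Assumed shape values for illustration only (not taken from this page).
CONTEXT_TOKENS = 128 * 1024
cache_gib = kv_cache_bytes(CONTEXT_TOKENS, n_layers=32, n_kv_heads=8, head_dim=128) / 2**30

print(f"KV cache at {CONTEXT_TOKENS} tokens: {cache_gib:.1f} GiB")
```

Under these assumptions a full 128k-token context needs roughly as much memory for the cache as the 16-bit weights of an 8B model, which is why quantized caches and shorter working contexts are common on consumer hardware.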

🧩 Long-Context Models (≥128k) under 8B

microsoft/Phi-3-mini-128k-instruct

Text Generation • Updated Dec 10, 2025 • 251k • 1.7k
microsoft/Phi-3-vision-128k-instruct

Text Generation • Updated Dec 10, 2025 • 96.7k • 970
Menlo/Jan-nano-128k-gguf

Text Generation • 4B • Updated Jul 1, 2025 • 9.17k • 71
unsloth/SmolLM3-3B-128K-GGUF

3B • Updated Jul 8, 2025 • 3.14k • 41

Qwen3

Best of Qwen3 Series of Models

Qwen/Qwen3-30B-A3B-Instruct-2507

Text Generation • Updated Sep 17, 2025 • 1.04M • 799
Qwen/Qwen3-Next-80B-A3B-Thinking

Text Generation • Updated Sep 15, 2025 • 34.3k • 487
Qwen/Qwen3-Coder-30B-A3B-Instruct

Text Generation • 31B • Updated Dec 3, 2025 • 1.83M • 1.01k
Qwen/Qwen3-Omni-30B-A3B-Instruct

Any-to-Any • 35B • Updated Sep 22, 2025 • 369k • 907

🛩️Qwen3-VL

The most powerful vision-language model in the Qwen series to date. Available in Dense and MoE architectures.

Qwen/Qwen3-VL-30B-A3B-Thinking

Image-Text-to-Text • 31B • Updated Nov 26, 2025 • 88k • 197
mlx-community/Qwen3-VL-30B-A3B-Instruct-4bit

Image-Text-to-Text • Updated Oct 11, 2025 • 355 • 7
mlx-community/Qwen3-VL-30B-A3B-Instruct-8bit

Image-Text-to-Text • Updated Oct 11, 2025 • 100 • 3
mlx-community/Qwen3-VL-8B-Instruct-4bit

Image-Text-to-Text • Updated Oct 14, 2025 • 1.97k • 5

🍎 MLX-Quantized Models (3/4/5/6-bit) Mac & iOS

Curated MLX-ready quantized LLMs that run fast on Apple Silicon (and some on iOS). Every card lists Bits · Group size · Peak UM (GB) · Stable context.

mlx-community/Apriel-1.5-15b-Thinker-3bit-MLX

Image-Text-to-Text • Updated Oct 3, 2025 • 7
mlx-community/Apriel-1.5-15b-Thinker-6bit-MLX

Image-Text-to-Text • Updated Oct 3, 2025 • 39 • 1
mlx-community/granite-4.0-h-tiny-3bit-MLX

Text Generation • 0.9B • Updated Oct 3, 2025 • 182 • 2
mlx-community/granite-4.0-tiny-preview-4bit

Text Generation • Updated Sep 10, 2025 • 10
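
As a rough guide to how the bit width and group size listed on these cards translate into memory, the sketch below estimates the size of the quantized weights alone. It assumes group-wise affine quantization in which every group of weights also stores a 16-bit scale and a 16-bit bias (32 extra bits per group); that overhead figure and the function names are assumptions for illustration, and actual peak memory will be higher because of the KV cache, activations, and any layers left unquantized.

```python
def effective_bits(bits, group_size, overhead_bits_per_group=32):
    """Average bits per weight once the per-group scale and bias are included."""
    return bits + overhead_bits_per_group / group_size

def weight_gb(n_params, bits, group_size):
    """Estimated size of the quantized weights in decimal gigabytes."""
    return n_params * effective_bits(bits, group_size) / 8 / 1e9

# Example: a 15-billion-parameter model at several bit widths, group size 64.
for bits in (3, 4, 5, 6):
    print(f"{bits}-bit: {effective_bits(bits, 64):.1f} bits/weight, "
          f"about {weight_gb(15e9, bits, 64):.2f} GB of weights")
```

Smaller group sizes improve accuracy at the cost of more scale and bias overhead, so the same nominal bit width can give different file sizes.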

🖼️ Vision Backbones & Image Embeddings

facebook/dinov2-base

Image Feature Extraction • 86.6M • Updated Jan 17, 2024 • 1.52M • 176
openai/clip-vit-large-patch14-336

Zero-Shot Image Classification • Updated Oct 4, 2022 • 16.7M • 301
google/siglip-so400m-patch14-384

Zero-Shot Image Classification • 0.9B • Updated Sep 26, 2024 • 2.14M • 669
BAAI/EVA-CLIP-8B

Feature Extraction • Updated Feb 7, 2024 • 1.26k • 50

🧊Sept 25 <Image-to-3D> [Top Releases]

Models that turn a single image (or image + prompt) into 3D assets (meshes, Gaussians, or point clouds) suited for AR/VR, product turntables, and game props.

tencent/Hunyuan3D-Omni

Image-to-3D • Updated Oct 17, 2025 • 1.22k • 164
tencent/Hunyuan3D-Part

Updated Oct 17, 2025 • 1.41k • 180
facebook/VGGT-1B-Commercial

Image-to-3D • Updated Sep 17, 2025 • 1.6k • 56
Stable-X/vggt-object-v0-1

Image-to-3D • Updated Sep 9, 2025 • 3.56k • 8

🎬 ✍️ Sept 25 <Video & Text2Video> (Top Releases)

Open T2V and animation models emphasizing temporal coherence, controllability, and real-time playback. A great starting point for creative tools and ads.

kandinskylab/Kandinsky-5.0-T2V-Lite-pretrain-5s

Updated Nov 20, 2025 • 15 • 10
Efficient-Large-Model/LongLive-1.3B

Updated Sep 29, 2025 • 42
Wan-AI/Wan2.2-Animate-14B

Video-to-Video • Updated Nov 5, 2025 • 23.6k • 1.13k
bytedance-research/HuMo

Image-to-Video • Updated Sep 18, 2025 • 85 • 216

Top Apache 2.0 License

Free and open source, provided you keep the attribution and do not claim the model as your own.

openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.82M • 5.58k
facebook/wav2vec2-base-960h

Automatic Speech Recognition • 94.4M • Updated Nov 14, 2022 • 1.24M • 395
openai/whisper-small

Automatic Speech Recognition • Updated Feb 29, 2024 • 1.97M • 548
openai/whisper-tiny

Automatic Speech Recognition • Updated Feb 29, 2024 • 763k • 425

✍️➡️🎬 Text-to-Video

Models that create short videos from written prompts. Perfect for experimentation in generative video and creative storytelling.

Wan-AI/Wan2.2-I2V-A14B-Diffusers

Image-to-Video • Updated Aug 9, 2025 • 95.7k • 221
Wan-AI/Wan2.1-T2V-1.3B-Diffusers

Text-to-Video • Updated Apr 4, 2025 • 93.2k • 116
ByteDance/AnimateDiff-Lightning

Text-to-Video • Updated Jan 6, 2025 • 10k • 977
genmo/mochi-1-preview

Text-to-Video • Updated Sep 4, 2025 • 6.27k • 1.32k

🖌️ Image-to-Image

Image editing and transformation models, from style transfer to super-resolution, inpainting, and diffusion-based edits.

stabilityai/stable-diffusion-xl-refiner-1.0

Image-to-Image • Updated Sep 25, 2023 • 245k • 2.03k
black-forest-labs/FLUX.1-Kontext-dev

Image-to-Image • Updated Jan 1 • 54.6k • 2.59k
Qwen/Qwen-Image-Edit

Image-to-Image • Updated Aug 25, 2025 • 76.2k • 2.37k
lllyasviel/sd-controlnet-canny

Image-to-Image • Updated May 1, 2023 • 24.4k • 244

🖼️➡️📚 Image-Text-to-Text

Multimodal models that take image + text as input and produce natural language output. Use cases: chart QA, visual document reasoning, VQA.

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 4.52M • 1.49k
Qwen/Qwen2.5-VL-3B-Instruct

Image-Text-to-Text • 4B • Updated Apr 6, 2025 • 6.31M • 634
google/gemma-3-4b-it

Image-Text-to-Text • Updated Mar 21, 2025 • 1.76M • 1.31k
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1

Image-Text-to-Text • Updated Dec 4, 2025 • 998k • 177

✍️ Text Generation

A collection of top open LLMs for writing, summarization, chat, reasoning, and document drafting. Includes small SLMs for devices and large models.

openai-community/gpt2

Text Generation • 0.1B • Updated Feb 19, 2024 • 13.8M • 3.21k
facebook/opt-125m

Text Generation • Updated Sep 15, 2023 • 6.64M • 243
Qwen/Qwen2.5-3B-Instruct

Text Generation • 3B • Updated Sep 25, 2024 • 10.1M • 441
meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 9.55M • 5.7k

🧠General Purpose Dataset < 10M samples

Datasets for 🌐 chat, ⚡ code, and 🧮 reasoning.

BAAI/Infinity-Instruct

Viewer • Updated Dec 4, 2025 • 21.9M • 4.24k • 711
chargoddard/WebInstructSub-prometheus

Viewer • Updated May 15, 2024 • 2.39M • 96 • 25
arcee-ai/The-Tome

Viewer • Updated Aug 15, 2024 • 1.75M • 269 • 105

🍎 MLX-Ready LLMs

Models with MLX weights, proven for MLX inference.

mlx-community/gpt-oss-20b-MXFP4-Q8

Text Generation • 21B • Updated 29 days ago • 596k • 50
lmstudio-community/Seed-OSS-36B-Instruct-MLX-4bit

Text Generation • 36B • Updated Aug 26, 2025 • 44.7k
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit

Text Generation • 0.6B • Updated Aug 6, 2025 • 73.6k • 11
mlx-community/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated May 10, 2025 • 605k • 40

📱 On-Device-Ready SLMs (≤4B)

Tiny, fast models that run on iPhone/iPad or Mac with very low memory. Great for quick replies, offline note-assist, and routing.

lmstudio-community/Qwen3-4B-Thinking-2507-MLX-8bit

Text Generation • 1B • Updated Aug 6, 2025 • 72.1k • 7
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-4bit

Text Generation • 1B • Updated May 29, 2025 • 346k • 8
lmstudio-community/gemma-3n-E4B-it-MLX-4bit

Image-Text-to-Text • Updated Jul 21, 2025 • 338k • 2
mlx-community/gemma-3-4b-it-qat-4bit

Image-Text-to-Text • Updated Apr 21, 2025 • 1.04M • 7

GPT2-JungleBook-from-Scratch-Models

The primary objective of this project is to explore and analyze the impact of model size on text generation quality, using the GPT-2 architecture trained from scratch.

Susant-Achary/gpt2-jungle-book-100M

Text Generation • 0.3B • Updated Jan 25, 2025
Susant-Achary/gpt2-jungle-book-59M

Text Generation • 0.2B • Updated Jan 25, 2025 • 1
Susant-Achary/gpt2-jungle-book-37M

Text Generation • 0.1B • Updated Jan 25, 2025
Susant-Achary/gpt2-jungle-book-22M

Text Generation • 81.5M • Updated Jan 25, 2025 • 1
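
For context on how model size varies within the GPT-2 architecture, the sketch below counts parameters from the layer count and embedding width. The function name `gpt2_param_count` is hypothetical, and the configurations of the jungle-book models themselves are not given on this page; the example uses the standard GPT-2 small settings (12 layers, width 768, vocabulary 50257, 1024 positions) and assumes the output head shares weights with the token embedding.

```python
def gpt2_param_count(n_layer, n_embd, vocab_size=50257, n_positions=1024):
    """Parameter count of a GPT-2 style decoder with a tied output head."""
    d = n_embd
    # Per block: attention (3d*d + 3d, then d*d + d), MLP (4d*d + 4d, then 4d*d + d),
    # and two layer norms (2 * 2d), which sums to 12*d^2 + 13*d.
    per_block = 12 * d * d + 13 * d
    embeddings = vocab_size * d + n_positions * d  # token + position embeddings
    final_layer_norm = 2 * d
    return n_layer * per_block + embeddings + final_layer_norm

small = gpt2_param_count(n_layer=12, n_embd=768)
print(f"GPT-2 small: {small:,} parameters")

# Shrinking width and depth reduces the block parameters quadratically in width,
# while the embedding table shrinks only linearly, so it dominates tiny models.
tiny = gpt2_param_count(n_layer=4, n_embd=256)
print(f"4 layers x 256 wide: {tiny:,} parameters")
```

With a full 50257-token vocabulary, the embedding table alone accounts for most of the parameters in the smallest configurations, which is worth keeping in mind when comparing generation quality across sizes.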