Building on HF

Victor Mustar PRO

victor

victormustar

AI & ML interests

Building the UX of this website

Recent Activity

reacted to Juanxi's post with 🔥 about 9 hours ago

📢 Awesome Multimodal Modeling We introduce Awesome Multimodal Modeling, a curated repository tracing the architectural evolution of multimodal intelligence—from foundational fusion to native omni-models. 🔹 Taxonomy & Evolution: Traditional Multimodal Learning – Foundational work on representation, fusion, and alignment. Multimodal LLMs (MLLMs) – Architectures connecting vision encoders to LLMs for understanding. Unified Multimodal Models (UMMs) – Models unifying Understanding + Generation via Diffusion, Autoregressive, or Hybrid paradigms. Native Multimodal Models (NMMs) – Models trained from scratch on all modalities; contrasts early vs. late fusion under scaling laws. 💡 Key Distinction: UMMs unify tasks via generation heads; NMMs enforce interleaving through joint pre-training. 🔗 Explore & Contribute: https://github.com/OpenEnvision-Lab/Awesome-Multimodal-Modeling

liked a model about 12 hours ago

MiniMaxAI/MiniMax-M2.7

liked a Space 1 day ago

manasha2006/FoodCrisisEnv

View all activity

Organizations

reactedto Juanxi's post with 🔥 about 9 hours ago

Post

2015

📢 Awesome Multimodal Modeling

We introduce Awesome Multimodal Modeling, a curated repository tracing the architectural evolution of multimodal intelligence—from foundational fusion to native omni-models.

🔹 Taxonomy & Evolution:

Traditional Multimodal Learning – Foundational work on representation, fusion, and alignment.
Multimodal LLMs (MLLMs) – Architectures connecting vision encoders to LLMs for understanding.
Unified Multimodal Models (UMMs) – Models unifying Understanding + Generation via Diffusion, Autoregressive, or Hybrid paradigms.
Native Multimodal Models (NMMs) – Models trained from scratch on all modalities; contrasts early vs. late fusion under scaling laws.
💡 Key Distinction:
UMMs unify tasks via generation heads; NMMs enforce interleaving through joint pre-training.

🔗 Explore & Contribute: https://github.com/OpenEnvision-Lab/Awesome-Multimodal-Modeling

2 replies

reactedto qgallouedec's post with 🔥 11 days ago

Post

2220

TRL v1.0 is out!

Hugging Face's TRL library is downloaded 3 million times a month. Over 130k models trained with it are public on the Hub, and major projects like @unsloth and @axolotl-ai-co build directly on top of it. v1.0 is the moment we acknowledged that responsibility explicitly, with a real stability contract.

The field hasn't settled. Building stable software in a domain that keeps invalidating its own assumptions is the actual problem we're solving. The answer is a design that can absorb the next shift without breaking what people rely on.

What's in v1.0:
Deep Hugging Face integration, low infrastructure burden
What's next: asynchronous GRPO, better scaling support, and making training legible enough that agents can inspect and steer it.

pip install --upgrade trl

Read more: hf.co/blog/trl-v1

reactedto MikeDoes's post with 👀 11 days ago

Post

607

Ai4Privacy has been working on this for the past year. 🙏

Today we're releasing the PII Masking 2M Series, the world's largest open source privacy masking dataset. (Again. 🚀🚀)

🔢 2M+ synthetic examples
🌍 32 locales across Europe
🏷️ 98 entity types
🏥💬🏦💼📍 5 industry verticals: Health, Finance, Digital, Work, Location
✅ 1M+ entries freely available on Hugging Face

Every example is 100% synthetic. No real personal data. Built so you can train and evaluate PII detection models without the legal headaches. 🔒

Thank you for 15,000,000+ downloads across our datasets, models, and libraries. This one's for you. ❤️

hashtag#privacy hashtag#ai hashtag#opensource hashtag#nlp hashtag#gdpr hashtag#pii hashtag#huggingface hashtag#machinelearning

reactedto unmodeled-tyler's post with 🚀 16 days ago

Post

1968

Hey Hugging Face!

PRODUCT HUNT LINK: https://www.producthunt.com/products/quanta-intellect?utm_source=other&utm_medium=social

I've been sharing my new AI browser Vessel the last few days and I've gotten some great feedback/interest from a lot of you!

I'm excited to announce that Vessel Browser is now live on Product Hunt! If this is the first you've heard of it, check it out! Vessel is an open source AI browser built specifically for agents on Linux. It's not a fork of an existing browser, and it doesn't assume that the human is the primary operator.

If you've already tried Vessel Browser, feel free to leave a review on Product Hunt of what you thought - or if you'd prefer, send me an email directly or reach out on twitter if you have any questions about it. I'm perpetually online & happy to chat 😀

I'm committed to building the best open source AI browser out there, and Vessel is only going to improve as time goes on!

reactedto prabhatkr's post with 👀 16 days ago

Post

1290

🚀 Is Vector RAG Dead? Why We Built FastMemory to Beat PageIndex
If you've built a RAG pipeline for complex financial documents, you already know the painful truth: Standard vector search fails when things get complicated.

While tools like PageIndex and Mafin 2.5 provide great out-of-the-box PDF chat experiences, they hit structural bottlenecks the second you push them past basic queries.

We just published a comprehensive benchmark study comparing FastMemory against PageIndex across 5 advanced datasets. The results fundamentally change how we should think about document ingestion.

Read more: https://x.com/FastBuilderAI/status/2037404008978018493

2 replies

reactedto branikita's post with 🔥 16 days ago

Post

3065

We have received the majority of components for our first small commercial batch of SO-ARM101 robotic arms. Feetech STS3250 servo drives, parallel gripper, and depth camera.

1 reply

reactedto prometechinc's post with 👀 16 days ago

Post

2290

Cicikuş v4-5B (POFUDUK Edition) is a next-generation compact language model engineered for high-efficiency reasoning, adaptive intelligence, and behavioral coherence. Built on the Gemma 4B IT foundation and enhanced through advanced LoRA optimization and selective layer reconstruction, this model delivers powerful performance without the overhead of massive parameter counts.
🔗 Explore the model: pthinc/pofuduk_cicikus_v4_5B
🧠 Why Cicikuş?
In a world dominated by massive LLMs, Cicikuş takes a different path:
⚡ Fast & Efficient — Designed for edge deployment and low-resource environments
🎯 High Reasoning Accuracy — Strong results across MMLU, GSM8K, HumanEval, and more
🧩 Behavior-Aware Intelligence — Powered by the Behavioral Consciousness Engine (BCE)
🔍 Low Hallucination Rate — ~3% with built-in ethical filtering
🌍 Multilingual Capable — Optimized for English and Turkish

reactedto prithivMLmods's post with 🔥 16 days ago

Post

5284

Flux-Klein-KV-Edit-Consistency demo is now available on Spaces. It preserves character identity and delivers high-quality, realistic results after edits. No need for any special prompts, just upload the image, type your prompt, and get the resulting image blazing fast.

🔥 Demo Space: prithivMLmods/flux-klein-kv-edit-consistency
🤗 Model: black-forest-labs/FLUX.2-klein-9b-kv
🤗 Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
🔗 Gradio Server Mode: https://www.gradio.app/main/guides/server-mode

➔ Built with Headless Gradio, an alternative to using gr.Blocks for creating the frontend and triggering events, powered by FastAPI + Gradio. You can now design the frontend however you want, with continued support for APIs, MCP, and ZeroGPU.

➔ Gradio Server Mode is now available from gradio@v6.10.0.

To learn more, visit the app page or the respective model pages.

reactedto mayafree's post with 🚀 about 1 month ago

Post

4391

I built a Space that lets you switch between all three Qwen3.5 official collection models in a single interface.

MAYA-AI/QWEN-3_5-CHAT

The architecture is the key part. Instead of using Gradio as the UI, I use it purely as an API engine. FastAPI serves a fully custom HTML/JS frontend that calls /gradio_api/call/chat via SSE streaming. No DOM conflicts, no layout constraints.

Four main features: instant model switching with automatic spec adjustment (max tokens, temperature ceiling, Vision availability all update per model), Thinking Mode via /think prefix with collapsible reasoning chain, Vision image upload via base64 conversion, and HF OAuth implemented directly at the FastAPI level.

For model selection: 122B-A10B with Thinking Mode for math, logic, and agents. 27B for writing, translation, and instruction following. 35B-A3B for fast everyday questions.

A few surprises during development — Gradio 6.x removed several parameters quietly, base64 image strings broke gr.Image(type="pil") so I switched to gr.Textbox with backend PIL conversion, and Thinking Mode parsing needed a full rewrite with indexOf instead of regex.

Thanks to the Qwen team for making this possible. Try it out and let me know what you think.

#Qwen3 #Qwen35 #OpenSourceAI #HuggingFace #LLM #ThinkingAI #vidraft #MultimodalAI

reactedto SeaWolf-AI's post with 🔥 about 1 month ago

Post

5011

ALL Bench — Global AI Model Unified Leaderboard

FINAL-Bench/all-bench-leaderboard

If you've ever tried to compare GPT-5.2 and Claude Opus 4.6 side by side, you've probably hit the same wall: the official Hugging Face leaderboard only tracks open-source models, so the most widely used AI systems simply aren't there. ALL Bench fixes that by bringing closed-source models, open-weight models, and — uniquely — all four teams under South Korea's national sovereign AI program into a single leaderboard. Thirty-one frontier models, one consistent scoring scale.
Scoring works differently here too. Most leaderboards skip benchmarks a model hasn't submitted, which lets models game their ranking by withholding results. ALL Bench treats every missing entry as zero and divides by ten, so there's no advantage in hiding your weak spots.
The ten core benchmarks span reasoning (GPQA Diamond, AIME 2025, HLE, ARC-AGI-2), coding (SWE-bench Verified, LiveCodeBench), and instruction-following (IFEval, BFCL). The standout is FINAL Bench — the world's only benchmark measuring whether a model can catch and correct its own mistakes. It reached rank five in global dataset popularity on Hugging Face in February 2026 and has been covered by Seoul Shinmun, Asia Economy, IT Chosun, and Behind.
Nine interactive charts let you explore everything from composite score rankings and a full heatmap to an open-vs-closed scatter plot. Operational metrics like context window, output speed, and pricing are included alongside benchmark scores.
All data is sourced from Artificial Analysis Intelligence Index v4.0, arXiv technical reports, Chatbot Arena ELO ratings, and the Korean Ministry of Science and ICT's official evaluation results. Updates monthly.

reactedto danielhanchen's post with 🔥🤗 about 1 month ago

Post

5301

Qwen releases 4 new Qwen3.5 Small models: 0.8B • 2B • 4B • 9B!

Run Qwen3.5-0.8B, 2B and 4B on your phone. Run 9B on 6GB RAM.

The vision reasoning LLMs perform better than models 4x their size.

GGUFs to run: https://huggingface.co/collections/unsloth/qwen35

Guide: https://unsloth.ai/docs/models/qwen3.5

5 replies

reactedto darkc0de's post with 🔥 about 2 months ago

Post

9640

1440GB of VRAM is incredibly satisfying 😁

17 replies

reactedto mitkox's post with 🔥 about 2 months ago

Post

5484

My USB charger has a Blackwell GPU and 128GB RAM.
What. A. Time. To. Be. Alive.
People in Sofia: “It’s freezing.”
Me: sitting next to 3kW of space AI heaters on my desk 👀
1x GLM-5, 2x MiniMax-M2.5, 1x Qwen3 Coder Next; all on single Aibrix/K8s cluster

6 replies

reactedto MonsterMMORPG's post with 👀 about 2 months ago

Post

2243

SECourses Ultimate Video and Image Upscaler Pro is now V2.1 and massive improvements has arrived

Check all below screenshots to see all amazing features

20 Feburary 2026 Update V2.1
This is a pretty big update

We have 100% changed the FlashVSR+ backend to a new repo and I have significantly upgraded this repo

The new FlashVSR+ works amazing and I think it is better than SeedVR2 for high res videos upscale like upscaling 720p into higher resolution

Top menu navigation bar updated into a better version and view

FlashVSR+ tab remade and all the features are now working

For lower VRAM a button is added which you can use if you get OOM

Read the updated UI to understand how to use

FlashVSR+ now can upscale images very well as well

Image Based GAN upscalers tab also improved and some bugs fixed

Output & Comparison tab Video Output was not working properly and this issue fix fixed

In Output & Comparison tab, new multi video and multi image comparison sliders added which is super useful to quickly compare multiple videos and images

Lots of various bug fixes made

App is getting closer to be perfect please heavily test it and let me know errors and what features you request

This update was mostly about improving the FlashVSR+ since it is a very fast and amazing video upscaler model

Image Based - Gan upscale now can upscale videos perfectly fine and Batch Size (Frames per Iteration) is now working to speed up upscaling videos

For updating, get the latest zip file, extract and overwrite all files and run Windows_Run_SECourses_Upscaler_Pro.bat file

1 reply

reactedto Ujjwal-Tyagi's post with 🔥 about 2 months ago

Post

3030

GLM 5 is insane, it ranks #4 Globally!

4 replies

reactedto krisbailey's post with 👀 about 2 months ago

Post

483

While doing various projects I kept running into situations where I wanted to be able to have representative samples of some of the current large SOTA datasets that were smaller so I didn't need to worry about slicing or anything else at runtime. So, I created sub datasets making sure to keep the same ratios of data sources. Each dataset card provides info for what's in it.
100M token datasets:
RedPajama v2 100M
Falcon RefinedWeb 100M
Cosmopedia 100M

1B token datasets:
Fineweb-edu 1B
RedPajama v1 1B
RedPajama v2 1B (use this one)
Cosmopedia 1B

10B token datasets:
RedPajama v1 10B
Cosmopedia 10B

Collection here:
https://huggingface.co/collections/krisbailey/bite-size-data

reactedto danielhanchen's post with 🔥 about 2 months ago

Post

8541

You can now run MiniMax-2.5 locally! 🚀
At 230B parameters, MiniMax-2.5 is the strongest LLM under 700B params, delivering SOTA agentic coding & chat.

Run Dynamic 3/4-bit on a 128GB Mac for 20 tokens/s.
Guide: https://unsloth.ai/docs/models/minimax-2.5
GGUF: unsloth/MiniMax-M2.5-GGUF

1 reply

reactedto kostakoff's post with 🚀 about 2 months ago

Post

3345

My home lab for AI models - llmlaba v1

After I began learning MLOps I realized that I needed some kind of home lab, there are a lot of GPUs that I need to learn how to set up and test.
So I spent some time to do a researching which platform I could buy or build.
My requirements ware:
- Limited budget
- Power supply 1 kW or higher
- Few PCIe slots to be able to install more than one gpu
- Zero maintenance cost, I don't want spend a lot of time or money to maintain lab hardware, except for the GPUs

I chose the Intel Mac Pro 7.1:
- Prices on eBay acceptable
- Excelent cooling
- 1.4 kW power supply
- 7 PCIe slots
- Zero maintenance: I don't need to do anything with the Mac Pro hardware; it just works
- Classic UEFI boot loader

It requires a bit of OS preparation:
1. Install Ubuntu 24.04 (it works with the general PC ISO image)
2. Set up T2 drivers

sudo apt install -y dkms linux-headers-$(uname -r) applesmc-t2 apple-bce lm-sensors

3. Install t2fanrd to manually manage fans (/etc/t2fand.conf) https://wiki.t2linux.org/guides/fan/
4. Fix PCIe BAR: add pci=realloc to GRUB_CMDLINE_LINUX_DEFAULT so the Linux kernel will properly initializes server GPUs without Graphics Output Protocol
5. Install NVIDIA GPU driver:

sudo apt install nvidia-driver-570

And it works!
I was able to run server-grade Nvidia Tesla P100 (required DIY air duct), and consumer Nvidia Titan X, Titan V, GTX 1080 cards on the old Mac Pro 7.1 - even three in parallel.

llmlaba

3 replies

reactedto AdinaY's post with 🔥 about 2 months ago

Post

4084

Game on 🎮🚀

While Seedance 2.0’s videos are all over the timeline, DeepSeek quietly pushed a new model update in its app.

GLM-5 from Z.ai adds more momentum.

Ming-flash-omni from Ant Group , MiniCPM-SALA from OpenBMB
, and the upcoming MiniMax M2.5 keep the heat on 🔥

Spring Festival is around the corner,
no one’s sleeping!

✨ More releases coming, stay tuned
https://huggingface.co/collections/zh-ai-community/2026-february-china-open-source-highlights

Victor Mustar PRO

AI & ML interests

Recent Activity

Organizations

victor's activity