AI & ML interests

🚀 Creators of Spaces with the most cumulative all-time likes at the end of each month (users only, no orgs)

fffiloni 
posted an update 10 days ago
✨ PASD Magnify is back on Hugging Face Spaces

fffiloni/PASD

PASD isn’t recent, but still delivers strong results — worth restoring rather than replacing.

Getting it to run again wasn’t a simple dependency issue.
It relied on parts of diffusers that no longer exist, and moving to Gradio 6 forced a much newer HF stack. On top of that, I couldn’t modify the original source directly.

Recreating the old environment wasn’t practical.
So I patched the downloaded code at runtime before import and made it compatible with today’s stack.

That ended up being the only approach that held without forking or freezing everything to outdated versions.
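The approach, roughly: rewrite the incompatible lines of the downloaded source on disk, then import the patched copy. A minimal sketch of the idea, with an illustrative patch table (these are not the actual PASD patches):

```python
import importlib.util
import pathlib
import tempfile

def patch_and_import(src_path, module_name, patches):
    """Apply text-level patches to a source file, then import the patched copy.

    `patches` maps outdated snippets to their modern replacements, e.g. an
    import whose location moved between diffusers versions.
    """
    code = pathlib.Path(src_path).read_text()
    for old, new in patches.items():
        code = code.replace(old, new)
    # Write the patched source to a scratch dir so the original stays untouched
    patched_path = pathlib.Path(tempfile.mkdtemp()) / f"{module_name}.py"
    patched_path.write_text(code)
    spec = importlib.util.spec_from_file_location(module_name, patched_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

The same trick generalizes to renamed functions or changed signatures, as long as the fix can be expressed as a textual rewrite applied before the module is first imported.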

If you’ve used it before (or are curious), feel free to give it another try.
fffiloni 
posted an update 19 days ago
✅ Back up and running!

My TIGER app is now fully working again, with fixes and full compatibility with Gradio 6 🚀

It lets you:
- 🎙️ Separate multiple speakers from an audio file
- 🎬 Extract each speaker directly from a video
- 🎧 Split audio into dialog, music, and sound effects (DnR)
- 🎥 Apply DnR separation directly on videos

All powered by lightweight TIGER models for fast and efficient speech separation.

Try it here 👉 fffiloni/TIGER-audio-extraction
fffiloni 
posted an update 20 days ago
AniDoc is back 🎉

I’ve fixed the Space and brought it back to life:
- ✅ Working again after being broken for a while
- ✅ Updated to Gradio 6
- ✅ Compatible with ZeroGPU
- ✅ Output videos now preserve original resolution and FPS

I also added advanced controls so you can experiment more (tracking, seed, motion, sketch).

Try it here: fffiloni/AniDoc
fffiloni 
posted an update about 1 month ago
I brought DALL·E mini back to life 🤖🎨

You can try it here:
fffiloni/dalle-mini-reboot

And I also built a batch version using Hugging Face Jobs (up to 50 images per prompt):
fffiloni/dalle-mini-via-jobs

The goal was to stay close to the original JAX/Flax pipeline, while integrating it with modern tooling (Gradio + Jobs).

It ended up being a fun way to revisit this model — still weird, still fun 😄
fffiloni 
posted an update about 1 month ago
A clearer demo for TADA (now multilingual) 🔊🌍

I improved the public demo for TADA — a generative framework for speech modeling via text–acoustic dual alignment.

TADA models speech as a joint sequence of text tokens and acoustic tokens, using a transformer backbone to keep text and audio synchronized during generation.
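As a toy illustration of the joint-sequence idea (the actual tokenization and alignment scheme is described in the paper; the fixed one-text-token-to-N-acoustic-tokens ratio here is purely illustrative):

```python
def to_joint_sequence(text_tokens, acoustic_tokens, ratio=2):
    """Interleave text and acoustic tokens into one joint sequence.

    Each text token is followed by `ratio` acoustic tokens, so the two
    streams stay aligned as the sequence is generated left to right.
    """
    joint = []
    acoustic = iter(acoustic_tokens)
    for t in text_tokens:
        joint.append(("text", t))
        for _ in range(ratio):
            joint.append(("acoustic", next(acoustic)))
    return joint
```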

The original demo already exposed these mechanisms, but the workflow made the pipeline hard to understand.

This updated demo makes the process clearer:

• load the model
• prepare a reference voice (optionally with transcript or Whisper auto-transcription)
• generate speech conditioned on that reference

It also adds multilingual support.

Presets are included for a few languages, but the model supports more:

English, French, Spanish, German, Arabic, Mandarin Chinese, Italian, Japanese, Polish, Portuguese

Feel free to try different voices, accents, or languages and see how the alignment behaves.

👉 fffiloni/tada-dual-alignment-tts-demo

Paper
TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment (2602.23068)
jbilcke-hf 
posted an update 6 months ago
I made a code sniping agent to detect when new AI papers with code (and weights) are released, and then automatically create a Gradio demo on Hugging Face 🧙

Here are some examples generated 100% automatically:
https://huggingface.co/collections/jbilcke-hf/sniped

I call this agent CheatCode (https://github.com/jbilcke-hf/CheatCode) because it skips so many steps that it kinda feels like breaking the rules of the AI tech release game 😅

As with any experimental technology, there is still room for improvement 👩🏻‍🔬:

- Currently the demos are all generated in one go and not built or tested by the agent itself. A more robust version should loop over the deployed app to fix build/runtime issues.
- There is still a bit of human curation to avoid making demos for things that can’t really be demonstrated on ZeroGPU (e.g. tasks taking several minutes)
- Some papers can actually be showcased in a variety of ways, which isn’t really supported (see Demo 2)
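The "papers with code (and weights)" check could look something like the sketch below; the extension lists are my guess at a reasonable heuristic, not CheatCode's actual logic:

```python
# File extensions that usually indicate loadable model weights
WEIGHT_EXTS = (".safetensors", ".bin", ".ckpt", ".pt", ".pth")
# Extensions that suggest there is runnable code to wrap in a demo
CODE_EXTS = (".py", ".ipynb")

def is_snipable(filenames):
    """Return True if a repo file listing contains both code and weights."""
    has_weights = any(name.endswith(WEIGHT_EXTS) for name in filenames)
    has_code = any(name.endswith(CODE_EXTS) for name in filenames)
    return has_weights and has_code
```

A real agent would feed this a file listing fetched from the paper's linked repositories, and only hand matching repos off to demo generation.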


multimodalart 
posted an update 6 months ago
Want to iterate on a Hugging Face Space with an LLM?

Now you can easily convert an entire HF repo (Model, Dataset, or Space) to a text file and feed it to a language model!

multimodalart/repo2txt
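The underlying idea is simple enough to sketch (this is not the Space's actual implementation, just the general shape):

```python
import pathlib

def repo_to_txt(root):
    """Flatten a checked-out repo into one LLM-friendly text blob.

    Each file is preceded by a header naming its path, so the model
    can tell where one file ends and the next begins.
    """
    root = pathlib.Path(root)
    chunks = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            body = path.read_text(errors="ignore")
            chunks.append(f"=== {path.relative_to(root)} ===\n{body}")
    return "\n\n".join(chunks)
```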
jbilcke-hf 
posted an update 8 months ago
Did you know you can use AI-Toolkit by Ostris (https://github.com/ostris/ai-toolkit) to train AI image and video models directly inside a Hugging Face space?

The benefit of doing so is a nice UI, so you don't have to deal with JSON files and CLI shenanigans!

I have created a ready-to-use template you can deploy to your own HF Space to train generative models in a few clicks.

All you have to do is duplicate my Space to your own private Space by going here: jbilcke-hf/ai-toolkit

This Space requires a good GPU and, most importantly, persistent storage, as everything is stored in /data.
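If you adapt the template, a small guard like the sketch below (my addition, not part of the template) makes it obvious when the persistent volume is missing, instead of silently writing checkpoints to ephemeral disk:

```python
import pathlib

def resolve_data_dir(preferred="/data", fallback="./data"):
    """Prefer the Space's persistent /data volume; fall back to local disk.

    Anything written outside /data on a Space is lost on restart, so the
    fallback is only meant for local testing.
    """
    preferred = pathlib.Path(preferred)
    if preferred.is_dir():
        return preferred
    fallback = pathlib.Path(fallback)
    fallback.mkdir(parents=True, exist_ok=True)
    return fallback
```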

Multi-GPU training isn't supported in the UI yet, but it is planned (https://discord.com/channels/1144033039786188891/1166417204015808603/1404851082361835623).

P.S.: Don't forget to set your Space to private, to make sure only you can access, delete, and run training jobs!
jbilcke-hf 
posted an update 9 months ago
Are you looking to run a robot simulator, maybe run long robot policy training tasks, but you don't have the GPU at home?

Well.. you can run MuJoCo inside a Hugging Face space!

All you have to do is clone this Space:
jbilcke-hf/train-robots-with-mujoco

Don't forget to pick an Nvidia GPU for your Space, to be able to get some nice OpenGL renders!

Are you new to MuJoCo and/or JupyterLab notebooks?

You can get started with this tutorial (select "Open from URL", then paste the URL of this notebook):
jbilcke-hf/train-robots-with-mujoco

Happy robot hacking! 🦾
multimodalart 
posted an update 10 months ago
Self-Forcing, a real-time video model distilled from Wan 2.1 by @adobe, is out, and they open-sourced it 🐐

I've built a live real time demo on Spaces 📹💨

multimodalart/self-forcing
jbilcke-hf 
posted an update 11 months ago
Hi everyone,

I've seen some unsuccessful attempts at running Wan2GP inside a Hugging Face Space, which is a shame as it is a great Gradio app!

So here is a fork that you can use, with some instructions on how to do this:

jbilcke-hf/Wan2GP_you_must_clone_this_space_to_use_it#1

Note: some things like persistent models/storage/custom LoRAs might not be fully working out of the box. If you need those, you might have to dig into the Wan2GP codebase and see how to tweak the storage folder. Happy hacking!

fffiloni 
posted an update about 1 year ago
Explain like I'm 5: the latest take from @thomwolf on X about Dario's essay on DeepSeek:

→ Open-source AI is like a big cookbook that everyone can read and improve. Instead of a few chefs keeping their recipes secret, anyone can cook, test, and invent new things.

If only one company controls AI, everything stops if they have a problem—like when the internet goes down. With open-source, many people can help, making sure it keeps running smoothly.

AI isn’t just a race between two countries; it’s a team effort around the world. By sharing, we move faster and create safer technology for everyone.

🤗
jbilcke-hf 
posted an update over 1 year ago
Doing some testing with HunyuanVideo on the Hugging Face Inference Endpoints 🤗

prompt: "a Shiba Inu is acting as a DJ, he wears sunglasses and is mixing and scratching with vinyl discs at a Ibiza sunny sand beach party"

1280x720, 22 steps, 121 frames

There are still some things to iron out regarding speed and memory usage; right now it takes 20 min on an A100 (see attached charts),

but you can check it out here:

https://huggingface.co/jbilcke-hf/HunyuanVideo-for-InferenceEndpoints

There are various things I want to try, like the 100% diffusers version and other models (LTX-Video…)
fffiloni 
posted an update over 1 year ago
Visionary Walter Murch (editor for Francis Ford Coppola), in 1999:

“ So let's suppose a technical apotheosis some time in the middle of the 21st century, when it somehow becomes possible for one person to make an entire feature film, with virtual actors. Would this be a good thing?

If the history of oil painting is any guide, the broadest answer would be yes, with the obvious caution to keep a wary eye on the destabilizing effect of following too intently a hermetically personal vision. One need only look at the unraveling of painting or classical music in the 20th century to see the risks.

Let's go even further, and force the issue to its ultimate conclusion by supposing the diabolical invention of a black box that could directly convert a single person's thoughts into a viewable cinematic reality. You would attach a series of electrodes to various points on your skull and simply think the film into existence.

And since we are time-traveling, let us present this hypothetical invention as a Faustian bargain to the future filmmakers of the 21st century. If this box were offered by some mysterious cloaked figure in exchange for your eternal soul, would you take it?

The kind of filmmakers who would accept, even leap, at the offer are driven by the desire to see their own vision on screen in as pure a form as possible. They accept present levels of collaboration as the evil necessary to achieve this vision. Alfred Hitchcock, I imagine, would be one of them, judging from his description of the creative process: "The film is already made in my head before we start shooting."”

Read "A Digital Cinema of the Mind? Could Be" by Walter Murch: https://archive.nytimes.com/www.nytimes.com/library/film/050299future-film.html

jbilcke-hf 
posted an update almost 2 years ago
I am working to add automatic speech bubbles to the AI Comic Factory!

This is not finished yet - there are a few things to tweak - but so far results are pretty promising!

(it's available through a "beta" toggle)