Official documentation doesn't work for CPU backend
#19
by Fl1ntSt0n3 - opened
Following the official documentation page to start a new local instance on the CPU backend doesn't work:
https://speech.fish.audio/install/#docker-setup
First, the page is missing a paragraph explaining that you need to download the checkpoints. Yes, it is written in the Dockerfile, but the whole point of Compose is to not have to worry about the image itself.
So don't forget to download the checkpoints first:
hf download fishaudio/s2-pro --local-dir ./checkpoints/s2-pro
Then, even if BACKEND=CPU is set, the official images fail to set up torchaudio properly:
2026-03-25 11:18:56.682 | INFO | __main__:<module>:74 - Decoder model loaded, warming up...
Traceback (most recent call last):
  File "/app/tools/run_webui.py", line 77, in <module>
    inference_engine = TTSInferenceEngine(
                       ^^^^^^^^^^^^^^^^^^^
  File "/app/fish_speech/inference_engine/__init__.py", line 32, in __init__
    super().__init__()
  File "/app/fish_speech/inference_engine/reference_loader.py", line 39, in __init__
    backends = torchaudio.list_audio_backends()
               ^^^^^^^^^^
UnboundLocalError: cannot access local variable 'torchaudio' where it is not associated with a value
It needs this fix in fish_speech/inference_engine/reference_loader.py, line 48:
Replace:
import torchaudio.io._load_audio_fileobj # noqa: F401
With:
from importlib import import_module
import_module("torchaudio.io._load_audio_fileobj")
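For anyone wondering why this helps: my understanding is that a statement like `import torchaudio.io._load_audio_fileobj` inside a function binds the name `torchaudio` as a local variable for the whole function, so the earlier `torchaudio.list_audio_backends()` call on line 39 sees an unbound local instead of the module-level import. I haven't checked the full body of `__init__`, so treat that as an assumption. Here is a minimal sketch of the same effect using the standard-library `os` module as a stand-in for torchaudio (the function names `broken` and `fixed` are made up for illustration):

```python
import os
from importlib import import_module


def broken():
    # "import os.path" further down makes "os" a local name for the
    # entire function, so this line fails before the import ever runs.
    sep = os.sep
    import os.path  # noqa: F401
    return sep


def fixed():
    # import_module() loads the submodule without binding any local
    # name, so "os" still refers to the module-level import.
    sep = os.sep
    import_module("os.path")
    return sep


try:
    broken()
except UnboundLocalError as exc:
    print("broken() failed:", type(exc).__name__)

print("fixed() returned:", repr(fixed()))
```

The first function fails with the same kind of UnboundLocalError as in the traceback, while the second one runs, which is what the `import_module` replacement achieves.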
I hope this will help you folks!
Fix indentation.
Fl1ntSt0n3 changed discussion status to closed