Running Qwen3 Locally?
Has anyone had any success getting Qwen3 to run locally? I've tried several engines to no avail (Ollama, vLLM, and currently LM Studio). I can get the model to respond for a few prompts, but as the conversation grows, the responses become garbled.
I've finally managed to get everything installed (including the flash_attn 2 build on a Windows machine with Python 3.12 and CUDA 12.8).
When I run `python app.py`, I get a kernel error, even though CUDA itself works locally (see the check at the bottom of this post):

```
(venv) c:\qwen3-tts>python app.py
Loading all models to CUDA...
Loading VoiceDesign 1.7B model...
Fetching 13 files: 100%|███████████████████████████████████████████████████| 13/13 [00:00<?, ?it/s]
Fetching 0 files: 0it [00:00, ?it/s]
Traceback (most recent call last):
  File "c:\qwen3-tts\venv\Lib\site-packages\kernels\utils.py", line 198, in install_kernel
    return _find_kernel_in_repo_path(repo_path, package_name, variant_locks)
  File "c:\qwen3-tts\venv\Lib\site-packages\kernels\utils.py", line 220, in _find_kernel_in_repo_path
    raise FileNotFoundError(
FileNotFoundError: Kernel at path C:\Users\ericv\.cache\huggingface\hub\models--kernels-community--flash-attn3\snapshots\5d9293232e0bea36728880ffeb631901a736387b does not have one of build variants: torch27-cu128-x86_64-windows, torch-cuda, torch-universal

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\integrations\hub_kernels.py", line 207, in load_and_register_kernel
    kernel = get_kernel(repo_id, revision=rev)
  File "c:\qwen3-tts\venv\Lib\site-packages\kernels\utils.py", line 312, in get_kernel
    package_name, variant_path = install_kernel(
  File "c:\qwen3-tts\venv\Lib\site-packages\kernels\utils.py", line 200, in install_kernel
    raise FileNotFoundError(
FileNotFoundError: Cannot install kernel from repo kernels-community/flash-attn3 (revision: main)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\qwen3-tts\app.py", line 39, in <module>
    voice_design_model = Qwen3TTSModel.from_pretrained(
  File "c:\qwen3-tts\qwen_tts\inference\qwen3_tts_model.py", line 112, in from_pretrained
    model = AutoModel.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 604, in from_pretrained
    return model_class.from_pretrained(
  File "c:\qwen3-tts\qwen_tts\core\models\modeling_qwen3_tts.py", line 1876, in from_pretrained
    model = super().from_pretrained(
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\modeling_utils.py", line 277, in _wrapper
    return func(*args, **kwargs)
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\modeling_utils.py", line 4971, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "c:\qwen3-tts\qwen_tts\core\models\modeling_qwen3_tts.py", line 1817, in __init__
    super().__init__(config)
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\modeling_utils.py", line 2076, in __init__
    self.config._attn_implementation_internal = self._check_and_adjust_attn_implementation(
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\modeling_utils.py", line 2684, in _check_and_adjust_attn_implementation
    raise e
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\modeling_utils.py", line 2668, in _check_and_adjust_attn_implementation
    load_and_register_kernel(applicable_attn_implementation)
  File "c:\qwen3-tts\venv\Lib\site-packages\transformers\integrations\hub_kernels.py", line 209, in load_and_register_kernel
    raise ValueError(f"An error occurred while trying to load from '{repo_id}': {e}.")
ValueError: An error occurred while trying to load from 'kernels-community/flash-attn3': Cannot install kernel from repo kernels-community/flash-attn3 (revision: main).
```
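A side note on reading the log: the stacked tracebacks are Python's implicit exception chaining. transformers catches the kernels package's `FileNotFoundError` and re-raises it as a `ValueError`, so the original cause is preserved above it. A minimal sketch of that pattern (function names mirror the log, but this is illustrative, not the actual library code):

```python
def install_kernel(repo_id):
    # Stand-in for kernels.utils.install_kernel failing because no build
    # variant matches the platform (e.g. torch27-cu128-x86_64-windows).
    raise FileNotFoundError(
        f"Cannot install kernel from repo {repo_id} (revision: main)"
    )

def load_and_register_kernel(repo_id):
    # Stand-in for transformers' hub_kernels integration: the low-level
    # FileNotFoundError is wrapped in a ValueError, which is why the log
    # shows "During handling of the above exception, another exception
    # occurred" between the tracebacks.
    try:
        install_kernel(repo_id)
    except FileNotFoundError as e:
        raise ValueError(
            f"An error occurred while trying to load from '{repo_id}': {e}."
        )

try:
    load_and_register_kernel("kernels-community/flash-attn3")
except ValueError as e:
    print(e)
    print(type(e.__context__).__name__)  # the original, chained error
```

So the root cause to chase is the innermost message: no flash-attn3 build variant exists for this Windows/CUDA combination.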
```
(venv) c:\qwen3-tts>import torch
'import' is not recognized as an internal or external command,
operable program or batch file.

(venv) c:\qwen3-tts>python
Python 3.12.9 (tags/v3.12.9:fdb8142, Feb  4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.2097, 0.9006, 0.8568],
        [0.5569, 0.7345, 0.6012],
        [0.8467, 0.6681, 0.4402],
        [0.1582, 0.2006, 0.4323],
        [0.1103, 0.7644, 0.8739]])
>>> exit()

(venv) c:\qwen3-tts>
```
Does anybody have an idea what went wrong?
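One idea I haven't been able to test yet: asking transformers for a built-in attention backend so it never tries to fetch the flash-attn3 kernel from the Hub. A rough sketch of what I mean, assuming `Qwen3TTSModel.from_pretrained` forwards extra kwargs to `AutoModel.from_pretrained` (that forwarding is an assumption on my part):

```python
# Untested idea: request a built-in attention backend so transformers
# skips the 'kernels-community/flash-attn3' Hub lookup. "sdpa" is
# PyTorch's native scaled-dot-product attention; "eager" should also
# avoid the kernel download.
load_kwargs = {"attn_implementation": "sdpa"}

# Hypothetical usage (model_path stands in for the actual checkpoint):
# voice_design_model = Qwen3TTSModel.from_pretrained(model_path, **load_kwargs)
print(load_kwargs["attn_implementation"])  # sdpa
```

If anyone has confirmed whether the Qwen3-TTS loader accepts `attn_implementation`, I'd appreciate hearing about it.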