Smaller GGUF version?
- Could you add smaller GGUF versions of these models, especially the Text Encoder?
- Is it correct that the main model is only 1.9 GB while the Text Encoder is 18.8 GB?!
It seems strange that the model is smaller than its text encoder!
got prompt
BitDanceLoader: loading components (main=BitDance_14B_MainModel_FP8.safetensors, text=BitDance_TextEncoder_FP8.safetensors, vae=BitDance_VAE_FP16.safetensors). Text encoder single-file path uses streaming load to reduce CPU RAM spikes.
BitDanceQwenFP8: building Qwen text encoder with meta init (accelerate).
BitDanceQwenFP8: replaced 280 linear layers with FP8-capable modules.
BitDanceQwenFP8: streamed weights loaded (fp8_linear=180, dense_linear=100, other=162, unexpected=1).
BitDanceLoader: placing components -> model:cpu text:cuda:0 vae:cpu
BitDanceSampler: moving text encoder runtime to cuda:0 (this can take time for 14B).
BitDanceSampler: start sampling 512x512, blocks=16, diffusion_steps=20, cfg=7.50, images=1
BitDanceSampler: sampler=euler_maruyama
BitDanceSampler: 0%| | 0/20 [00:00<?, ?it/s]!!! Exception during processing !!! The expanded size of the tensor (69) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 69]. Tensor sizes: [1, 1, 64, 86]
Traceback (most recent call last):
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 530, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 334, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 308, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 296, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Comfyui-bitdance\nodes.py", line 2075, in sample
outputs_u = base_llm(
^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 1001, in wrapper
outputs = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 435, in forward
hidden_states = decoder_layer(
^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\modeling_layers.py", line 93, in __call__
return super().__call__(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 323, in forward
hidden_states, _ = self.self_attn(
^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 280, in forward
attn_output, attn_weights = attention_interface(
^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\integrations\sdpa_attention.py", line 92, in sdpa_attention_forward
attn_output = torch.nn.functional.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The expanded size of the tensor (69) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 69]. Tensor sizes: [1, 1, 64, 86]
Prompt executed in 21.50 seconds
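For what it's worth, the RuntimeError above is PyTorch's tensor-expansion rule failing: when a tensor is expanded (broadcast) to a target shape, each of its dimensions must either already equal the target size or be 1. Here an attention mask shaped `[1, 1, 64, 86]` (built for an 86-token sequence) is being applied to scores for a 69-token sequence, and 86 can neither equal nor broadcast to 69 — which suggests a stale cached mask or mismatched sequence length, not a weights problem. A minimal sketch of the rule (plain Python, hypothetical helper name, not the actual PyTorch code):

```python
def can_expand(existing, target):
    """Mimic torch.Tensor.expand() compatibility: shapes align from the
    trailing dimension; each existing dim must be 1 or equal the target."""
    for e, t in zip(existing[::-1], target[::-1]):
        if e != 1 and e != t:
            return False
    return True

# Attention mask cached for an 86-token prompt...
mask_shape = (1, 1, 64, 86)
# ...applied to attention scores for a 69-token prompt: last dims clash.
target_shape = (1, 40, 64, 69)

print(can_expand(mask_shape, target_shape))   # False: 86 != 69 and 86 != 1
print(can_expand(mask_shape, (1, 40, 64, 86)))  # True: 1s broadcast, 86 matches
```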
RuntimeError: The expanded size of the tensor (76) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 76]. Tensor sizes: [1, 1, 64, 86]
Me too! :)