Smaller GGUF version?
- Could you add smaller GGUF versions of these models, especially the Text Encoder?
- Is it correct that the main model is only 1.9 GB while the Text Encoder is 18.8 GB?!
It seems strange that the model is smaller than its text encoder!
got prompt
BitDanceLoader: loading components (main=BitDance_14B_MainModel_FP8.safetensors, text=BitDance_TextEncoder_FP8.safetensors, vae=BitDance_VAE_FP16.safetensors). Text encoder single-file path uses streaming load to reduce CPU RAM spikes.
BitDanceQwenFP8: building Qwen text encoder with meta init (accelerate).
BitDanceQwenFP8: replaced 280 linear layers with FP8-capable modules.
BitDanceQwenFP8: streamed weights loaded (fp8_linear=180, dense_linear=100, other=162, unexpected=1).
BitDanceLoader: placing components -> model:cpu text:cuda:0 vae:cpu
BitDanceSampler: moving text encoder runtime to cuda:0 (this can take time for 14B).
BitDanceSampler: start sampling 512x512, blocks=16, diffusion_steps=20, cfg=7.50, images=1
BitDanceSampler: sampler=euler_maruyama
BitDanceSampler: 0%| | 0/20 [00:00<?, ?it/s]!!! Exception during processing !!! The expanded size of the tensor (69) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 69]. Tensor sizes: [1, 1, 64, 86]
Traceback (most recent call last):
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 530, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 334, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 308, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "F:\ComfyUI_windows_portable\ComfyUI\execution.py", line 296, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\ComfyUI\custom_nodes\Comfyui-bitdance\nodes.py", line 2075, in sample
outputs_u = base_llm(
^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\utils\generic.py", line 1001, in wrapper
outputs = func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 435, in forward
hidden_states = decoder_layer(
^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\modeling_layers.py", line 93, in __call__
return super().__call__(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 323, in forward
hidden_states, _ = self.self_attn(
^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\models\qwen3\modeling_qwen3.py", line 280, in forward
attn_output, attn_weights = attention_interface(
^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\transformers\integrations\sdpa_attention.py", line 92, in sdpa_attention_forward
attn_output = torch.nn.functional.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The expanded size of the tensor (69) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 69]. Tensor sizes: [1, 1, 64, 86]
Prompt executed in 21.50 seconds
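For what it's worth, the RuntimeError above is PyTorch's tensor-expansion rule failing: when a tensor is expanded (broadcast) to a target shape, each of its dimensions must either already equal the target size or be 1. Here an attention mask shaped `[1, 1, 64, 86]` (built for an 86-token sequence) is being applied to scores for a 69-token sequence, and 86 can neither equal nor broadcast to 69 — which suggests a stale cached mask or mismatched sequence length, not a weights problem. A minimal sketch of the rule (plain Python, hypothetical helper name, not the actual PyTorch code):

```python
def can_expand(existing, target):
    """Mimic torch.Tensor.expand() compatibility: shapes align from the
    trailing dimension; each existing dim must be 1 or equal the target."""
    for e, t in zip(existing[::-1], target[::-1]):
        if e != 1 and e != t:
            return False
    return True

# Attention mask cached for an 86-token prompt...
mask_shape = (1, 1, 64, 86)
# ...applied to attention scores for a 69-token prompt: last dims clash.
target_shape = (1, 40, 64, 69)

print(can_expand(mask_shape, target_shape))   # False: 86 != 69 and 86 != 1
print(can_expand(mask_shape, (1, 40, 64, 86)))  # True: 1s broadcast, 86 matches
```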
RuntimeError: The expanded size of the tensor (76) must match the existing size (86) at non-singleton dimension 3. Target sizes: [1, 40, 64, 76]. Tensor sizes: [1, 1, 64, 86]
Me too! :)