[Critical] Tensor Count Mismatch Error in EXAONE-4.5-33B GGUF Model
Hello EXAONE Team,
I am reporting a model loading failure with the recently released EXAONE-4.5-33B-GGUF (specifically the Q4_K_M version) when using LM Studio (running on Apple M2 Max).
Problem Description
The model fails to initialize during loading due to a tensor count mismatch: the file metadata declares 723 tensors, but the loader maps only 719.
Error Log
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 723, got 719
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model 'EXAONE-4.5-33B-Q4_K_M.gguf'
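For reference, the tensor count the loader checks against is stored in the fixed GGUF file header, so it can be inspected without llama.cpp. A minimal sketch of parsing that header; the demo bytes below are synthetic (constructed to mirror the 723 count in the log), not read from the actual model file:

```python
import struct

def read_gguf_counts(data: bytes):
    """Parse the fixed GGUF header: magic ('GGUF'), version (uint32),
    tensor_count (uint64), metadata_kv_count (uint64), little-endian."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Demo: a synthetic header claiming 723 tensors, as in the error log.
# To check a real file, read its first 24 bytes instead.
header = struct.pack("<4sIQQ", b"GGUF", 3, 723, 10)
version, n_tensors, n_kv = read_gguf_counts(header)
print(version, n_tensors, n_kv)  # 3 723 10
```

If the header's tensor_count matches what the error log says was expected, the file itself is intact and the mismatch comes from the loader skipping tensors it does not recognize.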
Environment
Model: EXAONE-4.5-33B-Q4_K_M.gguf
Application: LM Studio on Apple M2 Max
Inference Engine: LlamaV4 (llama.cpp based)
Request
This issue prevents users from running the model in GGUF-supported environments such as LM Studio or llama.cpp. It may stem from the quantization process or from a compatibility gap with the current exaone4 implementation in llama.cpp.
Could you please check the integrity of the GGUF files and provide a fix as soon as possible? Many users are looking forward to testing the performance of EXAONE 4.5.
Thank you for your hard work and for sharing this model with the community.
Hello @SGWon, thank you for your attention.
The current version of llama.cpp (and tools built on top of it, such as LM Studio) does not yet support the EXAONE 4.5 architecture.
While EXAONE 4.5 largely shares its architecture with EXAONE 4.0, llama.cpp's exaone4 implementation does not support MTP (Multi-Token Prediction) / NextN tensor handling.
MTP was first introduced in EXAONE MoE (K-EXAONE) and is also used in EXAONE 4.5.
The issue arises because the EXAONE 4.5 weights, which include MTP parameters, are being loaded through the existing EXAONE 4.0 implementation; since that code does not map the MTP layers, the loader ends up with fewer tensors than the file metadata declares, producing the mismatch in the log above.
This can be resolved by updating the EXAONE 4.0 code, and the necessary changes are already included in our PR, which adds EXAONE 4.5 support to llama.cpp.
Until official support is merged and adopted by downstream tools (e.g., LM Studio, Ollama), we recommend building from our fork to properly load and use EXAONE 4.5.
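The build itself follows the standard llama.cpp CMake flow; a sketch, where the fork URL is a placeholder (substitute the repository linked from the PR or README):

```shell
# Placeholder URL: replace with the actual EXAONE fork of llama.cpp.
git clone https://github.com/EXAMPLE/llama.cpp-exaone.git
cd llama.cpp-exaone

# Standard llama.cpp CMake build (Metal is enabled by default on Apple Silicon).
cmake -B build
cmake --build build --config Release

# Then load the GGUF with the freshly built binary, e.g.:
#   ./build/bin/llama-cli -m /path/to/EXAONE-4.5-33B-Q4_K_M.gguf
```

Binaries built this way live under `build/bin/`; downstream tools such as LM Studio will pick up the fix only after the PR is merged and they update their bundled llama.cpp.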
Please refer to our README.md for more details.