Ollama - error

#2
by AizenYPB - opened

Hi,

I used your recommended command to run this on Ollama:

ollama run hf.co/deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-mlx-4bit

It starts pulling the manifest, then fails with:

Error: pull model manifest: 400: {"error":"Repository is not GGUF or is not compatible with llama.cpp"}

Hey, thanks for trying the model!

This is an MLX-format model optimized for Apple Silicon, so it's not directly compatible with Ollama (which requires GGUF/llama.cpp format). I'm removing the ollama tag to avoid confusion.
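The error boils down to a file-format check: llama.cpp-based runtimes like Ollama look for GGUF weight files, while an MLX export typically ships safetensors shards plus its own config. A minimal sketch of that check (the helper name and file listings here are hypothetical, for illustration only):

```python
# Hypothetical check mirroring Ollama's requirement: the repository must
# contain GGUF weight files for a llama.cpp-based runtime to load it.
def has_gguf_weights(repo_files):
    """Return True if any file in the repo listing is a GGUF weight file."""
    return any(name.lower().endswith(".gguf") for name in repo_files)

# An MLX export typically ships safetensors weights plus config -- no
# .gguf files, which is why the manifest pull returns the 400 error above.
mlx_repo = ["config.json", "model.safetensors", "tokenizer.json"]
gguf_repo = ["gemma-4-E4B.Q4_K_M.gguf", "README.md"]

print(has_gguf_weights(mlx_repo))   # False
print(has_gguf_weights(gguf_repo))  # True
```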

To run this model now:

pip install mlx-lm
mlx_lm.generate --model deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-mlx-4bit --prompt "Hello"

I'm working on a GGUF version that will support Ollama and LM Studio natively; I'll update here when it's available.

Update: The GGUF version is now live. This will work with Ollama:

ollama run hf.co/deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-GGUF

👉 gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-GGUF

Available: Q4_K_M (2.7 GB), Q5_K_M (3.1 GB), Q8_0 (4.5 GB), F16 (8.3 GB).
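If you're unsure which quantization to pull, a simple rule of thumb is to take the largest file that fits your RAM with some headroom left for the KV cache and runtime. A small sketch using the sizes listed above (the helper name and the 2 GB headroom figure are my own assumptions, not part of the release):

```python
# File sizes from the release notes above.
QUANT_SIZES_GB = {"Q4_K_M": 2.7, "Q5_K_M": 3.1, "Q8_0": 4.5, "F16": 8.3}

def pick_quant(ram_gb, headroom_gb=2.0):
    """Hypothetical helper: return the largest quantization whose file size
    fits within ram_gb minus a headroom allowance, or None if none fit."""
    budget = ram_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return max(fitting)[1] if fitting else None

print(pick_quant(8))    # Q8_0
print(pick_quant(16))   # F16
print(pick_quant(4))    # None
```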


Closing this out: the GGUF repo is live, tested, and working with Ollama. Thanks for reporting!

deadbydawn101 changed discussion status to closed
