Ollama - error

#2
by AizenYPB - opened

Hi,

I used your recommended command to run this on Ollama:

ollama run hf.co/deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-mlx-4bit

It starts pulling the manifest, then fails with:

Error: pull model manifest: 400: {"error":"Repository is not GGUF or is not compatible with llama.cpp"}

Hey, thanks for trying the model!

This is an MLX-format model optimized for Apple Silicon, so it's not directly compatible with Ollama (which requires GGUF/llama.cpp format). I'm removing the ollama tag to avoid confusion.
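The error boils down to a file-format check: llama.cpp-based runtimes like Ollama look for GGUF weight files, while an MLX export typically ships safetensors shards plus its own config. A minimal sketch of that check (the helper name and file listings here are hypothetical, for illustration only):

```python
# Hypothetical check mirroring Ollama's requirement: the repository must
# contain GGUF weight files for a llama.cpp-based runtime to load it.
def has_gguf_weights(repo_files):
    """Return True if any file in the repo listing is a GGUF weight file."""
    return any(name.lower().endswith(".gguf") for name in repo_files)

# An MLX export typically ships safetensors weights plus config -- no
# .gguf files, which is why the manifest pull returns the 400 error above.
mlx_repo = ["config.json", "model.safetensors", "tokenizer.json"]
gguf_repo = ["gemma-4-E4B.Q4_K_M.gguf", "README.md"]

print(has_gguf_weights(mlx_repo))   # False
print(has_gguf_weights(gguf_repo))  # True
```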

To run this model now:

pip install mlx-lm
mlx_lm.generate --model deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-mlx-4bit --prompt "Hello"

I'm working on a GGUF version that will support Ollama and LM Studio natively; I'll update here when it's available.

Update: The GGUF version is now live. This will work with Ollama:

ollama run hf.co/deadbydawn101/gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-GGUF

👉 gemma-4-E4B-Agentic-Opus-Reasoning-GeminiCLI-GGUF

Available: Q4_K_M (2.7 GB), Q5_K_M (3.1 GB), Q8_0 (4.5 GB), F16 (8.3 GB).
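If you're unsure which quantization to pull, a simple rule of thumb is to take the largest file that fits your RAM with some headroom left for the KV cache and runtime. A small sketch using the sizes listed above (the helper name and the 2 GB headroom figure are my own assumptions, not part of the release):

```python
# File sizes from the release notes above.
QUANT_SIZES_GB = {"Q4_K_M": 2.7, "Q5_K_M": 3.1, "Q8_0": 4.5, "F16": 8.3}

def pick_quant(ram_gb, headroom_gb=2.0):
    """Hypothetical helper: return the largest quantization whose file size
    fits within ram_gb minus a headroom allowance, or None if none fit."""
    budget = ram_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    return max(fitting)[1] if fitting else None

print(pick_quant(8))    # Q8_0
print(pick_quant(16))   # F16
print(pick_quant(4))    # None
```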


Closing this out: the GGUF repo is live, tested, and working with Ollama. Thanks for reporting!

deadbydawn101 changed discussion status to closed
