Problems with gemma-4-31B-it-GGUF UD-Q4_K_XL on MacBook M3 Pro 36GB in Unsloth Studio
Log output from Studio (timestamp `2026-04-09T09:25:07.765274Z`, level `error`):

```
Error during GGUF tool streaming: llama-server returned 500: {"error":{"code":500,"message":"Compute error.","type":"server_error"}}
Traceback (most recent call last):
  File "/Users/xxxx/.unsloth/studio/unsloth_studio/lib/python3.13/site-packages/studio/backend/routes/inference.py", line 1232, in gguf_tool_stream
    event = await asyncio.to_thread(next, gen, _tool_sentinel)
  File "/Users/xxxx/.local/share/uv/python/cpython-3.13.12-macos-aarch64-none/lib/python3.13/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/Users/xxxx/.local/share/uv/python/cpython-3.13.12-macos-aarch64-none/lib/python3.13/asyncio/futures.py", line 286, in __await__
    yield self  # This tells Task to wait for completion.
  File "/Users/xxxx/.local/share/uv/python/cpython-3.13.12-macos-aarch64-none/lib/python3.13/asyncio/tasks.py", line 375, in __wakeup
    future.result()
  File "/Users/xxxx/.local/share/uv/python/cpython-3.13.12-macos-aarch64-none/lib/python3.13/asyncio/futures.py", line 199, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/Users/xxxx/.local/share/uv/python/cpython-3.13.12-macos-aarch64-none/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/xxxx/.unsloth/studio/unsloth_studio/lib/python3.13/site-packages/studio/backend/core/inference/llama_cpp.py", line 2442, in generate_chat_completion_with_tools
    raise RuntimeError(
    ...<2 lines>...
    )
RuntimeError: llama-server returned 500: {"error":{"code":500,"message":"Compute error.","type":"server_error"}}
```
What I've tried so far:
- restarting Studio
- switching the configuration to one of the presets in Studio
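To help isolate whether the failure comes from Studio's tool-streaming layer or from llama-server itself, here is a minimal sketch that sends a plain chat-completion request straight to llama-server's OpenAI-compatible endpoint. The port (8080, llama-server's default) and the model name are assumptions; substitute whatever Studio actually launched the server with. If this bare request also returns the 500 "Compute error", the problem is in llama-server/Metal, not in Studio.

```python
import json
import urllib.request

# Minimal chat-completion payload, bypassing Studio's tool-streaming
# layer entirely. The model name here is an assumption; llama-server
# generally ignores it and uses the loaded GGUF anyway.
payload = {
    "model": "gemma-4-31B-it-GGUF",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 16,
}

def query_llama_server(base_url="http://127.0.0.1:8080"):
    """POST the payload to llama-server's /v1/chat/completions endpoint
    and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Run this while Studio's llama-server instance is up, e.g.:
# print(json.dumps(query_llama_server(), indent=2))
```

If the bare request succeeds but tool calls still fail, that would narrow it down to the tool-streaming path in `generate_chat_completion_with_tools`.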
