Unofficial Zero-Cost Guide for Running N-ATLaS: Local vs Cloud Options
For most people who want to run N-ATLaS cost-free, either locally on a laptop or in the cloud, I recommend the following options:
🖥️ Local (using llama.cpp)
First, install llama.cpp and download the appropriate GGUF from tosinamuda/N-ATLaS-GGUF:
| RAM | Recommended GGUF | File | Size |
|---|---|---|---|
| 6-8GB | Q4_K_M | N-ATLaS-GGUF-Q4_K_M.gguf | 4.92 GB |
| 8-10GB | Q5_K_M | N-ATLaS-GGUF-Q5_K_M.gguf | 5.73 GB |
| 10-12GB | Q6_K | N-ATLaS-GGUF-Q6_K.gguf | 6.6 GB |
| 12-16GB | Q8_0 | N-ATLaS-GGUF-Q8_0.gguf | 8.54 GB |
| 24GB+ | F16 | N-ATLaS-GGUF-F16.gguf | 16.1 GB |
Higher bit width = better quality. Pick the largest file your system can handle comfortably.
For example, if you pick the 8-bit version (Q8_0), run:
```shell
llama-server -m N-ATLaS-GGUF-Q8_0.gguf --port 8080
# Basic web UI, accessible in a browser: http://localhost:8080
# Chat completion endpoint: http://localhost:8080/v1/chat/completions
```
This creates a local OpenAI-compatible API at http://localhost:8080 that you can use in your code.
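A minimal sketch of calling that local API from Python, assuming the default port 8080 from the command above (llama.cpp generally accepts any value in the `model` field, so the name here is just a placeholder):

```python
import json

BASE_URL = "http://localhost:8080/v1"  # assumption: server started with --port 8080

def build_chat_request(user_message, model="N-ATLaS"):
    """Return the endpoint URL and JSON payload for a chat completion."""
    url = f"{BASE_URL}/chat/completions"
    payload = {
        "model": model,  # llama.cpp serves one model; this field is mostly informational
        "messages": [{"role": "user", "content": user_message}],
    }
    return url, payload

url, payload = build_chat_request("Hello!")
print(url)  # the OpenAI-compatible chat endpoint

# To actually send the request (requires the server to be running):
# import urllib.request
# req = urllib.request.Request(url, data=json.dumps(payload).encode(),
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode())
```

Because the API is OpenAI-compatible, any OpenAI client library should also work by pointing its base URL at `http://localhost:8080/v1`.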
☁️ Cloud (using Modal) (check here for a step-by-step guide)
If you don't have the hardware, I recommend Modal over other providers (HuggingFace, Runpod, Lambda, etc.) because:
- Free $30/month credit
- Pay per second (only when running)
- No credit card required to start
I'm using Modal with a cheaper GPU option and the FP8 quantized version (tosinamuda/N-ATLaS-FP8). Set a short idle timeout so the GPU shuts down when you're not using it.
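A minimal configuration sketch of what that could look like. Everything here is an assumption: the app name, the L4 GPU choice, and the idle-timeout parameter (Modal has renamed this option across versions, so check the current docs for the exact name):

```python
import modal

app = modal.App("n-atlas-fp8")  # hypothetical app name

@app.function(
    gpu="L4",                    # assumption: one of Modal's cheaper GPU tiers
    container_idle_timeout=120,  # shut the container down after 2 idle minutes;
                                 # newer Modal versions may call this scaledown_window
)
def serve():
    # ...start your inference server for tosinamuda/N-ATLaS-FP8 here,
    # following the linked deployment guide...
    pass
```

The short idle timeout is what keeps the pay-per-second billing cheap: you pay only while a request is keeping the container warm.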
📖 Full deployment guide: here
Good day,
I trust this message finds you well.
I tried deploying N-ATLaS-FP8 as explained here. It actually worked when I tested it in the VS Code terminal, but when I connected it to an existing web app through a chatbot plugin, it doesn't work properly: it either responds with "Network Error" or doesn't respond at all.
I connect the plugin to the Modal server using the server URL generated when I deployed it and the API token generated from Modal.
At this point, I don't know what to do, which is why I am reaching out.
Is there any way you can help?
Thanks in anticipation.
Hello Dr Idris,
Sorry for responding late. In case you still need help with this, can you share how you are integrating with Modal?
Also, Modal usually has a cold start, so the first request can take a while.
Do you have /v1 in your API URL?
No, I didn't include it in the backend plugin, but it's in my API code connected to Modal. I will add it now and test.
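For anyone hitting the same issue, the /v1 fix above can be sketched in Python; the base URL and token below are placeholders for the values Modal generates at deploy time:

```python
MODAL_BASE_URL = "https://your-workspace--your-app.modal.run"  # placeholder
MODAL_API_TOKEN = "YOUR_TOKEN"                                 # placeholder

def chat_completions_url(base_url):
    # The OpenAI-compatible server lives under /v1; pointing a chat plugin
    # at the bare base URL is a common cause of "Network Error" responses.
    return base_url.rstrip("/") + "/v1/chat/completions"

def build_headers(token):
    # The token goes in a standard Bearer Authorization header.
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }

print(chat_completions_url(MODAL_BASE_URL))
```

In most chatbot plugins this means setting the server URL to `<your Modal URL>/v1` (or the full `/v1/chat/completions` path, depending on what the plugin expects).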

