# SiliconMind-V1-Qwen3-8B GGUF

GGUF quantizations of AS-SiliconMind/SiliconMind-V1-Qwen3-8B, an 8B model specialized for Verilog code generation, testing, and debugging.

Quantized with llama.cpp b7437, which is compatible with Ollama v0.17.4.

## Available Quantizations

| File | Size | Description |
|------|------|-------------|
| SiliconMind-V1-Qwen3-8B-F16.gguf | 25 GB | Full precision (F16) |
| SiliconMind-V1-Qwen3-8B-Q8_0.gguf | 13 GB | 8-bit, highest quality |
| SiliconMind-V1-Qwen3-8B-Q6_K.gguf | 10 GB | 6-bit |
| SiliconMind-V1-Qwen3-8B-Q5_K_M.gguf | 8.8 GB | 5-bit medium |
| SiliconMind-V1-Qwen3-8B-Q4_K_M.gguf | 7.6 GB | 4-bit medium (recommended) |
| SiliconMind-V1-Qwen3-8B-Q3_K_L.gguf | 6.7 GB | 3-bit large |
| SiliconMind-V1-Qwen3-8B-Q3_K_M.gguf | 6.3 GB | 3-bit medium |
| SiliconMind-V1-Qwen3-8B-Q3_K_S.gguf | 5.7 GB | 3-bit small |
| SiliconMind-V1-Qwen3-8B-Q2_K.gguf | 5.0 GB | 2-bit, smallest |

## Usage

```
ollama run hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF
```
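Ollama's `hf.co/...` syntax also accepts a quantization tag after a colon, so you can select one of the files from the table above instead of the repository default. For example:

```
# Run a specific quantization by appending its tag to the repo path
ollama run hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF:Q8_0

# Smaller download for memory-constrained machines
ollama run hf.co/thuniverse-ai/SiliconMind-V1-Qwen3-8B-GGUF:Q3_K_M
```

Larger quantizations (Q6_K, Q8_0) trade disk and RAM for output quality; Q4_K_M is the usual balance point.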

Example prompt:

```
I would like you to implement a module named TopModule with the following
interface. All input and output ports are one bit unless otherwise
specified.

- input in (3 bits)
- output out (2 bits)

The module should implement a "population count" circuit that counts the
number of '1's in the input vector.
```
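For reference, a correct answer to this prompt looks like the sketch below (a minimal hand-written example, not actual model output): summing the three 1-bit inputs yields a value between 0 and 3, which fits exactly in the 2-bit output.

```verilog
// Population count: out = number of '1' bits in the 3-bit input.
module TopModule (
  input  wire [2:0] in,
  output wire [1:0] out
);
  // The sum of three 1-bit values is at most 3, so 2 bits suffice.
  assign out = in[0] + in[1] + in[2];
endmodule
```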
Base model: Qwen/Qwen3-8B