EXL3

Custom EXL3 quantization: 5 bpw for the attention layers, 4 bpw for the MLP layers, and 8 bpw for the lm_head.
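To see how a mixed-precision layout like this translates into an overall bits-per-weight figure and an on-disk weight size, here is a minimal sketch. The parameter counts per group are hypothetical placeholders, not measurements of this model; substitute the real counts from the model's config to get a meaningful number.

```python
def average_bpw(groups):
    """Weighted-average bits per weight over (param_count, bpw) pairs."""
    total_params = sum(count for count, _ in groups)
    total_bits = sum(count * bpw for count, bpw in groups)
    return total_bits / total_params


def weight_size_gib(groups):
    """Approximate size of the quantized weights in GiB (ignores metadata overhead)."""
    total_bits = sum(count * bpw for count, bpw in groups)
    return total_bits / 8 / 1024**3


# HYPOTHETICAL parameter counts, for illustration only.
example_groups = [
    (1_000_000_000, 5),  # attention layers at 5 bpw
    (3_000_000_000, 4),  # MLP layers at 4 bpw
    (500_000_000, 8),    # lm_head at 8 bpw
]

if __name__ == "__main__":
    print(f"average bpw: {average_bpw(example_groups):.2f}")
    print(f"weights: {weight_size_gib(example_groups):.2f} GiB")
```

Because the MLP layers usually hold most of the parameters, the average tends to sit close to the MLP bit width even when the attention layers and lm_head are stored at higher precision.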

Fits roughly 64K tokens of context in 24 GB of VRAM.

Original Model Card:

Prompt format

<|im_start|>system
You are a helpful assistant for generating thinking process.
You are generating thinking process based on the question and existing answer.<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
{answer}

<think>
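The template above can be filled in programmatically. The following is a minimal sketch that only does string formatting; `build_prompt` is a hypothetical helper name, and ending the prompt at `<think>` with no trailing newline is an assumption taken from the template as written.

```python
SYSTEM_PROMPT = (
    "You are a helpful assistant for generating thinking process.\n"
    "You are generating thinking process based on the question and existing answer."
)


def build_prompt(question: str, answer: str) -> str:
    """Fill the prompt format with a question and its existing answer.

    The returned string ends with an open <think> tag so that the model
    continues by generating the thinking process.
    """
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n{answer}\n\n"
        "<think>"
    )


if __name__ == "__main__":
    print(build_prompt("What is 2 + 2?", "4"))
```

The resulting string can be passed as a raw completion prompt to whichever inference backend loads the quantized model; applying a chat template on top of it would duplicate the special tokens.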