GGUF Files for PINDARO-AI-CODE

These are the GGUF files for RthItalia/PINDARO-AI-CODE.

Downloads

GGUF Link	Quantization	Description
Download	Q2_K	Lowest quality
Download	Q3_K_S
Download	IQ3_S	i-quant, preferable over Q3_K_S
Download	IQ3_M	i-quant
Download	Q3_K_M
Download	Q3_K_L
Download	IQ4_XS	i-quant
Download	Q4_K_S	Fast with good quality
Download	Q4_K_M	Recommended: good balance of speed and quality
Download	Q5_K_S
Download	Q5_K_M
Download	Q6_K	Very good quality
Download	Q8_0	Best quality
Download	f16	Unquantized 16-bit weights; rarely worth the size, use a quant
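
To give a feel for the size differences between the entries above, here is a rough sketch that estimates the weight data for the two formats whose bit width is fixed per weight. It assumes the parameter count reported in the model card below (1,100,056,576) and ignores GGUF metadata, the tokenizer, and the few small tensors that are kept at higher precision, so real files will be slightly larger. The function name `estimate_weight_bytes` is hypothetical. The K-quants and i-quants mix several tensor types, so they are left out of this estimate.

```python
def estimate_weight_bytes(n_params: int, bits_per_weight: float) -> int:
    """Rough size of the weight data alone (no metadata, no mixed-precision tensors)."""
    return round(n_params * bits_per_weight / 8)


N_PARAMS = 1_100_056_576  # parameter count reported in the model card

# f16 stores 16 bits per weight. Q8_0 stores blocks of 32 int8 weights plus
# one 16-bit scale, i.e. 34 bytes per 32 weights = 8.5 bits per weight.
for name, bpw in [("f16", 16.0), ("Q8_0", 8.5)]:
    size = estimate_weight_bytes(N_PARAMS, bpw)
    print(f"{name}: ~{size / 2**30:.2f} GiB of weight data")
```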

Note from Flexan

I provide GGUFs and quantizations of publicly available models that do not have a GGUF equivalent available yet, usually for models I deem interesting and wish to try out.

If a quant you need is missing, you can request it in the community tab. Requests to convert another public model also go in the community tab. For questions about the model itself, please refer to the original model repo.

You can find more info about me and what I do here.

MODEL_CARD - PINDARO AI CODE

Date: 2026-03-02
Model path: e:\Pindaro\PINDARO AI CODE

1. Model Identity

Name: PINDARO AI CODE
Family: LLaMA-style causal LM
Intended role: coding assistant
Format support:
- Hugging Face (model.safetensors)
- GGUF F16 (pindaro-f16.gguf)
- GGUF Q4_K_M (pindaro-q4_k_m.gguf)

2. Technical Specs

Architecture: LlamaForCausalLM
model_type: llama
Layers: 22
Hidden size: 2048
Attention heads: 32
KV heads: 4
Intermediate size: 5632
Max context: 2048
Vocab size: 32002
Tensor count in safetensors: 201
Parameter count (computed): 1,100,056,576
Dtype in config: float16
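
The parameter and tensor counts above can be cross-checked from the other specs. The sketch below assumes a standard Llama layout that the card does not state explicitly: no bias terms, untied input and output embeddings, two RMSNorm weights per layer plus one final norm. The function names are hypothetical. Under those assumptions the arithmetic reproduces both reported figures.

```python
def llama_param_count(vocab, hidden, layers, heads, kv_heads, intermediate):
    """Parameter count for a bias-free Llama-style model with untied embeddings (assumed)."""
    head_dim = hidden // heads                 # 2048 // 32 = 64
    kv_dim = kv_heads * head_dim               # grouped-query attention: 4 * 64 = 256
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # q_proj, o_proj + k_proj, v_proj
    mlp = 3 * hidden * intermediate            # gate_proj, up_proj, down_proj
    norms = 2 * hidden                         # input and post-attention RMSNorm
    embeddings = 2 * vocab * hidden            # embed_tokens + lm_head (untied, assumed)
    final_norm = hidden
    return embeddings + layers * (attn + mlp + norms) + final_norm


def llama_tensor_count(layers):
    """9 weight tensors per layer, plus embed_tokens, lm_head and the final norm."""
    return layers * 9 + 3


total = llama_param_count(
    vocab=32002, hidden=2048, layers=22, heads=32, kv_heads=4, intermediate=5632
)
print(f"{total:,} parameters, {llama_tensor_count(22)} tensors")
```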

3. Chat / Prompt Format

Template is aligned to registered special tokens:

<|noesis|> (id 32000)
<|end|> (id 32001)

Configured template:

{{ bos_token }}{% for message in messages %}<|noesis|>
{% if message['role'] == 'system' %}### System
{{ message['content'] }}
{% elif message['role'] == 'user' %}### Question
{{ message['content'] }}
{% elif message['role'] == 'assistant' %}### Answer
{{ message['content'] }}
{% endif %}<|end|>
{% endfor %}{% if add_generation_prompt %}<|noesis|>
### Answer
```
{% endif %}

4. Local Artifact Integrity (SHA256)

model.safetensors: F77C27B8BABF9FCAB83A7DC68BA58934E8C8C031C9F10B4B73E802D4FBFE0CEC
config.json: B37C45060F3E2F5F9B91903C9CCB32F3C21076E809954FDA6C01D987CD8F25CC
generation_config.json: 6FF47E725C0EC6D0F1895670DE7EE68E61A4F99703F6C8E89AEA6AB14EA02DC3
tokenizer.json: 51433F06369AC3E597DFA23A811215E3511B8F86588A830DED72344B76A193EE
tokenizer_config.json: A0567C49A117AF9AF332874CFD333DDD622A09C5E9765131CEEE6344CB22A3DE
tokenizer.model: 9E556AFD44213B6BD1BE2B850EBBBD98F5481437A8021AFAF58EE7FB1818D347
special_tokens_map.json: D7805E093432AFCDE852968CDEBA3DE08A6FE66E77609F4701DECB87FC492F33
added_tokens.json: ECE349D292E246EAC9A9072C1730F023E61567984A828FB0D25DCCB14E3B7592
pindaro-f16.gguf: BDAAEB6FB712E9A4D952082CF415B05C7D076B33786D39063BBFB3A7E5DB2031
pindaro-q4_k_m.gguf: 5F98CC3454774ED5ED80D71A71ADFD0DAFF760FC9EEF0900DDD4F7EDA2E20FEF
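
A downloaded or copied artifact can be checked against the digests above with a few lines of standard-library Python. This is a minimal sketch; `sha256_hex` and `verify` are hypothetical helper names, and the file path in the usage comment is a placeholder for wherever the file lives on your machine.

```python
import hashlib


def sha256_hex(path, chunk_size=1 << 20):
    """Return the uppercase SHA256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def verify(path, expected):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_hex(path) == expected.strip().upper()


# Usage (placeholder path):
# ok = verify("pindaro-q4_k_m.gguf",
#             "5F98CC3454774ED5ED80D71A71ADFD0DAFF760FC9EEF0900DDD4F7EDA2E20FEF")
```

Reading in chunks keeps memory use flat, which matters for the multi-gigabyte f16 file.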

5. Smoke Tests (2026-03-02)

Environment:

Python 3.11.9
Transformers 4.57.3
Torch 2.10.0+cpu

Results:

AutoConfig load: PASS
AutoTokenizer load: PASS
AutoModel load: PASS
Chat-template render: PASS
Template special-token alignment: PASS
Deterministic generation: PASS

Observed non-blocking warning:

Folder name with spaces may trigger a Python module-name warning in some runtimes.

6. Known Issues

Folder-name warning risk

The folder name PINDARO AI CODE contains spaces; some tools emit a warning when they derive a module name from it.

Attention-mask warning in some calls

Because pad_token equals eos_token, the attention mask cannot be inferred reliably from the input IDs; pass attention_mask explicitly for stable behavior.

7. Recommended Next Steps

Optional packaging cleanup

Rename folder to a no-space slug (example: PINDARO_AI_CODE) when compatible with your deployment scripts.

Add coding eval gate

HumanEval pass@1
MBPP subset
Prompt-format adherence checks
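
For the HumanEval item, a scoring helper could look like the sketch below. It uses the unbiased pass@k estimator commonly reported for HumanEval, where n samples are generated per problem and c of them pass the tests; this is an assumption about how the gate might be scored, not a description of an existing harness for this model. The name `pass_at_k` is hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset of the samples contains at least one correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# With k = 1 the estimator reduces to the fraction of correct samples, c / n.
# The benchmark score is the mean of pass_at_k over all problems.
per_problem = [pass_at_k(n=10, c=c, k=1) for c in (0, 3, 10)]
print(sum(per_problem) / len(per_problem))
```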

8. Usage Example

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

path = r"e:\Pindaro\PINDARO AI CODE"
tokenizer = AutoTokenizer.from_pretrained(path, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(path, local_files_only=True, dtype=torch.float16)

messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Write a Python function add(a, b)."},
]

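# return_dict=True also returns the attention_mask, which is then passed
# to generate() explicitly (see Known Issues: pad_token equals eos_token).
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=80, do_sample=False)
# Keep special tokens in the output so the <|noesis|> / <|end|> markers stay visible.
print(tokenizer.decode(outputs[0], skip_special_tokens=False))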

9. Limitations and Safety

No training-data statement is included in this folder.
No official benchmark sheet is included.
Code generation can be plausible but wrong; always run tests.

10. Release Readiness

Current status: READY FOR LOCAL USE.

Packaging/runtime blockers are resolved.
Remaining items are evaluation and packaging polish.
