update README: add v7 headline results, fix all numbers, add Zenodo v2 + arXiv badge, add GGUF usage
README.md
CHANGED
@@ -15,69 +15,126 @@ tags:
-VectraYX-Nano is a 42M-parameter Spanish cybersecurity language model trained from scratch with curriculum learning and native Model Context Protocol (MCP) tool use.
-| VectraYX-Nano
-At ratio 1:21 (2,801 tool-use examples), Nano 42M achieves B4=0.145 ± 0.046 and
-Base 260M achieves B4=0.580.
-import torch
-config_path = hf_hub_download("vectrayx/vectrayx-nano", "configs/nano.json")
-**Authors:** Juan S. Santillana (Globant)
-**arXiv ID:** `2605.13989`
-@

- mcp
- curriculum-learning
- from-scratch
- arxiv:2605.13989
---

# VectraYX-Nano

VectraYX-Nano is a 42M-parameter Spanish cybersecurity language model trained **from scratch** with curriculum learning and native [Model Context Protocol (MCP)](https://modelcontextprotocol.io) tool use. It is, to our knowledge, the first published Spanish-native cybersecurity LLM with end-to-end MCP integration.

[arXiv:2605.13989](https://arxiv.org/abs/2605.13989)
[DOI: 10.5281/zenodo.20122226](https://doi.org/10.5281/zenodo.20122226)

- **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model with Curriculum Learning and Native Tool Use](https://arxiv.org/abs/2605.13989)
- **Repository:** [vectrayx/vectrayx-nano-paper](https://github.com/vectrayx/vectrayx-nano-paper)
- **arXiv DOI:** https://doi.org/10.48550/arXiv.2605.13989


---

## Released Model: VectraYX-Nano v7 (Headline)

**VectraYX-Nano v7** is the released headline model. It uses the same 42M architecture and three-phase curriculum pre-training as the v2 bootstrap-ablation reference, with the SFT corpus rebalanced to a tool-use ratio of 1:21 (vs. 1:211 in v2). This single change raises B4 (tool selection) from 0.000 to **0.230 ± 0.052** across N=4 seeds while retaining strong CVE recall (B1=0.332±0.005) and conversational quality (B5=0.725±0.130).

Files in this repo:

| File | Description |
|---|---|
| `nano_sft_v7_s42.pt` | **Nano v7 seed 42 (recommended for inference)** |
| `nano_sft_v5.pt` | Nano v2 (mixed SFT, bootstrap-ablation reference) |
| `vectrayx-nano-f16.gguf` | **F16 GGUF (run with llama.cpp / Ollama)** |
| `lora/nano_lora_mini_s{42,7,13,23}.pt` | LoRA adapters (tool-use density study, ratio 1:21) |
| `tokenizer/vectrayx_bpe.model` | BPE-16384 tokenizer |
| `configs/nano.json` | Nano 42M architecture config |
| `configs/base.json` | Base 260M architecture config |


---

## Key Results (VectraYX-Bench, N=4 seeds)

| Model | Params | B1 KW | B2 F1† | B3 TM | B4 Tool | B5 Chat |
|---|---|---|---|---|---|---|
| **VectraYX-Nano v7** *(headline)* | 42M | **0.332±0.005** | — | — | **0.230±0.052** | 0.725±0.130 |
| VectraYX-Nano v2 *(bootstrap ablation)* | 42M | 0.226±0.065 | 0.199±0.004 | 0.029±0.035 | 0.000 | **0.775±0.043** |
| Nano LoRA mini (ratio 1:21, N=4) | 42M | 0.011±0.004 | 0.201±0.002 | 0.021±0.012 | 0.145±0.046 | 0.575±0.043 |
| SmolLM2-135M + LoRA-32 | 135M | 0.334 | 0.225 | 0.143 | 0.160 | 0.800 |
| VectraYX-Base 260M | 260M | 0.325 | 0.220 | 0.114 | 0.000 | 0.800 |
| Base 260M LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | 0.203±0.002 | — | 0.445±0.201 | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.695 | 0.686 | 0.600 | 0.800 |
| VectraYX-Pro 7B | 7B | 0.335 | 0.815 | 0.686 | 0.880 | 0.800 |
| GPT-4o *(frontier reference)* | — | 0.333 | 0.110 | 0.520 | 0.615 | 0.631 |

†B2 scores in this revision are a benchmark artifact caused by a key mismatch in the evaluation harness; a fix is queued.

**B5 inversion:** Nano v7 (0.725±0.130) and Nano v2 (0.775±0.043) both **exceed GPT-4o (0.631)** on the 314-prompt held-out chat suite. The attributed cause is the register-matched bootstrap corpus, which makes conversational Spanish the model's default register.


---

## Key Findings

**1. Loss-vs-register inversion.** A higher-perplexity bootstrap corpus (OpenSubtitles-ES) yields *better* post-SFT chat behavior than a lower-perplexity alternative (mC4-ES). At the nano scale, the bootstrap corpus dictates the model's default response style; SFT cannot fully overwrite it.

**2. Tool use is corpus-density-gated, not capacity-gated.** The B4=0.000 floor in the mixed SFT (ratio 1:211) is a corpus-density artifact. Rebalancing to 1:21 (2,801 tool-use examples) shifts the first-token prior to `<|tool_call|>` and raises B4 to 0.230±0.052 at 42M, without retraining the backbone.


---

## Inference: llama.cpp / Ollama (GGUF)

```bash
# With Ollama
ollama run hf.co/jsantillana/vectrayx-nano:vectrayx-nano-f16.gguf

# With llama.cpp
./llama-cli -m vectrayx-nano-f16.gguf \
  --chat-template llama3 \
  -p "<|system|>Eres VectraYX, asistente experto en ciberseguridad para LATAM.<|end|>" \
  -i
```

Runs at 6–10 tok/s on Raspberry Pi 4 and 60–100 tok/s on a laptop CPU.

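For scripted use, the GGUF file can also be loaded from Python. The following is a minimal sketch, assuming the `llama-cpp-python` and `huggingface_hub` packages are installed and that exactly one file in the repository matches `*f16.gguf`. The `build_messages` helper and the sample question are illustrative and not part of the released code; the system prompt is the one from the llama.cpp command above.

```python
SYSTEM_PROMPT = "Eres VectraYX, asistente experto en ciberseguridad para LATAM."


def build_messages(user_prompt, system_prompt=SYSTEM_PROMPT):
    """Return an OpenAI-style message list (hypothetical helper)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def main():
    # Imported lazily so build_messages works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="jsantillana/vectrayx-nano",
        filename="*f16.gguf",  # assumption: glob that matches the F16 GGUF file
        verbose=False,
    )
    reply = llm.create_chat_completion(
        messages=build_messages("¿Qué es un ataque de phishing?"),
        max_tokens=128,
    )
    print(reply["choices"][0]["message"]["content"])
```

Calling `main()` downloads the GGUF on first use. Which chat template is applied depends on the GGUF metadata, so replies may differ from the `--chat-template llama3` invocation.
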

---

## Inference: PyTorch

```python
from huggingface_hub import hf_hub_download
import torch, json, sys

sys.path.insert(0, ".")  # needs training/transformer.py from vectrayx-paper-code

ckpt = hf_hub_download("jsantillana/vectrayx-nano", "nano_sft_v7_s42.pt")
tok  = hf_hub_download("jsantillana/vectrayx-nano", "tokenizer/vectrayx_bpe.model")
cfg  = hf_hub_download("jsantillana/vectrayx-nano", "configs/nano.json")
```

Full inference script at [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code).

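The PyTorch snippet only downloads the files. As a minimal sketch of inspecting them before wiring up the model class, the code below assumes the checkpoint is a standard `torch.save` file holding either a bare state dict or a dict with a `"model"` entry, and that `configs/nano.json` is plain JSON. Both are assumptions about the release, and `count_parameters` and `inspect_release` are hypothetical helpers.

```python
import json


def count_parameters(state_dict):
    """Sum element counts over every tensor-like value in a state dict."""
    return sum(t.numel() for t in state_dict.values() if hasattr(t, "numel"))


def inspect_release(repo_id="jsantillana/vectrayx-nano"):
    # Imported lazily so count_parameters can be used without these packages.
    import torch
    from huggingface_hub import hf_hub_download

    ckpt_path = hf_hub_download(repo_id, "nano_sft_v7_s42.pt")
    cfg_path = hf_hub_download(repo_id, "configs/nano.json")

    with open(cfg_path, encoding="utf-8") as f:
        config = json.load(f)

    # map_location="cpu" avoids needing a GPU just to read the weights.
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    # Assumption: weights are stored bare or under a "model" key.
    state_dict = checkpoint.get("model", checkpoint)

    print("config keys:", sorted(config))
    print(f"parameters: {count_parameters(state_dict) / 1e6:.2f}M")
    return config, state_dict
```

If the model ties its embedding and output weights, the tied tensor can appear under two keys in the state dict, so the printed total may exceed the parameter count reported for the model.
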

---

## Training Details

| Component | Details |
|---|---|
| Parameters | 41.95M |
| Architecture | Transformer decoder, GQA (8q/2kv), QK-Norm, RMSNorm, SwiGLU, RoPE, z-loss |
| Tokenizer | BPE-16384, byte-fallback, 50/50 conv/tech balance |
| Pre-training | 170M tokens, 3-phase curriculum with 25% replay buffer |
| SFT (v7) | 13K OASST1-ES + 4K CVE Q&A + 2.8K tool-use (ratio 1:21) |
| Hardware | GCP L4 24GB (pre-training) + AWS g4dn.xlarge T4 16GB (multi-seed SFT) |
| Cost | ~$29 USD total (corpus + training) |

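The `GQA (8q/2kv)` entry in the Training Details table means that eight query heads share two key/value heads. A minimal sketch of that head mapping follows, assuming the common contiguous grouping in which consecutive query heads share one key/value head; the released code may group heads differently, and `kv_head_for_query` is a hypothetical name.

```python
def kv_head_for_query(q_head, n_q_heads=8, n_kv_heads=2):
    """Map a query-head index to the key/value head it attends with."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise IndexError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # 4 query heads per KV head here
    return q_head // group_size


mapping = [kv_head_for_query(h) for h in range(8)]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With two key/value heads instead of eight, the key/value cache is a quarter of the size it would be under standard multi-head attention with the same head dimension.
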

---

## Citation

```bibtex
@misc{santillana2026vectrayx,
  title         = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
                   with Curriculum Learning and Native Tool Use},
  author        = {Santillana, Juan S.},
  year          = {2026},
  eprint        = {2605.13989},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2605.13989}
}
```