---
language:
- es
license: apache-2.0
pipeline_tag: text-generation
tags:
- cybersecurity
- spanish
- from-scratch
- curriculum-learning
- arxiv:2605.13989
---

# VectraYX-Base 260M

VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained **from scratch** using the same three-phase curriculum and replay-buffer recipe as [VectraYX-Nano](https://huggingface.co/jsantillana/vectrayx-nano), scaled to a mid-tier architecture (`d_model=1024`, `n_layers=16`).

[![arXiv](https://img.shields.io/badge/arXiv-2605.13989-b31b1b.svg)](https://arxiv.org/abs/2605.13989)
[![Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.20122226.svg)](https://doi.org/10.5281/zenodo.20122226)

- **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model](https://arxiv.org/abs/2605.13989)
- **Nano model:** [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano)
- **Code:** [jsantillana/vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code)

---

## Results (VectraYX-Bench, single seed)

| Model | Params | B1 KW | B3 TM | B4 Tool | B5 Chat |
|---|---|---|---|---|---|
| VectraYX-Nano v7 (N=4) | 42M | 0.332±0.005 | — | **0.230±0.052** | 0.725±0.130 |
| **VectraYX-Base 260M** | 260M | **0.325** | **0.114** | 0.000 | **0.800** |
| Base + LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | — | **0.445±0.201** | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.686 | 0.600 | 0.800 |

The B4 (tool use) score of 0.000 after mixed SFT is a **corpus-density artifact**: with the LoRA mini fine-tune at ratio 1:21, the same Base model reaches B4=0.445±0.201.

---

## Architecture

| Component | Value |
|---|---|
| Parameters | 260M |
| Layers | 16 |
| Hidden dim | 1024 |
| Attention heads | 16 (GQA 16q/4kv) |
| FFN | SwiGLU |
| Positional encoding | RoPE |
| Normalization | RMSNorm + QK-Norm |
| Tokenizer | BPE-16384 (same as Nano) |

The architecture matches `configs/base.json` in [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code).
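
As a rough sanity check on the table above, the sketch below estimates the parameter count from the listed dimensions. Several values are assumptions that this card does not state: an FFN hidden size of `d_ff=4096`, tied input/output embeddings, no bias terms, and one gain vector per RMSNorm and QK-Norm. The function name `estimate_params` is hypothetical; the authoritative values live in `configs/base.json`.

```python
def estimate_params(
    d_model=1024,        # hidden dim (from the table)
    n_layers=16,         # layers (from the table)
    n_q_heads=16,        # query heads (from the table)
    n_kv_heads=4,        # key/value heads under GQA (from the table)
    vocab_size=16384,    # BPE-16384 tokenizer (from the table)
    d_ff=4096,           # ASSUMPTION: FFN hidden size is not given in this card
    tie_embeddings=True  # ASSUMPTION: output head shares the embedding matrix
):
    """Back-of-the-envelope parameter count for a GQA + SwiGLU decoder."""
    head_dim = d_model // n_q_heads   # 64 with the values above
    kv_dim = n_kv_heads * head_dim    # 256: K and V are 4x narrower than Q

    # Attention: Q and O are d_model x d_model, K and V are d_model x kv_dim.
    attn = 2 * d_model * d_model + 2 * d_model * kv_dim

    # SwiGLU uses three projections: gate, up, and down.
    ffn = 3 * d_model * d_ff

    # ASSUMPTION: two RMSNorm gains per layer plus QK-Norm gains of size head_dim.
    norms = 2 * d_model + 2 * head_dim

    embed = vocab_size * d_model
    head = 0 if tie_embeddings else vocab_size * d_model
    final_norm = d_model

    return n_layers * (attn + ffn + norms) + embed + head + final_norm


if __name__ == "__main__":
    total = estimate_params()
    print(f"estimated parameters: {total / 1e6:.1f}M")
```

Under these assumptions the estimate lands at about 260M, which agrees with the stated size. With GQA at 16q/4kv, each group of four query heads shares one key/value head, so the K and V projections (and the KV cache) are a quarter of their full multi-head size.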

---

## Files

| File | Description |
|---|---|
| `base_sft_v1_s42.pt` | Base 260M post-SFT, seed 42 (~3.1 GB) |
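
The arithmetic below relates the listed file size to the parameter count. It is only a sketch: the helper name `checkpoint_stats` is hypothetical, and the approximate 3.1 GB and 260M figures from this card are taken at face value.

```python
def checkpoint_stats(file_bytes=3.1e9, n_params=260e6):
    """Compare the checkpoint size on disk with the size of the raw weights."""
    return {
        # Average bytes stored per model parameter.
        "bytes_per_param": file_bytes / n_params,
        # Size of the weights alone at common precisions, in GB (1e9 bytes).
        "fp32_weights_gb": n_params * 4 / 1e9,
        "fp16_weights_gb": n_params * 2 / 1e9,
    }


if __name__ == "__main__":
    for name, value in checkpoint_stats().items():
        print(f"{name}: {value:.2f}")
```

The file works out to roughly 12 bytes per parameter, more than fp32 weights alone (about 1.04 GB) would need. This card does not say what else the checkpoint contains, so it may be worth inspecting its top-level keys after loading it before assuming it is a bare state dict.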

Training ran on AWS SageMaker `ml.g5.xlarge` (NVIDIA A10G, 24 GB) and took about 11 wall-clock hours at a cost of roughly $11 USD.

---

## Citation

```bibtex
@misc{santillana2026vectrayx,
  title     = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
               with Curriculum Learning and Native Tool Use},
  author    = {Santillana, Juan S.},
  year      = {2026},
  eprint    = {2605.13989},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url       = {https://arxiv.org/abs/2605.13989}
}
```