---
language:
- es
license: apache-2.0
pipeline_tag: text-generation
tags:
- cybersecurity
- spanish
- from-scratch
- curriculum-learning
- arxiv:2605.13989
---
# VectraYX-Base 260M
VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained **from scratch** using the same three-phase curriculum and replay-buffer recipe as [VectraYX-Nano](https://huggingface.co/jsantillana/vectrayx-nano), scaled to a mid-tier architecture (`d_model=1024`, `n_layers=16`).
- **DOI:** [10.5281/zenodo.20122226](https://doi.org/10.5281/zenodo.20122226)
- **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model](https://arxiv.org/abs/2605.13989)
- **Nano model:** [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano)
- **Code:** [jsantillana/vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code)
---
## Results (VectraYX-Bench, single seed)
| Model | Params | B1 KW | B3 TM | B4 Tool | B5 Chat |
|---|---|---|---|---|---|
| VectraYX-Nano v7 (N=4) | 42M | 0.332±0.005 | — | **0.230±0.052** | 0.725±0.130 |
| **VectraYX-Base 260M** | 260M | **0.325** | **0.114** | 0.000 | **0.800** |
| Base + LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | — | **0.445±0.201** | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.686 | 0.600 | 0.800 |

The B4 score of 0.000 after mixed SFT is a **corpus-density artifact**: with the LoRA mini fine-tune at ratio 1:21, Base reaches B4 = 0.445±0.201.
---
## Architecture
| Component | Value |
|---|---|
| Parameters | 260M |
| Layers | 16 |
| Hidden dim | 1024 |
| Attention heads | 16 (GQA 16q/4kv) |
| FFN | SwiGLU |
| Positional encoding | RoPE |
| Normalization | RMSNorm + QK-Norm |
| Tokenizer | BPE-16384 (same as Nano) |

The architecture matches `configs/base.json` in [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code).
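
As a rough sanity check on the table above, the parameter count can be estimated from the listed dimensions. The sketch below is not taken from the paper code. It assumes an FFN hidden size of 4096, tied input/output embeddings, and bias-free linear layers, none of which are stated in this card, and it ignores the small RMSNorm and QK-Norm weight vectors. The names `ModelConfig` and `estimate_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Values taken from the architecture table in this card.
    d_model: int = 1024
    n_layers: int = 16
    n_q_heads: int = 16
    n_kv_heads: int = 4
    vocab_size: int = 16384
    # ASSUMPTIONS (not stated in the card): FFN hidden size and weight tying.
    d_ff: int = 4096
    tied_embeddings: bool = True


def estimate_params(cfg: ModelConfig) -> int:
    """Estimate weight-matrix parameters, ignoring norm weights and biases."""
    head_dim = cfg.d_model // cfg.n_q_heads
    kv_dim = cfg.n_kv_heads * head_dim  # GQA: fewer key/value heads than query heads

    # Attention: W_q and W_o are d_model x d_model; W_k and W_v are d_model x kv_dim.
    attn = 2 * cfg.d_model * cfg.d_model + 2 * cfg.d_model * kv_dim

    # SwiGLU uses three projections: gate, up, and down.
    ffn = 3 * cfg.d_model * cfg.d_ff

    embeddings = cfg.vocab_size * cfg.d_model
    if not cfg.tied_embeddings:
        embeddings *= 2  # separate output projection

    return cfg.n_layers * (attn + ffn) + embeddings


if __name__ == "__main__":
    print(f"{estimate_params(ModelConfig()):,}")  # 260,046,848
```

Under these assumptions the estimate lands at about 260M, consistent with the reported size. With untied embeddings the same settings would give about 277M, so the real `configs/base.json` should be consulted before relying on either figure.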
---
## Files
| File | Description |
|---|---|
| `base_sft_v1_s42.pt` | Base 260M post-SFT, seed 42 (~3.1 GB) |

Training ran on an AWS SageMaker `ml.g5.xlarge` instance (NVIDIA A10G, 24 GB) and took about 11 wall-clock hours, at a cost of roughly $11 USD.
---
## Citation
```bibtex
@misc{santillana2026vectrayx,
title = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
with Curriculum Learning and Native Tool Use},
author = {Santillana, Juan S.},
year = {2026},
eprint = {2605.13989},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2605.13989}
}
```