| --- |
| language: |
| - es |
| license: apache-2.0 |
| pipeline_tag: text-generation |
| tags: |
| - cybersecurity |
| - spanish |
| - from-scratch |
| - curriculum-learning |
| - arxiv:2605.13989 |
| --- |
| |
| # VectraYX-Base 260M |
|
|
| VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained **from scratch** using the same three-phase curriculum and replay-buffer recipe as [VectraYX-Nano](https://huggingface.co/jsantillana/vectrayx-nano), scaled to a mid-tier architecture (`d_model=1024`, `n_layers=16`). |
|
|
| [arXiv:2605.13989](https://arxiv.org/abs/2605.13989) |
| [DOI: 10.5281/zenodo.20122226](https://doi.org/10.5281/zenodo.20122226) |
|
|
| - **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model](https://arxiv.org/abs/2605.13989) |
| - **Nano model:** [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano) |
| - **Code:** [jsantillana/vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code) |
|
|
| --- |
|
|
| ## Results (VectraYX-Bench; single seed unless N is given) |
|
|
| | Model | Params | B1 KW | B3 TM | B4 Tool | B5 Chat | |
| |---|---|---|---|---|---| |
| | VectraYX-Nano v7 (N=4) | 42M | 0.332±0.005 | — | **0.230±0.052** | 0.725±0.130 | |
| | **VectraYX-Base 260M** | 260M | **0.325** | **0.114** | 0.000 | **0.800** | |
| | Base + LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | — | **0.445±0.201** | 0.600 | |
| | VectraYX-Pro 3B | 3.2B | 0.341 | 0.686 | 0.600 | 0.800 | |
|
|
| The B4 = 0.000 score under mixed SFT is a **corpus-density artifact**: at ratio 1:21 (LoRA mini, N=4), Base reaches B4 = 0.445±0.201. |
|
|
| --- |
|
|
| ## Architecture |
|
|
| | Component | Value | |
| |---|---| |
| | Parameters | 260M | |
| | Layers | 16 | |
| | Hidden dim | 1024 | |
| | Attention heads | 16 (GQA 16q/4kv) | |
| | FFN | SwiGLU | |
| | Positional encoding | RoPE | |
| | Normalization | RMSNorm + QK-Norm | |
| | Tokenizer | BPE-16384 (same as Nano) | |
|
|
| The architecture matches `configs/base.json` in [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code). |
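
As a rough sanity check on the table above, the sketch below derives the head dimension, the GQA group size, and an approximate parameter count from the listed values. The FFN hidden size (`d_ff=4096`) and tied input/output embeddings are assumptions made only for this illustration; this card does not state them, and `BaseConfig` and `estimate_params` are hypothetical names, not part of the released code. `configs/base.json` remains the authoritative source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseConfig:
    # Values taken from the architecture table in this card.
    d_model: int = 1024
    n_layers: int = 16
    n_q_heads: int = 16
    n_kv_heads: int = 4
    vocab_size: int = 16384
    # ASSUMPTIONS (not stated in this card): FFN width and weight tying.
    d_ff: int = 4096
    tie_embeddings: bool = True


def estimate_params(cfg: BaseConfig) -> int:
    """Count weight-matrix parameters; norm gains and biases are ignored."""
    head_dim = cfg.d_model // cfg.n_q_heads
    kv_dim = cfg.n_kv_heads * head_dim
    attn = 2 * cfg.d_model * cfg.d_model  # query and output projections
    attn += 2 * cfg.d_model * kv_dim      # key and value projections (GQA)
    ffn = 3 * cfg.d_model * cfg.d_ff      # SwiGLU: gate, up, and down matrices
    embed = cfg.vocab_size * cfg.d_model
    if not cfg.tie_embeddings:
        embed *= 2                        # separate output head
    return cfg.n_layers * (attn + ffn) + embed


cfg = BaseConfig()
print("head dim:", cfg.d_model // cfg.n_q_heads)
print("query heads per KV head:", cfg.n_q_heads // cfg.n_kv_heads)
print(f"estimated parameters: {estimate_params(cfg) / 1e6:.1f}M")
```

Under these assumed values the estimate lands close to the stated 260M, but that agreement does not confirm them; other combinations of FFN width and embedding tying can give similar totals.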
|
|
| --- |
|
|
| ## Files |
|
|
| | File | Description | |
| |---|---| |
| | `base_sft_v1_s42.pt` | Base 260M post-SFT, seed 42 (~3.1 GB) | |
|
|
| Training ran on an AWS SageMaker `ml.g5.xlarge` instance (NVIDIA A10G, 24 GB) for ~11 wall-clock hours at a cost of ~$11 USD. |
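
The file size in the table above can be compared against the parameter count with a short calculation. The sketch below assumes "GB" means 10^9 bytes and uses the rounded figures from this card, so the result is approximate. The ratio comes out well above the 4 bytes per parameter that float32 weights alone would need, which suggests the `.pt` file may hold more than bare weights (for example optimizer state). That is an inference from the sizes, not something this card states, so inspect the checkpoint's keys after loading.

```python
# Approximate figures quoted in this card.
CHECKPOINT_BYTES = 3.1e9  # "~3.1 GB", assuming 1 GB = 10**9 bytes
N_PARAMS = 260e6          # "260M" parameters

# Storage cost per parameter for two common weight dtypes.
BYTES_PER_DTYPE = {"float32": 4, "bfloat16": 2}


def bytes_per_param(checkpoint_bytes: float, n_params: float) -> float:
    """Average number of bytes the checkpoint stores per model parameter."""
    return checkpoint_bytes / n_params


ratio = bytes_per_param(CHECKPOINT_BYTES, N_PARAMS)
print(f"checkpoint: {ratio:.1f} bytes per parameter")

for name, size in BYTES_PER_DTYPE.items():
    weights_gb = N_PARAMS * size / 1e9
    print(f"{name} weights alone: ~{weights_gb:.2f} GB")
```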
|
|
| --- |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{santillana2026vectrayx, |
| title = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model |
| with Curriculum Learning and Native Tool Use}, |
| author = {Santillana, Juan S.}, |
| year = {2026}, |
| eprint = {2605.13989}, |
| archivePrefix = {arXiv}, |
| primaryClass = {cs.CL}, |
| url = {https://arxiv.org/abs/2605.13989} |
| } |
| ``` |
|
|