---
language:
- es
license: apache-2.0
pipeline_tag: text-generation
tags:
- cybersecurity
- spanish
- from-scratch
- curriculum-learning
- arxiv:2605.13989
---

# VectraYX-Base 260M

VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained **from scratch** using the same three-phase curriculum and replay-buffer recipe as [VectraYX-Nano](https://huggingface.co/jsantillana/vectrayx-nano), scaled to a mid-tier architecture (`d_model=1024`, `n_layers=16`).

[![arXiv](https://img.shields.io/badge/arXiv-2605.13989-b31b1b.svg)](https://arxiv.org/abs/2605.13989)
[![Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.20122226.svg)](https://doi.org/10.5281/zenodo.20122226)

- **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model](https://arxiv.org/abs/2605.13989)
- **Nano model:** [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano)
- **Code:** [jsantillana/vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code)

---

## Results (VectraYX-Bench, single seed)

| Model | Params | B1 KW | B3 TM | B4 Tool | B5 Chat |
|---|---|---|---|---|---|
| VectraYX-Nano v7 (N=4) | 42M | 0.332±0.005 | — | **0.230±0.052** | 0.725±0.130 |
| **VectraYX-Base 260M** | 260M | **0.325** | **0.114** | 0.000 | **0.800** |
| Base + LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | — | **0.445±0.201** | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.686 | 0.600 | 0.800 |

The B4=0.000 score on mixed SFT is a **corpus-density artifact**: at ratio 1:21 (LoRA mini), Base reaches B4=0.445±0.201.

---

## Architecture

| Component | Value |
|---|---|
| Parameters | 260M |
| Layers | 16 |
| Hidden dim | 1024 |
| Attention heads | 16 (GQA 16q/4kv) |
| FFN | SwiGLU |
| Positional encoding | RoPE |
| Normalization | RMSNorm + QK-Norm |
| Tokenizer | BPE-16384 (same as Nano) |

The architecture config matches `configs/base.json` in [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code).

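
As a rough sanity check, the 260M figure can be approximately reproduced from the dimensions in the Architecture table. The sketch below is illustrative only: it assumes an FFN inner width of 4096, tied input/output embeddings, and no bias terms, none of which is stated in this card, and it ignores the small RMSNorm and QK-Norm weights. `estimate_params` is a hypothetical helper, not part of the released code.

```python
# Rough parameter count from the Architecture table.
# ASSUMPTIONS (not stated in this card): d_ff=4096, tied input/output
# embeddings, no bias terms; RMSNorm and QK-Norm weights are ignored.

def estimate_params(d_model=1024, n_layers=16, n_q_heads=16, n_kv_heads=4,
                    vocab_size=16384, d_ff=4096, tied_embeddings=True):
    head_dim = d_model // n_q_heads          # 64 for the listed config
    kv_dim = n_kv_heads * head_dim           # 256 under GQA 16q/4kv
    # Attention: Q and output projections are d_model x d_model;
    # K and V projections are d_model x kv_dim because KV heads are shared.
    attn = 2 * d_model * d_model + 2 * d_model * kv_dim
    # SwiGLU uses three weight matrices: gate, up, and down.
    ffn = 3 * d_model * d_ff
    # Token embedding table; doubled if the output head is not tied to it.
    embed = vocab_size * d_model
    if not tied_embeddings:
        embed *= 2
    return n_layers * (attn + ffn) + embed


total = estimate_params()
print(f"{total / 1e6:.1f}M parameters")  # prints "260.0M parameters"
```

Under these assumptions the estimate lands at about 260.0M, and untying the embeddings would add roughly 16.8M more. The authoritative values are the ones in `configs/base.json`.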

---

## Files

| File | Description |
|---|---|
| `base_sft_v1_s42.pt` | Base 260M post-SFT, seed 42 (~3.1 GB) |

Training ran on AWS SageMaker `ml.g5.xlarge` (NVIDIA A10G 24 GB), ~11 wall-clock hours, ~$11 USD.

---

## Citation

```bibtex
@misc{santillana2026vectrayx,
  title         = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model with Curriculum Learning and Native Tool Use},
  author        = {Santillana, Juan S.},
  year          = {2026},
  eprint        = {2605.13989},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL},
  url           = {https://arxiv.org/abs/2605.13989}
}
```