---
language:
- es
license: apache-2.0
pipeline_tag: text-generation
tags:
- cybersecurity
- spanish
- from-scratch
- curriculum-learning
- arxiv:2605.13989
---
# VectraYX-Base 260M
VectraYX-Base is a 260M-parameter Spanish cybersecurity language model trained **from scratch** using the same three-phase curriculum and replay-buffer recipe as [VectraYX-Nano](https://huggingface.co/jsantillana/vectrayx-nano), scaled to a mid-tier architecture (`d_model=1024`, `n_layers=16`).
[![arXiv](https://img.shields.io/badge/arXiv-2605.13989-b31b1b.svg)](https://arxiv.org/abs/2605.13989)
[![Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.20122226.svg)](https://doi.org/10.5281/zenodo.20122226)
- **Paper:** [VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model](https://arxiv.org/abs/2605.13989)
- **Nano model:** [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano)
- **Code:** [jsantillana/vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code)
---
## Results (VectraYX-Bench, single seed)
| Model | Params | B1 KW | B3 TM | B4 Tool | B5 Chat |
|---|---|---|---|---|---|
| VectraYX-Nano v7 (N=4) | 42M | 0.332±0.005 | — | **0.230±0.052** | 0.725±0.130 |
| **VectraYX-Base 260M** | 260M | **0.325** | **0.114** | 0.000 | **0.800** |
| Base + LoRA mini (ratio 1:21, N=4) | 260M | 0.019±0.003 | — | **0.445±0.201** | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.686 | 0.600 | 0.800 |
The B4 score of 0.000 after mixed SFT is a **corpus-density artifact**: with the LoRA mini adapter at ratio 1:21, Base reaches B4=0.445±0.201 (N=4).
---
## Architecture
| Component | Value |
|---|---|
| Parameters | 260M |
| Layers | 16 |
| Hidden dim | 1024 |
| Attention heads | 16 (GQA 16q/4kv) |
| FFN | SwiGLU |
| Positional encoding | RoPE |
| Normalization | RMSNorm + QK-Norm |
| Tokenizer | BPE-16384 (same as Nano) |
The architecture matches `configs/base.json` in [vectrayx-paper-code](https://huggingface.co/jsantillana/vectrayx-paper-code).
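
As a rough sanity check on the table above, the sketch below estimates the parameter count from the listed dimensions. The FFN width (`D_FF = 4096`), tied input/output embeddings, and bias-free linear layers are assumptions that do not appear in this model card; norm weights are omitted because they are negligible. Under these assumptions the estimate lands near 260M, which is consistent with the stated size but does not confirm the actual config.

```python
# Values taken from the architecture table.
D_MODEL = 1024
N_LAYERS = 16
N_Q_HEADS = 16
N_KV_HEADS = 4
VOCAB_SIZE = 16384

# Assumptions (NOT stated in the model card).
D_FF = 4096             # assumed SwiGLU hidden width
TIED_EMBEDDINGS = True  # assumed: output head shares the embedding matrix


def count_parameters() -> dict:
    """Estimate weight-matrix parameters, ignoring norms and biases."""
    head_dim = D_MODEL // N_Q_HEADS   # 64
    kv_dim = N_KV_HEADS * head_dim    # 256 (GQA: 4 key/value heads)

    # Q and O projections are d_model x d_model; K and V are d_model x kv_dim.
    attn = 2 * D_MODEL * D_MODEL + 2 * D_MODEL * kv_dim
    # SwiGLU uses three matrices: gate, up, and down projections.
    ffn = 3 * D_MODEL * D_FF
    embed = VOCAB_SIZE * D_MODEL
    head = 0 if TIED_EMBEDDINGS else embed

    total = N_LAYERS * (attn + ffn) + embed + head
    return {
        "attn_per_layer": attn,
        "ffn_per_layer": ffn,
        "embedding": embed,
        "total": total,
    }


counts = count_parameters()
print(f"{counts['total'] / 1e6:.1f}M")  # prints 260.0M
```

Setting `TIED_EMBEDDINGS = False` adds another 16.8M parameters for a separate output head, which is one way to see how sensitive the total is to these assumptions.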
---
## Files
| File | Description |
|---|---|
| `base_sft_v1_s42.pt` | Base 260M post-SFT, seed 42 (~3.1 GB) |
Training ran on AWS SageMaker `ml.g5.xlarge` (NVIDIA A10G 24GB), ~11 wall-clock hours, ~$11 USD.
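
The checkpoint listed in the table above can be fetched and inspected with a sketch like the following. The repo id `jsantillana/vectrayx-base` is an assumption inferred from the Nano repo name, and the wrapper keys in `CANDIDATE_KEYS` are hypothetical: the card does not say whether the `.pt` file holds a bare state dict or a dict that wraps one. Instantiating the model itself requires the classes from the paper code repository, which are not shown here.

```python
from typing import Any, Mapping

REPO_ID = "jsantillana/vectrayx-base"  # assumed repo id, not confirmed by the card
FILENAME = "base_sft_v1_s42.pt"        # from the Files table

# Hypothetical keys under which a training script might store the weights.
CANDIDATE_KEYS = ("model", "state_dict", "model_state_dict")


def extract_state_dict(obj: Any) -> Any:
    """Return the nested state dict if obj wraps one, else obj unchanged."""
    if isinstance(obj, Mapping):
        for key in CANDIDATE_KEYS:
            inner = obj.get(key)
            if isinstance(inner, Mapping):
                return inner
    return obj


def load_checkpoint(repo_id: str = REPO_ID, filename: str = FILENAME) -> Any:
    """Download the checkpoint (~3.1 GB) and load it on CPU."""
    # Imported lazily so the helper above works without these packages.
    from huggingface_hub import hf_hub_download
    import torch

    path = hf_hub_download(repo_id=repo_id, filename=filename)
    # weights_only=True restricts unpickling to tensors and basic containers;
    # it may fail if the file stores custom Python objects.
    obj = torch.load(path, map_location="cpu", weights_only=True)
    return extract_state_dict(obj)
```

Calling `load_checkpoint()` and printing the first few keys and tensor shapes is a quick way to check how the parameters are named before wiring them into the model definition from the code repository.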
---
## Citation
```bibtex
@misc{santillana2026vectrayx,
title = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
with Curriculum Learning and Native Tool Use},
author = {Santillana, Juan S.},
year = {2026},
eprint = {2605.13989},
archivePrefix = {arXiv},
primaryClass = {cs.CL},
url = {https://arxiv.org/abs/2605.13989}
}
```