	---
	base_model: Qwen/Qwen2.5-3B-Instruct
	language:
	- es
	license: apache-2.0
	library_name: peft
	pipeline_tag: text-generation
	tags:
	- cybersecurity
	- spanish
	- lora
	- peft
	- qwen2.5
	- arxiv:2605.13989
	---

	# VectraYX-Pro 3B

	VectraYX-Pro 3B is a LoRA-64 adapter for [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) fine-tuned on the VectraYX Spanish cybersecurity SFT corpus (~93,500 examples). It is part of the VectraYX model family presented in the paper [arXiv:2605.13989](https://arxiv.org/abs/2605.13989).

	[![arXiv](https://img.shields.io/badge/arXiv-2605.13989-b31b1b.svg)](https://arxiv.org/abs/2605.13989)
	[![Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.20122226.svg)](https://doi.org/10.5281/zenodo.20122226)

	> This repo contains only the LoRA adapter weights (~457 MB). You need to load them on top of [Qwen/Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct).

	- Paper: [VectraYX-Nano arXiv:2605.13989](https://arxiv.org/abs/2605.13989)
	- Nano 42M (from-scratch headline model): [jsantillana/vectrayx-nano](https://huggingface.co/jsantillana/vectrayx-nano)
	- Base 260M: [jsantillana/vectrayx-base](https://huggingface.co/jsantillana/vectrayx-base)
	- Pro 7B: [jsantillana/vectrayx-pro-7b](https://huggingface.co/jsantillana/vectrayx-pro-7b)

	---

	## Results (VectraYX-Bench, single seed)

	| Model | Params | B1 KW | B2 F1 | B3 TM | B4 Tool | B5 Chat |
	|---|---|---|---|---|---|---|
	| VectraYX-Nano v7 (headline) | 42M | 0.332±0.005 | — | — | 0.230±0.052 | 0.725±0.130 |
	| VectraYX-Base 260M | 260M | 0.325 | 0.220 | 0.114 | 0.000 | 0.800 |
	| VectraYX-Pro 3B | 3.2B | 0.341 | 0.695 | 0.686 | 0.600 | 0.800 |
	| VectraYX-Pro 7B | 7B | 0.335 | 0.815 | 0.686 | 0.880 | 0.800 |
	| GPT-4o (frontier ref.) | — | 0.333 | 0.110 | 0.520 | 0.615 | 0.631 |

	---

	## What is this?

	This adapter applies the VectraYX cybersecurity specialization to Qwen2.5-3B-Instruct:
	- SFT corpus: ~93,500 examples, including 13K OASST1-ES conversational, 4K CVE Q&A, and 2.8K MCP tool-use traces, plus general cybersecurity Q&A
	- Training: LoRA rank=64, 3 epochs, lr=2e-4 on AWS SageMaker (`ml.g5.xlarge`)
	- Language: Spanish (LATAM-focused)
	- Tool use: Native MCP `<|tool_call|>` emission (B4=0.600)
	- Author website: https://jsantillana.com
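
As a rough sanity check on the rank-64 setting, the adapter's parameter count and on-disk size can be estimated with a few lines of arithmetic. This is only a sketch under stated assumptions: the layer dimensions are taken from the published Qwen2.5-3B-Instruct config, and the choice of target modules (all seven linear projections per layer) and float32 storage are guesses, since this card does not state them.

```python
# Back-of-the-envelope LoRA adapter size estimate.
# ASSUMPTIONS (not stated in this card): Qwen2.5-3B-Instruct dimensions as
# published in its config, LoRA applied to all seven linear projections in
# every layer, and adapter weights stored in float32.

HIDDEN = 2048          # hidden_size
INTERMEDIATE = 11008   # intermediate_size of the MLP
LAYERS = 36            # num_hidden_layers
KV_DIM = 2 * 128       # num_key_value_heads * head_dim
RANK = 64              # LoRA rank, from this card

# (in_features, out_features) of each projection assumed to carry an adapter
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


def adapter_param_count(rank=RANK):
    """Total LoRA parameters across all layers under the assumptions above."""
    per_layer = sum(lora_params(i, o, rank) for i, o in PROJECTIONS.values())
    return per_layer * LAYERS


params = adapter_param_count()
size_mib = params * 4 / 2**20  # 4 bytes per float32 value
print(f"{params:,} adapter parameters, about {size_mib:.0f} MiB in float32")
```

Under these assumptions the estimate comes to roughly 120M parameters and about 457 MiB, which lines up with the adapter size quoted earlier in this card. That agreement is suggestive only; it does not confirm which modules were actually targeted.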

	---

	## Usage

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the base Qwen model (bfloat16 weights take roughly 6 GB of VRAM)
	base_model = AutoModelForCausalLM.from_pretrained(
	    "Qwen/Qwen2.5-3B-Instruct",
	    torch_dtype="auto",
	    device_map="auto",
	)

	# Load VectraYX LoRA adapter on top
	model = PeftModel.from_pretrained(base_model, "jsantillana/vectrayx-pro-3b")
	tokenizer = AutoTokenizer.from_pretrained("jsantillana/vectrayx-pro-3b")

	# Inference
	messages = [{"role": "user", "content": "¿Qué es el CVE-2021-44228 y cuál es su severidad?"}]
	text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(text, return_tensors="pt").to(model.device)
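	# do_sample=True makes the temperature setting explicit rather than relying on defaults
	outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
	# Decode only the newly generated tokens, not the echoed prompt
	new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
	print(tokenizer.decode(new_tokens, skip_special_tokens=True))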
	```
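
The ~6 GB figure mentioned in the snippet's comment can be reproduced with simple arithmetic. The sketch below assumes a parameter count of about 3.09 billion for Qwen2.5-3B-Instruct (a figure from the base model's own card, not from this one) and counts weights only; activations, the KV cache, and the LoRA adapter are left out, so real usage will be somewhat higher.

```python
# Rough weight-memory estimate for loading the base model in different dtypes.
# ASSUMPTION: ~3.09e9 parameters for Qwen2.5-3B-Instruct. Activations,
# KV cache, and the LoRA adapter are not included in this estimate.

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

BASE_PARAMS = 3.09e9  # assumed parameter count of the base model


def weight_memory_gb(num_params, dtype):
    """Return the weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(BASE_PARAMS, dtype):.1f} GB")
```

If memory is tight, the same table gives a quick idea of what a lower-precision load would save before trying it on real hardware.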

	### Merge adapter into base (for export / GGUF)

	```python
	merged = model.merge_and_unload()
	merged.save_pretrained("vectrayx-pro-3b-merged")
	tokenizer.save_pretrained("vectrayx-pro-3b-merged")
	```

	---

	## Model family

	| Model | Backbone | Params | B4 Tool |
	|---|---|---|---|
	| [VectraYX-Nano v7](https://huggingface.co/jsantillana/vectrayx-nano) | from-scratch | 42M | 0.230±0.052 |
	| [VectraYX-Base](https://huggingface.co/jsantillana/vectrayx-base) | from-scratch | 260M | 0.000* |
	| VectraYX-Pro 3B | Qwen2.5-3B-Instruct + LoRA-64 | 3.2B | 0.600 |
	| [VectraYX-Pro 7B](https://huggingface.co/jsantillana/vectrayx-pro-7b) | Qwen2.5-7B-Instruct + QLoRA-32 | 7B | 0.880 |

	*Base 260M with LoRA-16 at ratio 1:21 achieves B4=0.445±0.201.

	---

	## Citation

	```bibtex
	@misc{santillana2026vectrayx,
	  title         = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
	                   with Curriculum Learning and Native Tool Use},
	  author        = {Santillana, Juan S.},
	  year          = {2026},
	  eprint        = {2605.13989},
	  archivePrefix = {arXiv},
	  primaryClass  = {cs.CL},
	  url           = {https://arxiv.org/abs/2605.13989}
	}
	```