# affiliation-parsing-lora-Qwen3-8B-distil-GLM_4.5_Air
A model trained to parse author names and institutional affiliations from the markdown text of arXiv preprints.
## Model Details
- Base model: Qwen/Qwen3-8B
- Training method: Supervised fine-tuning with distillation from GLM-4.5-Air
- Task: Author and affiliation extraction from arXiv preprints
## Training Details
### Training Data
The core training dataset is available on Hugging Face. It was created by manually annotating arXiv preprints, and is split into train (56%) and test (44%) subsets.
### Training Procedure
We used supervised fine-tuning with distillation: we prompted GLM-4.5-Air to produce annotations for each preprint, scored them with a reward function, and kept only the rollouts that led to correct answers. The correct rollouts are available here. We ordered training examples according to a curriculum based on the student model's surprisal. The surprisal of a token $x_t$ given its preceding context is

$$s(x_t) = -\log p_\theta(x_t \mid x_{<t}),$$

where $p_\theta$ is the probability the student model assigns to the token.
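As a minimal sketch (helper names are our own; it assumes access to the student model's per-token log-probabilities), surprisal can be computed directly from those log-probabilities:

```python
import math

def token_surprisal(logprob: float) -> float:
    """Surprisal (in nats) of a single token: s = -log p, where
    logprob is the student model's log-probability for that token."""
    return -logprob

def example_surprisal(token_logprobs: list[float]) -> float:
    """Mean per-token surprisal of a training example, one simple way
    to score how surprising a whole example is to the student."""
    return sum(-lp for lp in token_logprobs) / len(token_logprobs)
```

For instance, a token assigned probability 0.25 has surprisal `-ln(0.25) ≈ 1.386` nats.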
We applied this metric to the training set and ordered examples from least to most surprising for the first epoch, then interspersed low- and high-surprisal examples in subsequent epochs.
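The ordering above can be sketched as follows. The function and its alternating interleaving scheme for later epochs are assumptions for illustration; the model card does not specify the exact interspersal rule.

```python
def curriculum_order(examples, surprisal, epoch: int):
    """Order training examples for one epoch of a surprisal curriculum.

    Epoch 0: least to most surprising (easy-to-hard).
    Later epochs: alternate ends of the sorted list, one assumed way
    of interspersing easy and hard examples.
    """
    ranked = sorted(examples, key=surprisal)
    if epoch == 0:
        return ranked
    mixed = []
    lo, hi = 0, len(ranked) - 1
    while lo <= hi:
        mixed.append(ranked[lo])
        if lo != hi:
            mixed.append(ranked[hi])
        lo += 1
        hi -= 1
    return mixed
```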
## Evaluation
- Metric: F1 score on the affiliation extraction task (test set)
- Result: 83.37