A3-Qwen3.5-2B

💾 Code	📄 Paper	🌐 Website
🤗 Dataset	🤖 Models	📦 PyPI

Structured Distillation of Web Agent Capabilities Enables Generalization

Xing Han Lù, Siva Reddy

A3-Qwen3.5-2B is a 2B multimodal web agent fine-tuned from Qwen/Qwen3.5-2B using the Agent-as-Annotators (A3) framework. It is trained on A3-Synth, a dataset of high-quality synthetic trajectories generated through a structured teacher-student distillation process.

Model Description

A3-Qwen3.5-2B is designed to navigate complex web environments by processing visual screenshots and text. By decomposing the synthetic data generation process into three modular roles—Task Designer, Annotator, and Supervisor—the A3 framework allows small, locally deployable models to achieve competitive performance on benchmarks like WebArena, even surpassing some larger closed-source models.

Quick Start: Evaluation

You can evaluate the model using the agent-as-annotators toolkit:

1. Serve the model with vLLM

vllm serve --model McGill-NLP/A3-Qwen3.5-2B

2. Run evaluation

a3-eval --benchmark webarena_test --model A3-qwen3.5-2b

Citation

If you find this model useful, please cite our work:

@misc{lu2025structured,
      title={Structured Distillation of Web Agent Capabilities Enables Generalization}, 
      author={Xing Han Lù and Siva Reddy},
      year={2025},
      eprint={2604.07776},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}