Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,148 @@
---
library_name: transformers
tags:
- tool-use
- agentic-rl
- environment-synthesis
- EnvFactory
license: apache-2.0
datasets:
- LARK-Lab/EnvFactory-RL
- LARK-Lab/EnvFactory-SFT-FILTERED
language:
- en
base_model:
- Qwen/Qwen3-8B
---

<h2 align="center">
  EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
</h2>

<p align="center">
  <a href="https://arxiv.org/abs/2605.18703">
    <img
      src="https://img.shields.io/badge/Paper-Arxiv-red?logo=arxiv&logoColor=red"
      alt="EnvFactory Paper on arXiv"
    />
  </a>
  <a href="https://github.com/LARK-AI-Lab/EnvFactory">
    <img
      src="https://img.shields.io/badge/GitHub-Code-181717?logo=github&logoColor=white"
      alt="GitHub Code"
    />
  </a>
  <a href="https://lark-ai-lab.github.io/envfactory.github.io/">
    <img
      src="https://img.shields.io/badge/GitHub-Page-4078c0?logo=github&logoColor=white"
      alt="GitHub Page"
    />
  </a>
  <a href="https://huggingface.co/collections/LARK-Lab/envfactory">
    <img
      src="https://img.shields.io/badge/Datasets-Hugging%20Face%20Data-orange?logo=huggingface&logoColor=yellow"
      alt="Datasets on Hugging Face"
    />
  </a>
  <a href="https://huggingface.co/collections/LARK-Lab/envfactory">
    <img
      src="https://img.shields.io/badge/EnvFactory-Hugging%20Face%20Model-FFCC00?logo=huggingface&logoColor=yellow"
      alt="EnvFactory on Hugging Face"
    />
  </a>
</p>

## Overview

We propose **EnvFactory**, a fully automated framework for equipping LLMs with tool-use capabilities via Agentic Reinforcement Learning (Agentic RL). EnvFactory autonomously explores and verifies stateful, executable tool environments built from authentic resources. It then synthesizes natural multi-turn trajectories through topology-aware sampling and calibrated refinement, producing grounded queries with implicit intents.

This is the official **EnvFactory-8B** model, trained from [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) with SFT and RL on synthesized tool-use trajectories.

## Key Features

- **Executable Environment Synthesis**: Automatically discovers, validates, and deploys MCP-based tool environments from real-world APIs
- **Topology-Aware Trajectory Sampling**: Generates natural multi-turn tool-use trajectories that capture implicit human reasoning
- **Robust RL Training**: Uses verified environments and calibrated refinement for stable reinforcement learning
- **Scalable Architecture**: Achieves superior performance with significantly fewer environments (85 environments across 7 domains)

## Training Details

### Training Data

- **SFT Data**: [LARK-Lab/EnvFactory-SFT-FILTERED](https://huggingface.co/datasets/LARK-Lab/EnvFactory-SFT-FILTERED) - 53.4k filtered trajectories
- **RL Data**: [LARK-Lab/EnvFactory-RL](https://huggingface.co/datasets/LARK-Lab/EnvFactory-RL) - 3.09k trajectories

### Training Procedure

- **SFT Stage**: Full fine-tuning using LlamaFactory with DeepSpeed ZeRO-3
- **RL Stage**: Reinforcement learning using a forked VeRL framework
- **Base Model**: Qwen/Qwen3-8B
- **Training Epochs**: 1 epoch for SFT
- **Learning Rate**: 1.0e-6 with a cosine scheduler
- **Batch Size**: 1 per device with gradient accumulation of 32

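To make the batch-size and learning-rate settings concrete, the following sketch computes the effective global batch size and a plain cosine learning-rate curve. The GPU count, the absence of warmup, the zero final learning rate, and the choice to apply these settings to the SFT stage are all assumptions for illustration; the model card does not state them.

```python
import math

PER_DEVICE_BATCH = 1       # from the model card
GRAD_ACCUM_STEPS = 32      # from the model card
PEAK_LR = 1.0e-6           # from the model card
SFT_TRAJECTORIES = 53_400  # "53.4k" in the model card, so only approximate
NUM_GPUS = 8               # assumption: not stated in the model card


def effective_batch_size(per_device: int, grad_accum: int, num_gpus: int) -> int:
    """Sequences contributing to one optimizer step under data parallelism."""
    return per_device * grad_accum * num_gpus


def cosine_lr(step: int, total_steps: int, peak_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps (no warmup)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


global_batch = effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_GPUS)
# One epoch over the SFT data: the step count follows from the global batch size.
total_steps = math.ceil(SFT_TRAJECTORIES / global_batch)

print(f"global batch size: {global_batch}, optimizer steps: {total_steps}")
for step in (0, total_steps // 2, total_steps):
    print(f"step {step}: lr = {cosine_lr(step, total_steps, PEAK_LR):.3e}")
```

Changing `NUM_GPUS` rescales the global batch size linearly, so the step count should be recomputed for the actual hardware.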
## Performance

Results on tool-use benchmarks compared to the base model:

| Model | BFCL Single Turn | BFCL Multi Turn | MCP-Atlas Pass Rate | MCP-Atlas Mean Cov. | τ²-Bench Avg. | VitaBench Avg. | Overall Avg. |
|-------|------------------|-----------------|---------------------|---------------------|---------------|----------------|--------------|
| Qwen3-8B (Base) | 84.31 | 41.25 | 5.15 | 14.86 | 32.30 | 16.70 | 29.23 |
| **EnvFactory-8B** | 86.02 | 49.00 | 13.75 | 25.98 | 33.67 | 18.67 | 33.40 |

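The table can be read more easily as per-benchmark gains over the base model. The sketch below computes those differences from the reported scores. It also checks one possible aggregation for the Overall Avg. column: the mean of the BFCL average (single and multi turn), MCP-Atlas pass rate, τ²-Bench, and VitaBench. That formula is an assumption; it reproduces both reported values numerically, but the model card does not state how the column is computed.

```python
# Scores copied from the performance table.
COLUMNS = [
    "BFCL Single Turn", "BFCL Multi Turn", "MCP-Atlas Pass Rate",
    "MCP-Atlas Mean Cov.", "τ²-Bench Avg.", "VitaBench Avg.",
]
BASE = [84.31, 41.25, 5.15, 14.86, 32.30, 16.70]
ENVFACTORY = [86.02, 49.00, 13.75, 25.98, 33.67, 18.67]


def deltas(base, ours):
    """Absolute gain of EnvFactory-8B over the base model, per benchmark column."""
    return [round(o - b, 2) for b, o in zip(base, ours)]


def overall_avg(scores):
    """Assumed aggregation: BFCL is averaged first, MCP-Atlas mean coverage is left out."""
    single, multi, pass_rate, _mean_cov, tau2, vita = scores
    bfcl = (single + multi) / 2
    return (bfcl + pass_rate + tau2 + vita) / 4


for name, gain in zip(COLUMNS, deltas(BASE, ENVFACTORY)):
    print(f"{name}: {gain:+.2f}")
print(f"overall: base {overall_avg(BASE):.2f}, EnvFactory-8B {overall_avg(ENVFACTORY):.2f}")
```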
## Usage

### Tool-Use Agent

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "LARK-Lab/EnvFactory-8B"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, device_map="auto")

# Example tool-use conversation
messages = [
    {"role": "system", "content": "You are a helpful assistant with access to various tools."},
    {"role": "user", "content": "Search for recent papers about tool-use agents on arxiv."}
]

# add_generation_prompt=True appends the assistant turn header so the model
# produces a reply instead of continuing the user message.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)
# Sampling is enabled explicitly so that temperature and top_p take effect.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```

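The model card does not describe the format in which tool calls appear in the generated text. As a minimal sketch, assuming EnvFactory-8B keeps the convention of its Qwen3 base model, where each call is a JSON object wrapped in `<tool_call>...</tool_call>` tags, the helper below extracts calls from a response string. The function name `extract_tool_calls`, the `search_arxiv` tool, and the sample response are hypothetical and only serve the illustration.

```python
import json
import re

# Assumed Qwen3-style format: a JSON object between <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def extract_tool_calls(response: str) -> list[dict]:
    """Return every well-formed tool call found in the response, in order."""
    calls = []
    for match in TOOL_CALL_RE.finditer(response):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than failing the whole turn
        if isinstance(payload, dict) and "name" in payload:
            calls.append({"name": payload["name"], "arguments": payload.get("arguments", {})})
    return calls


# Hypothetical model output, written by hand for this example.
sample = (
    "I will look this up.\n"
    "<tool_call>\n"
    '{"name": "search_arxiv", "arguments": {"query": "tool-use agents", "max_results": 5}}\n'
    "</tool_call>"
)
print(extract_tool_calls(sample))
```

Tool schemas themselves would typically be supplied through the `tools` argument of `tokenizer.apply_chat_template`, so that the chat template can describe them to the model.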
### With MCP Tools

```python
# Load MCP tool configuration
import json

with open("configs/mcp_server.json", "r") as f:
    mcp_config = json.load(f)

# Use with your preferred MCP client
# See https://github.com/LARK-AI-Lab/EnvFactory for integration details
```

## Citation

If you find our work helpful, please consider citing:

```bibtex
@misc{xu2026envfactoryscalingtooluseagents,
      title={EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL},
      author={Minrui Xu and Zilin Wang and Mengyi DENG and Zhiwei Li and Zhicheng Yang and Xiao Zhu and Yinhong Liu and Boyu Zhu and Baiyu Huang and Chao Chen and Heyuan Deng and Fei Mi and Lifeng Shang and Xingshan Zeng and Zhijiang Guo},
      year={2026},
      eprint={2605.18703},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2605.18703},
}
```

## License

This model is released under the Apache 2.0 License.