library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen3-8B-Base

ToolOrchestra - agentic/ToolOrchestra/

Code repository: https://github.com/LMIS-ORG/slime-agentic/tree/main

Reproduces the core idea of ToolOrchestra: an Orchestrator-Expert multi-agent framework for RL training. A central Orchestrator LLM learns to route tasks to the best specialized expert model and the corresponding tools through multi-turn tool calls. GRPO is applied to the Orchestrator's decision trajectory, enabling it to improve tool-use and routing capabilities without manually annotated intermediate steps.
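
To make the GRPO step concrete, here is a minimal sketch of the group-relative advantage it is built on: several rollouts are sampled for the same question, and each rollout's scalar reward is normalized against the group. The function name, the `eps` constant, the division by the standard deviation, and the 0/1 rewards are illustrative assumptions, not code taken from this repository.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own group (same prompt).

    Assumption: the common GRPO form (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical Orchestrator rollouts of one question:
# 1.0 = task solved, 0.0 = task failed.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the advantage depends only on the final reward of the whole trajectory, every Orchestrator decision in a successful rollout is reinforced together, which is why no per-step annotation is needed. A group in which all rollouts receive the same reward yields zero advantage for every rollout.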

Architecture

Input question
  │
  ▼
Orchestrator LLM                        ← Decide which tool to call (loss_mask=1)
  │
  └─► for turn in range(max_turns):
        │
        ├─ parse_tool_call()            ← Parse <tool_call> from model output
        │
        ├─ tool call                    ← Call retrieval / external tool (loss_mask=0)
        │    └─ FAISS retrieval service (port 8000)
        │
        ├─ call_expert ──────────────► Expert LLM routing (loss_mask=0)
        │                               └─ specialist models on separate ports
        │
        └─ answer ──────────────────► Final answer → stop loop
  │
  ▼
GenerationOutput
  - token_ids + log_probs  (all turns concatenated)
  - loss_mask: Orchestrator output = 1 / tool result = 0
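
The loop in the diagram can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the repository's implementation: the JSON payload shape (`{"name": ..., "arguments": {...}}`), the `rollout` function, and the `orchestrator`, `tools`, and `tokenize` callables are hypothetical stand-ins, and log-probs are omitted. Only `parse_tool_call`, `max_turns`, `loss_mask`, and the `answer` stop condition are names taken from the diagram.

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_tool_call(text):
    """Return the JSON object inside the first <tool_call> block, or None."""
    match = TOOL_CALL_RE.search(text)
    if match is None:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    return payload if isinstance(payload, dict) else None


def rollout(question, orchestrator, tools, tokenize, max_turns=4):
    """Run the multi-turn loop; return token_ids and an aligned loss_mask."""
    token_ids, loss_mask = [], []
    context = question
    for _ in range(max_turns):
        output = orchestrator(context)
        ids = tokenize(output)
        token_ids += ids
        loss_mask += [1] * len(ids)   # Orchestrator output: contributes to the loss

        call = parse_tool_call(output)
        if call is None or call.get("name") == "answer":
            break                     # final answer or unparseable output stops the loop
        tool = tools.get(call.get("name"))
        if tool is None:
            break                     # unknown tool name: stop rather than guess

        result = tool(**call.get("arguments", {}))
        ids = tokenize(result)
        token_ids += ids
        loss_mask += [0] * len(ids)   # tool or expert result: masked out of the loss
        context += output + result
    return token_ids, loss_mask
```

In this sketch a `call_expert` step would simply be another entry in `tools`, since expert responses are masked the same way as retrieval results. The mask keeps the policy gradient on tokens the Orchestrator actually generated, so it is not trained to imitate text produced by the retrieval service or the expert models.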

Results

| Model | Dataset | Baseline (Qwen3-8B) | ToolOrchestra (Ours) | Improvement |
|---|---|---|---|---|
| Qwen3-8B | τ²-Bench | 0.278 | 0.388 | +0.110 |