---
license: mit
language:
- en
- hi
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- agent
- Qwen
- AI
- ST-X-0
- MIXTRAL
- TIGER OM
library_name: transformers
inference:
  parameters:
    temperature: 0.7
    max_new_tokens: 500
widget:
  - text: "What are the latest trends in retrieval-augmented generation?"
    example_title: "General Query"
---

---
# 🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model

**Advanced 13B Mixture-of-Experts (MoE) Model** optimized for Agentic RAG with Think Mode & Plugin Architecture.

Built for **AMD Developer Hackathon 2026** using AMD Developer Cloud.

---

## 📊 Model Details

- **Model Name**: TIGER-OM (SKT-OM)
- **Architecture**: **Mixture of Experts (MoE)**
- **Total Parameters**: 13B (only a subset of experts is active per token, so the active parameter count is lower)
- **Base Models**: 
  - Primary Base: **Shrijanagain/ST-X-0**
  - Expert Integration: **Mistral-7B**
- **Format**: **Safetensors** (Safe & Fast loading)
- **Quantization**: FP16 / BF16 (Original) + Q4_K_M GGUF available in separate repo
- **Context Length**: 8192 tokens
- **Training Hardware**: AMD Developer Cloud GPUs ($100 developer credits)
- **Inference Optimized**: ROCm 7.0 + vLLM + AMD MI300X
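
As a rough sizing aid for the figures above, weight memory can be estimated from the parameter count and the bits stored per weight. The sketch below is plain arithmetic under stated assumptions: the 4.8 bits-per-weight value for Q4_K_M is an approximation chosen here, not a number from this repo, and the estimate ignores the KV cache and activations.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


TOTAL_PARAMS = 13e9  # "13B" from the model details above

# FP16 and BF16 both store 16 bits per weight.
fp16_gib = weight_memory_gib(TOTAL_PARAMS, 16)

# ASSUMPTION: Q4_K_M averages roughly 4.8 bits per weight; the real figure
# depends on the tensor mix of the specific GGUF file.
q4_gib = weight_memory_gib(TOTAL_PARAMS, 4.8)

print(f"FP16/BF16 weights: ~{fp16_gib:.1f} GiB")
print(f"Q4_K_M weights:    ~{q4_gib:.1f} GiB")
```

In an MoE model every expert has to be resident in memory even though only some run per token, so the estimate uses the total parameter count rather than the active one.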

---

## 🌟 Key Features

- **True MoE Architecture**: Sparse activation for better efficiency and performance
- **Think Mode Reasoning**: Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
- **Dynamic Plugin System**: Intelligent routing to Code, Math, Search, Data Analysis plugins
- **Agentic Capabilities**: Full LangGraph multi-agent workflow
- **Advanced RAG Integration**: SKT RAG + Query Rewriting + Multi-hop + Reranking
- **Stateful Memory**: Persistent conversation context
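
This card does not document how plugin routing is implemented. Purely as an illustration of the idea, a dispatcher can map a query to one of the four plugin types named above. The keyword rules, the `route` function, and the `"none"` fallback are all hypothetical and are not taken from the TIGER-OM code.

```python
import re
from typing import Dict

# HYPOTHETICAL keyword rules; the real router is not described in this card.
PLUGIN_PATTERNS: Dict[str, str] = {
    "math": r"\b(calculate|sum|integral|solve|equation)\b",
    "code": r"\b(python|function|bug|compile|traceback)\b",
    "data_analysis": r"\b(csv|dataframe|dataset|plot|statistics)\b",
    "search": r"\b(latest|news|today|current|trends)\b",
}


def route(query: str, default: str = "none") -> str:
    """Return the first plugin whose pattern matches, or `default` if none do."""
    lowered = query.lower()
    for name, pattern in PLUGIN_PATTERNS.items():
        if re.search(pattern, lowered):
            return name
    return default


print(route("Calculate training cost comparison and suggest best option"))
print(route("What are the latest trends in retrieval-augmented generation?"))
```

A learned router would replace the regular expressions with a classifier or with the model's own tool-call output; the dispatch shape stays the same.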

---

## 🏗️ Architecture Breakdown

**TIGER-OM** is built on a **13B MoE** backbone:

- **Base**: Shrijanagain/ST-X-0 (strong foundational model)
- **Experts**: Mistral-7B used as expert layers, fine-tuned for specialized reasoning and tool use
- **Router Network**: Learned gating mechanism for expert selection
- **Think Mode Layer**: Custom system prompt + reasoning controller
- **Plugin Head**: Tool calling & execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) aims to combine strong reasoning, code understanding, and general capability with the efficiency of sparse MoE inference.
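
For readers unfamiliar with MoE routing, the following is a minimal sketch of top-k softmax gating, one common form of learned gating. The card does not state how many experts TIGER-OM has or how many are selected per token, so `top_k=2` and the four-expert example are assumptions for illustration only.

```python
import math
from typing import List, Tuple


def top_k_gate(logits: List[float], top_k: int = 2) -> List[Tuple[int, float]]:
    """Select the top_k experts by router logit and softmax-normalise their weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    # Indices of the top_k largest logits, highest first.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only (subtract the max for stability).
    peak = max(logits[i] for i in chosen)
    exps = [math.exp(logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_output(expert_outputs: List[float], gates: List[Tuple[int, float]]) -> float:
    """Combine scalar expert outputs using the gate weights."""
    return sum(weight * expert_outputs[i] for i, weight in gates)


# Hypothetical router logits for one token over four experts.
gates = top_k_gate([0.1, 2.0, -1.0, 1.0], top_k=2)
mixed = moe_output([10.0, 20.0, 30.0, 40.0], gates)
print(gates, mixed)
```

Only the selected experts are evaluated for a given token, which is why per-token compute is lower than the total parameter count suggests.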

---

## 📁 Files in this Repo (Safetensors)

- `model-00001-of-0000X.safetensors` → Main model weights
- `config.json`
- `tokenizer.json` / `tokenizer_config.json`
- `generation_config.json`
- `special_tokens_map.json`
- `model.safetensors.index.json`

**All weights are in safe `safetensors` format**, so loading them does not execute pickled code.
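
Sharded checkpoints are tied together by `model.safetensors.index.json`, whose `weight_map` maps each tensor name to the shard file that stores it. The sketch below summarises such an index. The tensor and shard names in the example are made up, and `summarize_index` is a hypothetical helper name.

```python
import json
from collections import Counter
from typing import Dict


def summarize_index(index: dict) -> Dict[str, int]:
    """Count how many tensors each shard file holds, per the index's weight_map."""
    return dict(Counter(index["weight_map"].values()))


# Made-up example in the layout of a safetensors index file.
example_index = json.loads("""
{
  "metadata": {"total_size": 123},
  "weight_map": {
    "model.embed_tokens.weight": "shard-a.safetensors",
    "model.layers.0.mlp.weight": "shard-a.safetensors",
    "lm_head.weight": "shard-b.safetensors"
  }
}
""")

shard_counts = summarize_index(example_index)
print(shard_counts)
```

For a downloaded copy of the repo, the same function can be applied to the parsed contents of the real `model.safetensors.index.json`.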

---

## 🚀 How to Use (Safetensors)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

# Load the tokenizer and the BF16 safetensors weights.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",       # place layers on the available device(s) automatically
    trust_remote_code=True   # runs custom model code from the repo; review it first
)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Calculate training cost comparison and suggest best option..."""

# Tokenize and move the tensors to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a completion; temperature and top_p only apply when do_sample=True.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
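
For decoder-only models, `generate` returns the prompt tokens followed by the new tokens, so the decoded string above repeats the prompt. A small helper can slice the prompt off first. `new_tokens_only` is a name introduced here for illustration, and it is shown on plain lists so it runs without the model.

```python
from typing import List, Sequence


def new_tokens_only(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Drop the first prompt_length ids, keeping only the generated continuation."""
    return list(output_ids[prompt_length:])


# Stand-in ids: a 3-token prompt followed by 2 generated tokens.
prompt_ids = [101, 7592, 2088]
full_output = prompt_ids + [2023, 102]
generated = new_tokens_only(full_output, len(prompt_ids))
print(generated)
```

With the snippet above, the equivalent call is `tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.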

---

## 🔗 Important Links

- **Live Demo**: [SKT-OM Space](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
- **GGUF Quantized (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
- **GitHub (RAG + ADK Code)**: [SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)

---

## 🛠️ Technologies & Stack

- **Base Models**: Shrijanagain/ST-X-0 + Mistral-7B Experts
- **RAG**: SKT RAG + AMD ADK Kit
- **Agents**: LangGraph
- **Hardware**: AMD MI300X + ROCm 7.0
- **Inference**: vLLM (FP16) + transformers (Safetensors)
- **Training**: AMD Developer Cloud
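
Since the stack lists vLLM for FP16 inference, a serving script might look like the sketch below. It assumes vLLM is installed and can load this architecture, which the card does not confirm. The sampling values mirror the card's metadata (`temperature: 0.7`, `max_new_tokens: 500`) and the `top_p` from the transformers example; `sampling_config` and `generate_once` are hypothetical helper names.

```python
def sampling_config() -> dict:
    """Sampling settings mirroring the card's inference metadata."""
    return {"temperature": 0.7, "top_p": 0.9, "max_tokens": 500}


def generate_once(prompt: str, model_name: str = "Shrijanagain/TIGER-OM") -> str:
    """Run a single prompt through vLLM and return the generated text."""
    # Imported lazily so this module can be imported without vLLM or a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model=model_name,
        dtype="float16",         # the stack lists vLLM (FP16)
        max_model_len=8192,      # context length stated in the model details
        trust_remote_code=True,  # ASSUMPTION: needed here as in the transformers example
    )
    params = SamplingParams(**sampling_config())
    outputs = llm.generate([prompt], params)
    # One prompt in, one request out; take its first (only) completion.
    return outputs[0].outputs[0].text
```

Calling `generate_once("What are the latest trends in retrieval-augmented generation?")` on a machine with a supported GPU would return the completion text. Constructing `LLM` loads the weights, so a long-running service should create it once and reuse it.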

---

## ⚑ Performance

- Balances **quality and efficiency** through sparse MoE activation
- Strong performance on reasoning, tool use, code, and multi-step tasks
- Lower per-token inference cost than dense 13B+ models, since only some experts run for each token

---

## 📌 Use Cases

- Complex technical Q&A
- Agentic workflows & tool calling
- Research assistance
- Code generation & debugging
- Mathematical & logical reasoning
- Comparative analysis
- Data analysis with plugins

---

## 🏆 Hackathon

**AMD Developer Hackathon 2026**  
Trained entirely on **AMD Developer Cloud**  
Fully built in public with multiple technical updates.

---

## 📄 License

MIT License

---