Shrijanagain committed
Commit ac4423f · verified
1 Parent(s): 1c97abc

Create README.md

Files changed (1)
  1. README.md +166 -0
README.md ADDED
@@ -0,0 +1,166 @@
+ ---
+ license: mit
+ language:
+ - en
+ - hi
+ base_model:
+ - mistralai/Mistral-7B-Instruct-v0.3
+ - meta-llama/Llama-3.2-1B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - agent
+ - AI
+ - ST-X-0
+ - MIXTRAL
+ - TIGER OM
+ ---
+ # 🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model
+
+ **Advanced 13B Mixture-of-Experts (MoE) model** optimized for agentic RAG with Think Mode and a plugin architecture.
+
+ Built for the **AMD Developer Hackathon 2026** on AMD Developer Cloud.
+
+ ---
+
+ ## 📊 Model Details
+
+ - **Model Name**: TIGER-OM (SKT-OM)
+ - **Architecture**: **Mixture of Experts (MoE)**
+ - **Total Parameters**: 13B (fewer parameters are active per token because MoE activation is sparse)
+ - **Base Models**:
+   - Primary Base: **Shrijanagain/ST-X-0**
+   - Expert Integration: **Mistral-7B**
+ - **Format**: **Safetensors** (safe and fast loading)
+ - **Precision / Quantization**: FP16 / BF16 (original weights); Q4_K_M GGUF available in a separate repo
+ - **Context Length**: 8192 tokens
+ - **Training Hardware**: AMD Developer Cloud GPUs ($100 developer credits)
+ - **Inference Optimized For**: ROCm 7.0 + vLLM + AMD MI300X
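
The gap between total and active parameters noted in the list above can be made concrete with a short calculation. The sketch below assumes a standard sparse MoE layout in which the shared (non-expert) weights are always used and only `top_k` of `num_experts` expert blocks run for each token. Every number in it is a hypothetical placeholder chosen so that the total comes to 13B; none of them is taken from the TIGER-OM configuration.

```python
def moe_param_counts(shared, per_expert, num_experts, top_k):
    """Return (total, active) parameter counts for a sparse MoE model.

    shared      -- parameters used for every token (attention, embeddings, ...)
    per_expert  -- parameters in one expert block
    num_experts -- number of expert blocks stored in the checkpoint
    top_k       -- number of expert blocks evaluated per token
    """
    if not 1 <= top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared + num_experts * per_expert
    active = shared + top_k * per_expert
    return total, active


# Hypothetical split, NOT the real TIGER-OM configuration.
total, active = moe_param_counts(
    shared=1.0e9, per_expert=3.0e9, num_experts=4, top_k=2
)
print(f"total:  {total / 1e9:.1f}B parameters stored")
print(f"active: {active / 1e9:.1f}B parameters used per token")
```

Memory use follows the total count, because every expert must be loaded, while per-token compute follows the active count.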
+
+ ---
+
+ ## 🌟 Key Features
+
+ - **True MoE Architecture**: sparse activation for better efficiency and performance
+ - **Think Mode Reasoning**: chain-of-thought, planning, self-reflection, and verification
+ - **Dynamic Plugin System**: routes requests to Code, Math, Search, and Data Analysis plugins
+ - **Agentic Capabilities**: full LangGraph multi-agent workflow
+ - **Advanced RAG Integration**: SKT RAG + query rewriting + multi-hop retrieval + reranking
+ - **Stateful Memory**: persistent conversation context
+
+ ---
+
+ ## 🏗️ Architecture Breakdown
+
+ **TIGER-OM** is built on a **13B MoE** backbone:
+
+ - **Base**: Shrijanagain/ST-X-0 (foundation model)
+ - **Experts**: fine-tuned with Mistral-7B serving as the expert layers for specialized reasoning and tool use
+ - **Router Network**: learned gating mechanism that selects which experts to run
+ - **Think Mode Layer**: custom system prompt + reasoning controller
+ - **Plugin Head**: tool-calling and execution layer
+
+ This hybrid approach (ST-X-0 + Mistral-7B experts) aims for strong reasoning, code understanding, and general capability while keeping the efficiency of sparse MoE inference.
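
The router's learned gating mechanism is not specified further here. A common form is top-k softmax gating, sketched below in plain Python as an illustration only: it is an assumption, not the confirmed TIGER-OM router. The helper names (`route_top_k`, `moe_forward`), the toy experts, and the logits are all hypothetical. In a real model the router logits come from a learned linear layer applied to each token's hidden state; here they are passed in directly.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalise their gate weights."""
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Toy scalar "experts" standing in for feed-forward blocks.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 1.0]  # hypothetical router scores for one token

selection = route_top_k(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
print(selection)  # experts 1 and 3 are selected; their weights sum to 1
print(output)
```

Only the selected experts are evaluated, which is where the inference savings of a sparse MoE come from.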
+
+ ---
+
+ ## 📁 Files in this Repo (Safetensors)
+
+ - `model-00001-of-0000X.safetensors` → main model weights
+ - `config.json`
+ - `tokenizer.json` / `tokenizer_config.json`
+ - `generation_config.json`
+ - `special_tokens_map.json`
+ - `model.safetensors.index.json`
+
+ **All weights are in the `safetensors` format**, which avoids the code-execution risk of pickle-based checkpoints.
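
For sharded checkpoints, `model.safetensors.index.json` maps each tensor name to the shard file that stores it, under a `weight_map` key. The sketch below groups tensors by shard using only the standard library. The index content is a made-up example: the tensor names and the shard count of 2 are hypothetical, since the real shard count is not stated in this README.

```python
import json
from collections import defaultdict


def tensors_per_shard(index):
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# Hypothetical index content in the layout used for sharded safetensors.
example_index = json.loads("""
{
  "metadata": {"total_size": 2048},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.mlp.gate.weight": "model-00001-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors"
  }
}
""")

grouped = tensors_per_shard(example_index)
for shard, names in sorted(grouped.items()):
    print(f"{shard}: {len(names)} tensor(s)")
```

To inspect a downloaded copy of the repo, load the real file with `json.load(open("model.safetensors.index.json"))` and pass the result to `tensors_per_shard`.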
+
+ ---
+
+ ## 🚀 How to Use (Safetensors)
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_name = "Shrijanagain/TIGER-OM"
+
+ # Load the tokenizer and the BF16 weights; device_map="auto" places them on the available devices.
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,  # executes custom model code from the repo; review it first
+ )
+
+ prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
+ User Query: Calculate training cost comparison and suggest best option..."""
+
+ # Tokenize the prompt and move the tensors to the model's device.
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ # Sample up to 1024 new tokens.
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=1024,
+     temperature=0.7,
+     top_p=0.9,
+     do_sample=True,
+     repetition_penalty=1.1,
+ )
+
+ # Decode the full sequence (prompt followed by the generated text).
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
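
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the `print` call above echoes the prompt before the answer. A minimal sketch of keeping only the continuation follows. Plain lists stand in for the token tensors so that it runs without downloading the model, and `strip_prompt` is a hypothetical helper name, not part of `transformers`.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the token ids that were generated after the prompt."""
    if prompt_length > len(output_ids):
        raise ValueError("prompt_length exceeds the output length")
    return output_ids[prompt_length:]


# Stand-in token ids; real values come from the tokenizer and model.generate.
prompt_ids = [101, 7592, 2088]
output_ids = prompt_ids + [2023, 2003, 102]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)
```

With the objects from the snippet above, the same slice is `outputs[0][inputs["input_ids"].shape[-1]:]`, which can then be passed to `tokenizer.decode(..., skip_special_tokens=True)`.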
+
+ ---
+
+ ## 🔗 Important Links
+
+ - **Live Demo**: [SKT-OM Space](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
+ - **GGUF Quantized (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
+ - **GitHub (RAG + ADK Code)**: [SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)
+
+ ---
+
+ ## 🛠️ Technologies & Stack
+
+ - **Base Models**: Shrijanagain/ST-X-0 + Mistral-7B Experts
+ - **RAG**: SKT RAG + AMD ADK Kit
+ - **Agents**: LangGraph
+ - **Hardware**: AMD MI300X + ROCm 7.0
+ - **Inference**: vLLM (FP16) + transformers (Safetensors)
+ - **Training**: AMD Developer Cloud
+
+ ---
+
+ ## ⚡ Performance
+
+ - Balances **quality and efficiency** through the sparse MoE architecture
+ - Strong performance on reasoning, tool use, code, and multi-step tasks
+ - Lower inference cost than dense models of 13B+ parameters, because only a subset of experts runs per token
+
+ ---
+
+ ## 📌 Use Cases
+
+ - Complex technical Q&A
+ - Agentic workflows & tool calling
+ - Research assistance
+ - Code generation & debugging
+ - Mathematical & logical reasoning
+ - Comparative analysis
+ - Data analysis with plugins
+
+ ---
+
+ ## 🏆 Hackathon
+
+ - **AMD Developer Hackathon 2026**
+ - Trained entirely on **AMD Developer Cloud**
+ - Built in public, with multiple technical updates
+
+ ---
+
+ ## 📄 License
+
+ MIT License
+
+ ---