Shrijanagain/TIGER-OM

+---
+license: mit
+language:
+- en
+- hi
+base_model:
+- mistralai/Mistral-7B-Instruct-v0.3
+- meta-llama/Llama-3.2-1B-Instruct
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- agent
+- AI
+- ST-X-0
+- MIXTRAL
+- TIGER OM
+---
+# 🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model
+**Advanced 13B Mixture-of-Experts (MoE) Model** optimized for Agentic RAG with Think Mode & Plugin Architecture.
+Built for **AMD Developer Hackathon 2026** using AMD Developer Cloud.
+
+---
+## 📊 Model Details
+- **Model Name**: TIGER-OM (SKT-OM)
+- **Architecture**: **Mixture of Experts (MoE)**
+- **Total Parameters**: 13B (only a subset of experts runs per token, so the active parameter count is lower)
+- **Base Models**:
+  - Primary Base: **Shrijanagain/ST-X-0**
+  - Expert Integration: **Mistral-7B**
+- **Format**: **Safetensors** (Safe & Fast loading)
+- **Precision / Quantization**: FP16 / BF16 (original weights); Q4_K_M GGUF available in a separate repo
+- **Context Length**: 8192 tokens
+- **Training Hardware**: AMD Developer Cloud GPUs ($100 developer credits)
+- **Inference Optimized For**: ROCm 7.0 + vLLM on AMD MI300X
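
To make the "Total Parameters" bullet concrete, here is a minimal sketch of how total and active parameter counts relate in a sparse MoE model. The split between shared and per-expert parameters, the number of experts, and the `top_k` value are all hypothetical, chosen only so the total comes to 13B; they are not TIGER-OM's published configuration.

```python
def moe_param_counts(shared: float, per_expert: float,
                     num_experts: int, top_k: int) -> tuple[float, float]:
    """Return (total, active) parameter counts for a sparse MoE model.

    shared:      parameters used by every token (attention, embeddings, ...)
    per_expert:  parameters in one expert
    num_experts: experts stored in the checkpoint
    top_k:       experts actually run for each token
    """
    if not 1 <= top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared + num_experts * per_expert
    active = shared + top_k * per_expert
    return total, active


# Hypothetical numbers for illustration only; NOT TIGER-OM's real config.
total, active = moe_param_counts(shared=1.0e9, per_expert=3.0e9,
                                 num_experts=4, top_k=2)
print(f"total={total / 1e9:.1f}B active={active / 1e9:.1f}B")
```

With these made-up numbers, all 13B parameters must be held in memory, but only 7B take part in each token's forward pass.
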
+---
+## 🌟 Key Features
+- **True MoE Architecture** — Sparse activation for better efficiency and performance
+- **Think Mode Reasoning** — Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
+- **Dynamic Plugin System** — Intelligent routing to Code, Math, Search, Data Analysis plugins
+- **Agentic Capabilities** — Full LangGraph multi-agent workflow
+- **Advanced RAG Integration** — SKT RAG + Query Rewriting + Multi-hop + Reranking
+- **Stateful Memory** — Persistent conversation context
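
The plugin routing idea can be illustrated with a small dispatcher. This is only a sketch: the model card does not describe how routing is implemented, so simple keyword matching stands in for the "intelligent routing" here, and the plugin names, keyword lists, and `route` function are hypothetical.

```python
import re

# Hypothetical keyword lists per plugin; a real router would likely be learned
# or prompt-driven rather than keyword-based.
PLUGIN_KEYWORDS = {
    "math": {"calculate", "sum", "integral", "percent"},
    "code": {"python", "bug", "function", "compile"},
    "search": {"latest", "news", "find"},
    "data": {"csv", "dataset", "plot"},
}


def route(query: str) -> str:
    """Return the first plugin whose keywords appear in the query, else 'none'."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for plugin, keywords in PLUGIN_KEYWORDS.items():
        if words & keywords:
            return plugin
    return "none"  # no plugin matched: answer directly with the model


for q in ("Calculate training cost comparison", "Fix this Python bug", "Hello there"):
    print(q, "->", route(q))
```
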
+---
+## 🏗️ Architecture Breakdown
+**TIGER-OM** is built on a **13B MoE** backbone:
+- **Base**: Shrijanagain/ST-X-0 (strong foundational model)
+- **Experts**: Mistral-7B used as expert layers, fine-tuned for specialized reasoning and tool use
+- **Router Network**: Learned gating mechanism for expert selection
+- **Think Mode Layer**: Custom system prompt + reasoning controller
+- **Plugin Head**: Tool calling & execution layer
+
+This hybrid approach (ST-X-0 base + Mistral-7B experts) targets strong reasoning, code understanding, and general capability while keeping MoE efficiency.
+
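
The "Router Network" bullet mentions a learned gating mechanism without giving details. A common pattern in MoE models is top-k softmax gating, sketched below in plain Python. Treat it as an assumption about the general technique, not as TIGER-OM's actual router: in a real model the router logits come from a learned linear layer, whereas here they are passed in directly, and the toy scalar "experts" are purely illustrative.

```python
import math
from typing import Callable, Sequence


def top_k_gate(logits: Sequence[float], k: int) -> dict[int, float]:
    """Select the k highest-scoring experts and softmax-normalize their scores."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    norm = sum(exps.values())
    return {i: e / norm for i, e in exps.items()}


def moe_forward(x: float, experts: Sequence[Callable[[float], float]],
                router_logits: Sequence[float], k: int = 2) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = top_k_gate(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Toy scalar experts and made-up router logits, for illustration only.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 1, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]

weights = top_k_gate(logits, k=2)
result = moe_forward(3.0, experts, logits, k=2)
print(weights, result)
```

Only the two selected experts are evaluated for the input, which is where the efficiency of sparse activation comes from.
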
+---
+## 📁 Files in this Repo (Safetensors)
+- `model-00001-of-0000X.safetensors` → Main model weights
+- `config.json`
+- `tokenizer.json` / `tokenizer_config.json`
+- `generation_config.json`
+- `special_tokens_map.json`
+- `model.safetensors.index.json`
+
+**All weights are in `safetensors` format**, which avoids the arbitrary-code-execution risk of pickle-based checkpoints.
+
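
For sharded checkpoints, `model.safetensors.index.json` maps each tensor name to the shard file that stores it. The sketch below summarizes such an index. The example index is made up for illustration: its tensor names, shard filenames, and size are not taken from this repo.

```python
import json
from collections import Counter


def load_index(path: str) -> dict:
    """Read a sharded-checkpoint index file from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_index(index: dict) -> dict:
    """Count tensors per shard and report the total size recorded in the index."""
    tensors_per_shard = Counter(index["weight_map"].values())
    return {
        "total_size_bytes": index.get("metadata", {}).get("total_size"),
        "tensors_per_shard": dict(sorted(tensors_per_shard.items())),
    }


# Made-up index for illustration; names and sizes are NOT taken from this repo.
example_index = {
    "metadata": {"total_size": 1024},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
}

summary = summarize_index(example_index)
print(summary)
```

After downloading the repo, `summarize_index(load_index("model.safetensors.index.json"))` would give the same kind of summary for the real shards.
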
+---
+## 🚀 How to Use (Safetensors)
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+
+model_name = "Shrijanagain/TIGER-OM"
+
+# Load the tokenizer and the safetensors weights in bfloat16.
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",       # requires the `accelerate` package
+    trust_remote_code=True,  # runs custom code from the repo; review it first
+)
+
+prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
+User Query: Calculate training cost comparison and suggest best option..."""
+
+# Tokenize the prompt and move the tensors to the model's device.
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+# Sample up to 1024 new tokens.
+outputs = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    temperature=0.7,
+    top_p=0.9,
+    do_sample=True,
+    repetition_penalty=1.1,
+)
+
+# The decoded text contains the prompt followed by the completion.
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+---
+## 🔗 Important Links
+- **Live Demo**: [SKT-OM Space](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
+- **GGUF Quantized (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
+- **GitHub (RAG + ADK Code)**: [SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)
+---
+## 🛠️ Technologies & Stack
+- **Base Models**: Shrijanagain/ST-X-0 + Mistral-7B Experts
+- **RAG**: SKT RAG + AMD ADK Kit
+- **Agents**: LangGraph
+- **Hardware**: AMD MI300X + ROCm 7.0
+- **Inference**: vLLM (FP16) + transformers (Safetensors)
+- **Training**: AMD Developer Cloud
+---
+## ⚡ Performance
+- Balances **quality and efficiency** through sparse MoE activation
+- Targets reasoning, tool use, code, and multi-step tasks
+- Lower per-token compute than a dense model of comparable total size, because only the selected experts run for each token
+---
+## 📌 Use Cases
+- Complex technical Q&A
+- Agentic workflows & tool calling
+- Research assistance
+- Code generation & debugging
+- Mathematical & logical reasoning
+- Comparative analysis
+- Data analysis with plugins
+---
+## 🏆 Hackathon
+**AMD Developer Hackathon 2026**
+
+- Trained entirely on **AMD Developer Cloud**
+- Built fully in public, with multiple technical updates
+
+---
+## 📄 License
+MIT License
+
+---