---
license: mit
language:
- en
- hi
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- agent
- Qwen
- AI
- ST-X-0
- MIXTRAL
- TIGER OM
library_name: transformers
inference:
  parameters:
    temperature: 0.7
    max_new_tokens: 500
widget:
- text: "What are the latest trends in retrieval-augmented generation?"
  example_title: "General Query"
---

---

# 🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model

**Advanced 13B Mixture-of-Experts (MoE) model** optimized for agentic RAG, with Think Mode and a plugin architecture.

Built for the **AMD Developer Hackathon 2026** using AMD Developer Cloud.

---

## 📊 Model Details

- **Model Name**: TIGER-OM (SKT-OM)
- **Architecture**: **Mixture of Experts (MoE)**
- **Total Parameters**: 13B (only a subset of experts runs per token, so the active parameter count is lower)
- **Base Models**:
  - Primary Base: **Shrijanagain/ST-X-0**
  - Expert Integration: **Mistral-7B**
- **Format**: **Safetensors** (safe and fast loading)
- **Quantization**: FP16 / BF16 (original weights); a Q4_K_M GGUF build is available in a separate repo
- **Context Length**: 8192 tokens
- **Training Hardware**: AMD Developer Cloud GPUs ($100 developer credits)
- **Inference Optimized For**: ROCm 7.0 + vLLM + AMD MI300X

---

## 🌟 Key Features

- **True MoE Architecture**: Sparse activation for better efficiency and performance
- **Think Mode Reasoning**: Chain-of-thought, planning, self-reflection, and verification
- **Dynamic Plugin System**: Intelligent routing to Code, Math, Search, and Data Analysis plugins
- **Agentic Capabilities**: Full LangGraph multi-agent workflow
- **Advanced RAG Integration**: SKT RAG + query rewriting + multi-hop retrieval + reranking
- **Stateful Memory**: Persistent conversation context

---

## 🏗️ Architecture Breakdown

**TIGER-OM** is built on a **13B MoE** backbone:

- **Base**: Shrijanagain/ST-X-0 (foundational model)
- **Experts**: Expert layers derived from Mistral-7B and fine-tuned for specialized reasoning and tool use
- **Router Network**: Learned gating mechanism for expert selection
- **Think Mode Layer**: Custom system prompt + reasoning controller
- **Plugin Head**: Tool calling and execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) is intended to combine strong reasoning, code understanding, and general capability with the efficiency of sparse MoE activation.

---

## 📁 Files in this Repo (Safetensors)

- `model-00001-of-0000X.safetensors` → Main model weights
- `config.json`
- `tokenizer.json` / `tokenizer_config.json`
- `generation_config.json`
- `special_tokens_map.json`
- `model.safetensors.index.json`

**All weights are in the `safetensors` format**, so loading them does not execute pickled code.

---

## 🚀 How to Use (Safetensors)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

# Load the tokenizer and the safetensors weights in BF16.
# device_map="auto" places the layers on the available GPU(s).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.

User Query: Calculate training cost comparison and suggest best option..."""

# Tokenize the prompt and move the tensors to the model's device.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a response; do_sample=True is required for temperature/top_p to apply.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 🔗 Important Links

- **Live Demo**: [SKT-OM Space](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
- **GGUF Quantized (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
- **GitHub (RAG + ADK Code)**: [SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)

---

## 🛠️ Technologies & Stack

- **Base Models**: Shrijanagain/ST-X-0 + Mistral-7B experts
- **RAG**: SKT RAG + AMD ADK Kit
- **Agents**: LangGraph
- **Hardware**: AMD MI300X + ROCm 7.0
- **Inference**: vLLM (FP16) + transformers (Safetensors)
- **Training**: AMD Developer Cloud

---

## ⚡ Performance

- Balances **quality vs efficiency** through the MoE architecture, which activates only a subset of experts per token
- Strong performance on reasoning, tool use, code, and multi-step tasks
- Lower inference cost than dense 13B+ models, because only part of the parameters is used per token

---

## 📌 Use Cases

- Complex technical Q&A
- Agentic workflows and tool calling
- Research assistance
- Code generation and debugging
- Mathematical and logical reasoning
- Comparative analysis
- Data analysis with plugins

---

## 🏆 Hackathon

- Built for the **AMD Developer Hackathon 2026**
- Trained entirely on **AMD Developer Cloud**
- Fully built in public, with multiple technical updates

---

## 📄 License

MIT License

---
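
## 🧩 Appendix: MoE Routing Sketch

The "Router Network" entry in the Architecture Breakdown describes a learned gating mechanism for expert selection. The snippet below is a minimal, illustrative sketch of generic top-k gating in plain Python. It is not TIGER-OM's actual router: the number of experts, the value of `k`, and the helper names `softmax`, `top_k_route`, and `moe_forward` are all assumptions made for illustration, and the "experts" are toy scalar functions rather than neural layers.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits: Sequence[float], k: int = 2) -> List[Tuple[int, float]]:
    """Return (expert_index, weight) for the k highest-scoring experts.

    Weights are renormalized so the selected experts sum to 1.
    Hypothetical helper; k=2 is an assumption, not a documented setting.
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(
    x: float,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int = 2,
) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts standing in for expert feed-forward blocks (illustrative only).
toy_experts = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: -x,
    lambda x: x * x,
]
toy_logits = [0.1, 2.0, -1.0, 1.0]

routes = top_k_route(toy_logits, k=2)
output = moe_forward(3.0, toy_logits, toy_experts, k=2)
print(routes, output)
```

Only the `k` selected experts are evaluated for a given input, which is why a sparse MoE model uses fewer active parameters per token than its total parameter count suggests.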