	---
	license: mit
	language:
	- en
	- hi
	base_model:
	- mistralai/Mistral-7B-Instruct-v0.3
	tags:
	- agent
	- Qwen
	- AI
	- ST-X-0
	- MIXTRAL
	- TIGER OM
	library_name: transformers
	inference:
	  parameters:
	    temperature: 0.7
	    max_new_tokens: 500
	widget:
	- text: "What are the latest trends in retrieval-augmented generation?"
	  example_title: "General Query"
	---

	---
	# 🚀 TIGER-OM (SKT-OM) - 13B MoE Agentic Model

	A 13B Mixture-of-Experts (MoE) model optimized for agentic RAG, with Think Mode reasoning and a plugin architecture.

	Built for the AMD Developer Hackathon 2026 on AMD Developer Cloud.

	---

	## 📊 Model Details

	- Model Name: TIGER-OM (SKT-OM)
	- Architecture: Mixture of Experts (MoE)
	- Total Parameters: 13B (far fewer are active per token, because MoE routing is sparse)
	- Base Models:
	  - Primary Base: Shrijanagain/ST-X-0
	  - Expert Integration: Mistral-7B
	- Format: Safetensors (safe, fast loading)
	- Precision: FP16 / BF16 (original weights); a Q4_K_M GGUF quantization is available in a separate repo
	- Context Length: 8192 tokens
	- Training Hardware: AMD Developer Cloud GPUs ($100 developer credits)
	- Inference Optimized For: ROCm 7.0 + vLLM on AMD MI300X
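
As a rough sanity check on the figures above, weight-only memory is the parameter count times the bytes per parameter. The sketch below assumes "13B" means exactly 13e9 parameters and uses about 4.8 bits per weight for Q4_K_M, which is an approximation rather than a value taken from this repo. It ignores KV cache, activations, and runtime overhead.

```python
# Weight-only memory estimate; ignores KV cache, activations, and runtime overhead.
TOTAL_PARAMS = 13e9  # assumption: "13B" taken as exactly 13 billion parameters

# Bits per weight. The Q4_K_M value is an approximation, not measured on this model.
BITS_PER_WEIGHT = {"fp16": 16.0, "bf16": 16.0, "q4_k_m_approx": 4.8}


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_gb(TOTAL_PARAMS, bits):.1f} GB")
```

At 16 bits per weight this comes to about 26 GB for the full set of weights. MoE sparsity reduces compute per token, not the memory needed to hold all experts.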

	---

	## 🌟 Key Features

	- True MoE Architecture — Sparse activation for better efficiency and performance
	- Think Mode Reasoning — Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
	- Dynamic Plugin System — Intelligent routing to Code, Math, Search, Data Analysis plugins
	- Agentic Capabilities — Full LangGraph multi-agent workflow
	- Advanced RAG Integration — SKT RAG + Query Rewriting + Multi-hop + Reranking
	- Stateful Memory — Persistent conversation context

	---

	## 🏗️ Architecture Breakdown

	TIGER-OM is built on a 13B MoE backbone:

	- Base: Shrijanagain/ST-X-0 (strong foundational model)
	- Experts: Mistral-7B-based layers fine-tuned as experts for specialized reasoning and tool use
	- Router Network: Learned gating mechanism for expert selection
	- Think Mode Layer: Custom system prompt + reasoning controller
	- Plugin Head: Tool calling & execution layer

	This hybrid approach (ST-X-0 + Mistral-7B experts) is meant to combine strong reasoning, code understanding, and general capability with the efficiency of sparse MoE routing.
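
To make the router's role concrete, here is a minimal sketch of top-k softmax gating, a common MoE routing scheme. The card does not say which gating scheme, how many experts, or what value of k TIGER-OM uses, so `top_k_route`, `moe_forward`, `k=2`, and the toy experts are all illustrative assumptions.

```python
import math
from typing import Callable, List, Tuple


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of router logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits: List[float], k: int = 2) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x: float, experts: List[Callable[[float], float]],
                router_logits: List[float], k: int = 2) -> float:
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy usage: four hypothetical experts, of which only two run for this input.
toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print(routes, moe_forward(1.0, toy_experts, [0.1, 2.0, -1.0, 1.5]))
```

Because only the selected experts are evaluated, compute per token scales with k rather than with the total number of experts.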

	---

	## 📁 Files in this Repo (Safetensors)

	- `model-00001-of-0000X.safetensors` → Main model weights
	- `config.json`
	- `tokenizer.json` / `tokenizer_config.json`
	- `generation_config.json`
	- `special_tokens_map.json`
	- `model.safetensors.index.json`

	All weights are stored in the `safetensors` format, which avoids the arbitrary code execution risk of pickle-based checkpoints.
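
For sharded checkpoints, `model.safetensors.index.json` maps each tensor name to the shard file that holds it, under a `weight_map` key. The sketch below summarizes that mapping. The helper names, tensor names, and shard file names in the example are hypothetical placeholders, not values read from this repo.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict


def load_index(path: str) -> dict:
    """Read a safetensors index file from disk."""
    return json.loads(Path(path).read_text())


def shard_summary(index: dict) -> Dict[str, int]:
    """Count how many tensors each shard file holds."""
    return dict(Counter(index["weight_map"].values()))


# Hypothetical index content, used only to illustrate the layout.
example_index = {
    "metadata": {"total_size": 0},
    "weight_map": {
        "layer_a.weight": "shard_a.safetensors",
        "layer_b.weight": "shard_a.safetensors",
        "layer_c.weight": "shard_b.safetensors",
    },
}
print(shard_summary(example_index))
```

With a local copy of the repo, `shard_summary(load_index("model.safetensors.index.json"))` would give the same kind of summary for the real shards.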

	---

	## 🚀 How to Use (Safetensors)

	```python
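	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model_name = "Shrijanagain/TIGER-OM"

	# Load the tokenizer and the BF16 weights.
	# device_map="auto" requires the accelerate package.
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True,  # runs Python code shipped in the repo; review it first
	)

	prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
	User Query: Calculate training cost comparison and suggest best option..."""

	# Tokenize and move the tensors to the device that holds the model.
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Sample a completion; temperature and top_p only apply when do_sample=True.
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=1024,
	    temperature=0.7,
	    top_p=0.9,
	    do_sample=True,
	    repetition_penalty=1.1,
	)

	# The decoded text includes the prompt followed by the generated continuation.
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))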
	```
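
To reuse the prompt layout from the example above with other queries, a small helper can fill in the user query. This is a sketch that only mirrors the two-line prompt shown in this card. `build_prompt` is a hypothetical name, and the card does not document any further Think Mode prompt structure.

```python
# System line copied from the usage example in this card.
SYSTEM_LINE = "You are SKT-OM, an advanced agentic AI with Think Mode enabled."


def build_prompt(user_query: str) -> str:
    """Return the system line followed by the user query, as in the card's example."""
    return f"{SYSTEM_LINE}\nUser Query: {user_query.strip()}"


print(build_prompt("What are the latest trends in retrieval-augmented generation?"))
```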

	---

	## 🔗 Important Links

	- Live Demo: [SKT-OM Space](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
	- GGUF Quantized (Q4_K_M): [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
	- GitHub (RAG + ADK Code): [SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)

	---

	## 🛠️ Technologies & Stack

	- Base Models: Shrijanagain/ST-X-0 + Mistral-7B Experts
	- RAG: SKT RAG + AMD ADK Kit
	- Agents: LangGraph
	- Hardware: AMD MI300X + ROCm 7.0
	- Inference: vLLM (FP16) + transformers (Safetensors)
	- Training: AMD Developer Cloud
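
Since the stack lists vLLM for FP16 inference, a serving sketch might look like the following. It assumes a vLLM build that matches your hardware (for example, a ROCm build on MI300X) and that vLLM can load this checkpoint, which this card does not confirm. The helper names are hypothetical. The sampling values mirror the widget settings and the `generate()` example in this card.

```python
def sampling_settings(max_tokens: int = 500) -> dict:
    """Sampling values mirrored from this card's widget and generate() example."""
    return {
        "temperature": 0.7,
        "top_p": 0.9,
        "repetition_penalty": 1.1,
        "max_tokens": max_tokens,
    }


def generate_with_vllm(prompt: str, model_name: str = "Shrijanagain/TIGER-OM") -> str:
    """Generate one completion with vLLM (assumes vLLM supports this checkpoint)."""
    # Imported lazily so sampling_settings() works without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model=model_name,
        dtype="float16",
        trust_remote_code=True,  # runs code shipped in the repo; review it first
        max_model_len=8192,      # context length stated in Model Details
    )
    outputs = llm.generate([prompt], SamplingParams(**sampling_settings()))
    return outputs[0].outputs[0].text
```

Calling `generate_with_vllm("...")` loads the model on every call. For repeated queries, create the `LLM` object once and reuse it.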

	---

	## ⚡ Performance

	- MoE routing is designed to balance output quality against inference efficiency
	- Targeted at reasoning, tool use, code, and multi-step tasks
	- Lower compute per token than a dense model with the same 13B total parameters, because only the routed experts run for each token

	---

	## 📌 Use Cases

	- Complex technical Q&A
	- Agentic workflows & tool calling
	- Research assistance
	- Code generation & debugging
	- Mathematical & logical reasoning
	- Comparative analysis
	- Data analysis with plugins

	---

	## 🏆 Hackathon

	- Event: AMD Developer Hackathon 2026
	- Trained entirely on AMD Developer Cloud
	- Built fully in public, with multiple technical updates

	---

	## 📄 License

	MIT License

	---