---
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- llama-cpp
license: mit
datasets:
- SKT-NRS/SKT-OMNI-CORPUS-146T-V1
language:
- en
- hi
pipeline_tag: text-generation
library_name: transformers
---
# 🚀 SKT-OM (TIGER-OM) - Agentic RAG System
**Advanced 13B Agentic RAG with Think Mode + Dynamic Plugins + LangGraph**
Built for **AMD Developer Hackathon 2026** on AMD Developer Cloud.
---
## 🌟 Project Overview
**SKT-OM** (also known as **TIGER-OM**) is a powerful **13B parameter fully agentic Retrieval-Augmented Generation (RAG)** system. It goes far beyond traditional RAG by integrating:
- **Think Mode** β€” Advanced multi-step reasoning engine
- **Dynamic Plugin Architecture** β€” Intelligent tool selection & execution
- **LangGraph Multi-Agent Workflow** β€” Stateful agent collaboration
- **SKT RAG** β€” High-performance retrieval pipeline
The system takes natural language queries and returns intelligent, reasoned, and accurate responses with tool usage and verification.
---
## 📊 Model Details
- **Model Name**: TIGER-OM (SKT-OM)
- **Parameters**: 13 Billion
- **Base Model**: Custom trained on AMD hardware
- **Quantization**: **Q4_K_M** (Excellent balance between quality and size)
- **GGUF Format**: Optimized for CPU + GPU inference
- **Training Hardware**: AMD Developer Cloud GPUs ($100 credits)
- **Inference**: ROCm 7.0 + vLLM (Full FP16) + GGUF (Q4_K_M)
The **Q4_K_M** version aims for near-FP16 reasoning quality while using far less memory and running faster on consumer and professional hardware.
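
As a rough illustration of the memory saving, the sketch below estimates weight storage for a 13B-parameter model at FP16 and at Q4_K_M. The 4.85 bits-per-weight figure is an assumption (the effective rate of Q4_K_M varies with architecture and tensor mix and was not measured on this model), and the estimate ignores the KV cache and runtime overhead.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 13e9       # 13 billion parameters, as listed in Model Details
FP16_BITS = 16.0      # full half-precision weights
Q4_K_M_BITS = 4.85    # ASSUMED average bits per weight; not measured on this model

fp16_gb = weight_size_gb(N_PARAMS, FP16_BITS)
q4_gb = weight_size_gb(N_PARAMS, Q4_K_M_BITS)

print(f"FP16 weights:   ~{fp16_gb:.1f} GB")
print(f"Q4_K_M weights: ~{q4_gb:.1f} GB")
print(f"Reduction:      ~{fp16_gb / q4_gb:.1f}x")
```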
---
## ✨ Key Features
- **Think Mode Engine**: Chain-of-Thought, Self-Reflection, Verification Loops, and Self-Critique
- **Plugin Ecosystem**: Code Runner, Math Solver, Web Search, Data Analyzer, Document Parser + Custom Plugins
- **Advanced RAG**: SKT RAG with query rewriting, multi-hop retrieval, reranking & contextual compression
- **Multi-Agent System**: LangGraph powered stateful workflow
- **Memory**: Persistent conversation state
- **Tool Use**: Dynamic plugin routing based on query intent
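
To make "dynamic plugin routing based on query intent" concrete, here is a minimal sketch that uses keyword patterns to pick a plugin. It is not the project's actual router: the plugin functions, the `ROUTES` table, and the patterns are all hypothetical stand-ins, and a real system might classify intent with the LLM itself.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical plugin stubs; the real plugins live in the project repository.
def math_solver(query: str) -> str:
    return f"[math_solver] {query}"

def code_runner(query: str) -> str:
    return f"[code_runner] {query}"

def web_search(query: str) -> str:
    return f"[web_search] {query}"

Plugin = Callable[[str], str]

# (pattern, plugin) pairs checked in order; the first match wins.
ROUTES: List[Tuple["re.Pattern[str]", Plugin]] = [
    (re.compile(r"\b(solve|integral|derivative|equation)\b|\d+\s*[-+*/^]\s*\d+", re.I),
     math_solver),
    (re.compile(r"\b(python|code|script|function|run)\b", re.I),
     code_runner),
]

def route(query: str) -> Plugin:
    """Return the first plugin whose pattern matches, else fall back to web search."""
    for pattern, plugin in ROUTES:
        if pattern.search(query):
            return plugin
    return web_search

if __name__ == "__main__":
    for q in ["Solve 2 + 2", "Write a Python script to sort a list", "Who won the 2022 World Cup?"]:
        print(route(q)(q))
```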
---
## 🔗 Important Links
- **Live Demo**: [https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
- **Main Model Repo**: [Shrijanagain/TIGER-OM](https://huggingface.co/Shrijanagain/TIGER-OM)
- **GGUF Quantized Models (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
- **GitHub Repository (RAG + ADK)**: [https://github.com/SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)
---
## How It Works
```mermaid
graph TD
A[User Query] --> B[Think Mode]
B --> C[Decomposition & Planning]
C --> D[Plugin Router]
C --> E[SKT RAG Retrieval]
D --> F[Execute Plugins]
E --> G[Context Processing]
F & G --> H[Verification Loop]
H --> I[LangGraph Synthesis]
I --> J[Final Response]
```
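The flow in the diagram can be sketched in plain Python as a sequence of functions over a shared state. This is only an illustration of the control flow: every function body is a placeholder, the `State` fields and the `max_rounds` retry limit are assumptions, and the actual project builds this workflow with LangGraph rather than hand-written calls.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """Shared state passed between stages (field names are hypothetical)."""
    query: str
    plan: List[str] = field(default_factory=list)
    plugin_results: List[str] = field(default_factory=list)
    contexts: List[str] = field(default_factory=list)
    verified: bool = False
    answer: str = ""


def think(state: State) -> State:
    # Think Mode + decomposition: split the query into sub-tasks (toy heuristic).
    state.plan = [p.strip() for p in state.query.split(" and ") if p.strip()]
    return state


def run_plugins(state: State) -> State:
    # Plugin router + execution (placeholder results).
    state.plugin_results = [f"plugin-result({step})" for step in state.plan]
    return state


def retrieve(state: State) -> State:
    # RAG retrieval + context processing (placeholder contexts).
    state.contexts = [f"context({step})" for step in state.plan]
    return state


def verify(state: State) -> State:
    # Verification: here, simply require output from both branches.
    state.verified = bool(state.plugin_results) and bool(state.contexts)
    return state


def synthesize(state: State) -> State:
    # Synthesis: merge everything into the final response.
    status = "verified" if state.verified else "unverified"
    state.answer = f"[{status}] " + "; ".join(state.plan)
    return state


def run_pipeline(query: str, max_rounds: int = 2) -> State:
    state = think(State(query=query))
    for _ in range(max_rounds):  # verification loop with a retry cap
        state = verify(retrieve(run_plugins(state)))
        if state.verified:
            break
    return synthesize(state)


if __name__ == "__main__":
    print(run_pipeline("compute the mean and plot the data").answer)
```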
---
## 🛠️ Technologies Used
- **LLM**: 13B TIGER-OM (Q4_K_M GGUF)
- **RAG Framework**: SKT RAG + ADK Kit
- **Agent Framework**: LangGraph
- **GPU Stack**: ROCm 7.0 + AMD ADK Kit
- **Inference**: vLLM (FP16) + llama.cpp (GGUF Q4_K_M)
- **Hardware**: AMD MI300X
- **Cloud**: AMD Developer Cloud
---
## 🚀 Quick Start - GGUF Q4_K_M
```bash
# Using llama.cpp
./llama-cli \
  -m tiger-om-q4_k_m.gguf \
  -p "Your complex query here..." \
  -n 1024 \
  -t 8 \
  --temp 0.7
```
**Python Example (llama-cpp-python)**
```python
from llama_cpp import Llama

llm = Llama(
    model_path="tiger-om-q4_k_m.gguf",
    n_gpu_layers=-1,  # Use all GPU layers
    n_ctx=8192,
    verbose=False,
)
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain..."}],
    temperature=0.7,
    max_tokens=1024,
)
print(response["choices"][0]["message"]["content"])
```
```
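For long answers it can help to stream tokens as they are generated. The helper below is a small sketch that concatenates the text from OpenAI-style streaming chunks, the shape `create_chat_completion` yields when called with `stream=True`. The helper name `collect_stream` and its `echo` flag are illustrative, not part of llama-cpp-python. With the `llm` object from the example above, it would be called as `collect_stream(llm.create_chat_completion(messages=[...], stream=True))`.

```python
from typing import Any, Dict, Iterable


def collect_stream(chunks: Iterable[Dict[str, Any]], echo: bool = True) -> str:
    """Join the text deltas from a chat-completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Each chunk carries a "delta"; the first usually holds only the role,
        # and the last may be empty, so "content" can be missing.
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            parts.append(text)
            if echo:
                print(text, end="", flush=True)
    return "".join(parts)


if __name__ == "__main__":
    # Hand-written chunks standing in for a real model stream.
    fake_stream = [
        {"choices": [{"delta": {"role": "assistant"}}]},
        {"choices": [{"delta": {"content": "Hel"}}]},
        {"choices": [{"delta": {"content": "lo"}}]},
        {"choices": [{"delta": {}}]},
    ]
    collect_stream(fake_stream)
```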
---
## 📁 Repository Structure
- `/skt_ai_labs`: Core ADK + RAG integration
- `/plugins`: Plugin system
- `/agents`: LangGraph workflows
- `/examples`: Ready-to-use examples
- `/docs`: Architecture & guides
---
## 🏆 Hackathon Information
- **Event**: AMD Developer Hackathon 2026
- **Trained on**: AMD Developer Cloud ($100 credits)
- **Built in Public**: Regular technical updates shared
- **Goal**: Showcase agentic AI on the AMD ROCm ecosystem
---
## 📄 License
*MIT*