---
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
tags:
- llama-cpp
license: mit
datasets:
- SKT-NRS/SKT-OMNI-CORPUS-146T-V1
language:
- en
- hi
pipeline_tag: text-generation
library_name: transformers
---
# 🚀 SKT-OM (TIGER-OM) - Agentic RAG System

**Advanced 13B Agentic RAG with Think Mode + Dynamic Plugins + LangGraph**

Built for **AMD Developer Hackathon 2026** on AMD Developer Cloud.

---

## 🌟 Project Overview

**SKT-OM** (also known as **TIGER-OM**) is a **13B-parameter, fully agentic Retrieval-Augmented Generation (RAG)** system. It extends traditional RAG by integrating:

- **Think Mode**: Advanced multi-step reasoning engine
- **Dynamic Plugin Architecture**: Intelligent tool selection & execution
- **LangGraph Multi-Agent Workflow**: Stateful agent collaboration
- **SKT RAG**: High-performance retrieval pipeline

The system takes natural-language queries and returns reasoned responses, calling tools and verifying intermediate results along the way.

---

## 📊 Model Details

- **Model Name**: TIGER-OM (SKT-OM)
- **Parameters**: 13 Billion
- **Base Model**: Custom trained on AMD hardware
- **Quantization**: **Q4_K_M** (Excellent balance between quality and size)
- **GGUF Format**: Optimized for CPU + GPU inference
- **Training Hardware**: AMD Developer Cloud GPUs ($100 credits)
- **Inference**: ROCm 7.0 + vLLM (Full FP16) + GGUF (Q4_K_M)

The **Q4_K_M version** aims for reasoning quality close to FP16 while using far less memory and running faster on consumer and professional hardware.
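
To put the memory saving in numbers, here is a rough back-of-the-envelope estimate for the weights alone. It assumes Q4_K_M averages about 4.85 bits per weight, which is an approximate figure rather than a measured property of this model, and it ignores the KV cache and runtime overhead.

```python
# Rough weight-memory estimate for a 13B-parameter model.
# Assumption: Q4_K_M averages ~4.85 bits per weight (approximate, not measured here).
PARAMS = 13e9


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Return the size of the weights alone, in GiB."""
    return params * bits_per_weight / 8 / 2**30


fp16_gib = weight_gib(PARAMS, 16.0)
q4_gib = weight_gib(PARAMS, 4.85)

print(f"FP16:   {fp16_gib:.1f} GiB")
print(f"Q4_K_M: {q4_gib:.1f} GiB")
print(f"Ratio:  {fp16_gib / q4_gib:.1f}x smaller")
```

Under these assumptions the FP16 weights need roughly 24 GiB, while the Q4_K_M weights need roughly 7 GiB.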

---

## ✨ Key Features

- **Think Mode Engine**: Chain-of-Thought, Self-Reflection, Verification Loops, and Self-Critique
- **Plugin Ecosystem**: Code Runner, Math Solver, Web Search, Data Analyzer, Document Parser + Custom Plugins
- **Advanced RAG**: SKT RAG with query rewriting, multi-hop retrieval, reranking & contextual compression
- **Multi-Agent System**: LangGraph powered stateful workflow
- **Memory**: Persistent conversation state
- **Tool Use**: Dynamic plugin routing based on query intent
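
As an illustration of routing by query intent, the sketch below picks a plugin by counting keyword matches. This README does not describe the actual SKT-OM router, so the function names, keyword lists, and the `rag` fallback are all hypothetical stand-ins for whatever classifier the project uses.

```python
from typing import Callable, Dict, List


# Hypothetical plugin handlers; the real plugins are not described in this README.
def math_solver(query: str) -> str:
    return f"[math_solver] would handle: {query}"


def web_search(query: str) -> str:
    return f"[web_search] would handle: {query}"


def code_runner(query: str) -> str:
    return f"[code_runner] would handle: {query}"


PLUGINS: Dict[str, Callable[[str], str]] = {
    "math_solver": math_solver,
    "web_search": web_search,
    "code_runner": code_runner,
}

# Assumed keyword lists standing in for a learned intent classifier.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "math_solver": ["solve", "integral", "equation", "calculate"],
    "web_search": ["latest", "news", "today", "search"],
    "code_runner": ["run", "execute", "python", "script"],
}


def route(query: str) -> str:
    """Return the plugin with the most keyword hits, or 'rag' if none match."""
    words = [w.strip(".,?!") for w in query.lower().split()]
    scores = {
        name: sum(word in keywords for word in words)
        for name, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "rag"


def dispatch(query: str) -> str:
    """Run the selected plugin, or signal that plain retrieval should answer."""
    name = route(query)
    if name == "rag":
        return f"[rag] would handle: {query}"
    return PLUGINS[name](query)


print(dispatch("Solve the equation x^2 = 4"))
print(dispatch("What is the capital of France?"))
```

A keyword table keeps the example self-contained; a production router would more plausibly ask the model itself to choose a tool.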

---

## 🔗 Important Links

- **Live Demo**: [https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
- **Main Model Repo**: [Shrijanagain/TIGER-OM](https://huggingface.co/Shrijanagain/TIGER-OM)
- **GGUF Quantized Models (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
- **GitHub Repository (RAG + ADK)**: [https://github.com/SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)

---

## How It Works

```mermaid
graph TD
    A[User Query] --> B[Think Mode]
    B --> C[Decomposition & Planning]
    C --> D[Plugin Router]
    C --> E[SKT RAG Retrieval]
    D --> F[Execute Plugins]
    E --> G[Context Processing]
    F & G --> H[Verification Loop]
    H --> I[LangGraph Synthesis]
    I --> J[Final Response]
```
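
The control flow in the diagram can be sketched in plain Python. This is a minimal illustration with stubbed stages, not the project's LangGraph implementation: every function, the `State` fields, and the `max_attempts` limit are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """Shared state passed between stages (hypothetical fields)."""
    query: str
    plan: List[str] = field(default_factory=list)
    tool_results: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    verified: bool = False
    attempts: int = 0


def plan_steps(state: State) -> State:
    # Stub decomposition: split the query on " and " into sub-tasks.
    state.plan = [p.strip() for p in state.query.split(" and ") if p.strip()]
    return state


def execute_plugins(state: State) -> State:
    # Stub for the plugin branch: one placeholder result per sub-task.
    state.tool_results = [f"tool result for '{s}'" for s in state.plan]
    return state


def retrieve(state: State) -> State:
    # Stub for the retrieval branch: one placeholder passage per sub-task.
    state.context = [f"retrieved passage for '{s}'" for s in state.plan]
    return state


def verify(state: State) -> State:
    # Stub check: every sub-task must have both a tool result and a passage.
    state.attempts += 1
    n = len(state.plan)
    state.verified = n > 0 and len(state.tool_results) == n and len(state.context) == n
    return state


def synthesize(state: State) -> str:
    evidence = state.tool_results + state.context
    return f"Answer to '{state.query}' based on {len(evidence)} pieces of evidence"


def run(query: str, max_attempts: int = 3) -> str:
    """Plan once, then repeat both branches until verification passes."""
    state = plan_steps(State(query=query))
    while not state.verified and state.attempts < max_attempts:
        state = verify(retrieve(execute_plugins(state)))
    return synthesize(state)


print(run("compare ROCm and CUDA"))
```

The loop bound matters in practice: without a cap on attempts, a query that never passes verification would retry forever.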

---

## πŸ› οΈ Technologies Used

- **LLM**: 13B TIGER-OM (Q4_K_M GGUF)
- **RAG Framework**: SKT RAG + ADK Kit
- **Agent Framework**: LangGraph
- **GPU Stack**: ROCm 7.0 + AMD ADK Kit
- **Inference**: vLLM (FP16) + llama.cpp (GGUF Q4_K_M)
- **Hardware**: AMD MI300X
- **Cloud**: AMD Developer Cloud

---

## 🚀 Quick Start - GGUF Q4_K_M

```bash
# Using llama.cpp
./llama-cli \
  -m tiger-om-q4_k_m.gguf \
  -p "Your complex query here..." \
  -n 1024 \
  -t 8 \
  --temp 0.7
```

**Python Example (llama-cpp-python)**

```python
from llama_cpp import Llama

llm = Llama(
    model_path="tiger-om-q4_k_m.gguf",
    n_gpu_layers=-1,      # Offload all model layers to the GPU
    n_ctx=8192,
    verbose=False
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain..."}],
    temperature=0.7,
    max_tokens=1024
)

print(response['choices'][0]['message']['content'])
```

---

## πŸ“ Repository Structure

- `/skt_ai_labs`: Core ADK + RAG integration
- `/plugins`: Plugin system
- `/agents`: LangGraph workflows
- `/examples`: Ready-to-use examples
- `/docs`: Architecture & guides

---

## πŸ† Hackathon Information

- **Event**: AMD Developer Hackathon 2026
- **Trained on**: AMD Developer Cloud ($100 credits)
- **Built in Public**: Regular technical updates shared
- **Goal**: Showcase agentic AI running on the AMD ROCm ecosystem

---

## 📄 License

*MIT*