Shrijanagain/TIGER-GGUF

@@ -1,50 +1,155 @@
 ---
-base_model: Shrijanagain/TIGER-PASS-V1-ARCHIVE
 tags:
 - llama-cpp
-- gguf-my-repo
 ---
-# Shrijanagain/TIGER-PASS-V1-ARCHIVE-Q4_K_M-GGUF
-This model was converted to GGUF format from [`Shrijanagain/TIGER-PASS-V1-ARCHIVE`](https://huggingface.co/Shrijanagain/TIGER-PASS-V1-ARCHIVE) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
-Refer to the [original model card](https://huggingface.co/Shrijanagain/TIGER-PASS-V1-ARCHIVE) for more details on the model.
-## Use with llama.cpp
-Install llama.cpp through brew (works on Mac and Linux)
-```bash
-brew install llama.cpp
-```
-Invoke the llama.cpp server or the CLI.
-### CLI:
-```bash
-llama-cli --hf-repo Shrijanagain/TIGER-PASS-V1-ARCHIVE-Q4_K_M-GGUF --hf-file tiger-pass-v1-archive-q4_k_m.gguf -p "The meaning to life and the universe is"
 ```
-### Server:
 ```bash
-llama-server --hf-repo Shrijanagain/TIGER-PASS-V1-ARCHIVE-Q4_K_M-GGUF --hf-file tiger-pass-v1-archive-q4_k_m.gguf -c 2048
 ```
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
-Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
-```
-cd llama.cpp && LLAMA_CURL=1 make
-```
-Step 3: Run inference through the main binary.
-```
-./llama-cli --hf-repo Shrijanagain/TIGER-PASS-V1-ARCHIVE-Q4_K_M-GGUF --hf-file tiger-pass-v1-archive-q4_k_m.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo Shrijanagain/TIGER-PASS-V1-ARCHIVE-Q4_K_M-GGUF --hf-file tiger-pass-v1-archive-q4_k_m.gguf -c 2048
 ```

 ---
+base_model:
+- mistralai/Mistral-7B-Instruct-v0.3
 tags:
 - llama-cpp
+license: mit
+datasets:
+- SKT-NRS/SKT-OMNI-CORPUS-146T-V1
+language:
+- en
+- hi
+pipeline_tag: text-generation
+library_name: transformers
 ---
+# 🚀 SKT-OM (TIGER-OM) - Agentic RAG System
+**Advanced 13B Agentic RAG with Think Mode + Dynamic Plugins + LangGraph**
+Built for **AMD Developer Hackathon 2026** on AMD Developer Cloud.
+---
+## 🌟 Project Overview
+**SKT-OM** (also known as **TIGER-OM**) is a **13B-parameter, fully agentic Retrieval-Augmented Generation (RAG)** system. It extends the usual retrieve-then-generate RAG pipeline with:
+- **Think Mode** — Advanced multi-step reasoning engine
+- **Dynamic Plugin Architecture** — Intelligent tool selection & execution
+- **LangGraph Multi-Agent Workflow** — Stateful agent collaboration
+- **SKT RAG** — High-performance retrieval pipeline
+The system takes a natural-language query, plans the steps needed to answer it, calls plugins and retrieval as required, verifies the intermediate results, and returns a reasoned response.
+---
+## 📊 Model Details
+- **Model Name**: TIGER-OM (SKT-OM)
+- **Parameters**: 13 Billion
+- **Base Model**: `mistralai/Mistral-7B-Instruct-v0.3` (as listed in the metadata above), custom trained on AMD hardware
+- **Quantization**: **Q4_K_M** (a common balance between output quality and file size)
+- **GGUF Format**: Optimized for CPU + GPU inference
+- **Training Hardware**: AMD Developer Cloud GPUs ($100 credits)
+- **Inference**: ROCm 7.0 + vLLM (Full FP16) + GGUF (Q4_K_M)
+The **Q4_K_M version** is intended to keep reasoning quality close to FP16 while using far less memory, which makes it practical to run on consumer and workstation hardware.
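To make the memory trade-off concrete, here is a rough back-of-the-envelope estimate of weight size for a 13B-parameter model. The figure of about 4.85 bits per weight for Q4_K_M is an assumption for illustration only; the real value depends on the architecture and tensor mix, and the KV cache and runtime buffers add to the total.

```python
# Rough weight-memory estimate for a 13B-parameter model.
# ASSUMPTION: ~4.85 bits per weight for Q4_K_M (illustrative, not measured
# on this model). KV cache and runtime buffers are not included.
PARAMS = 13e9


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Return the approximate size of the weights in GiB."""
    return params * bits_per_weight / 8 / 2**30


fp16 = weight_gib(PARAMS, 16.0)
q4_k_m = weight_gib(PARAMS, 4.85)

print(f"FP16 weights:   {fp16:.1f} GiB")
print(f"Q4_K_M weights: {q4_k_m:.1f} GiB")
print(f"Reduction:      {fp16 / q4_k_m:.1f}x")
```

Under these assumptions the quantized weights fit comfortably where the FP16 weights would not, which is the main reason to ship a GGUF build alongside the full-precision model.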
+---
+## ✨ Key Features
+- **Think Mode Engine**: Chain-of-Thought, Self-Reflection, Verification Loops, and Self-Critique
+- **Plugin Ecosystem**: Code Runner, Math Solver, Web Search, Data Analyzer, Document Parser + Custom Plugins
+- **Advanced RAG**: SKT RAG with query rewriting, multi-hop retrieval, reranking & contextual compression
+- **Multi-Agent System**: LangGraph powered stateful workflow
+- **Memory**: Persistent conversation state
+- **Tool Use**: Dynamic plugin routing based on query intent
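The "dynamic plugin routing based on query intent" idea can be sketched as follows. This is a minimal illustration, not the project's actual router: the plugin functions, the `INTENT_KEYWORDS` table, and the keyword-counting heuristic are all hypothetical stand-ins for whatever intent classifier the system really uses.

```python
from typing import Callable, Dict, Tuple


# Hypothetical plugins: each takes the query text and returns a string.
def math_solver(query: str) -> str:
    return f"[math_solver] would evaluate: {query}"


def code_runner(query: str) -> str:
    return f"[code_runner] would execute: {query}"


def web_search(query: str) -> str:
    return f"[web_search] would look up: {query}"


PLUGINS: Dict[str, Callable[[str], str]] = {
    "math_solver": math_solver,
    "code_runner": code_runner,
    "web_search": web_search,
}

# Hypothetical keyword table standing in for a real intent classifier.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "math_solver": ("solve", "integral", "equation", "calculate"),
    "code_runner": ("python", "run", "script", "code"),
    "web_search": ("latest", "news", "today", "search"),
}


def route(query: str) -> str:
    """Return the name of the plugin whose keywords match the query most often."""
    words = [w.strip(".,?!") for w in query.lower().split()]
    scores = {
        name: sum(word in keywords for word in words)
        for name, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to web search when no keyword matches at all.
    return best if scores[best] > 0 else "web_search"


def run(query: str) -> str:
    """Route the query and execute the selected plugin."""
    return PLUGINS[route(query)](query)


print(run("Solve the equation x + 2 = 5"))
```

A registry keyed by plugin name keeps routing and execution separate, so custom plugins can be added by registering one more function and one more keyword tuple.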
+---
+## 🔗 Important Links
+- **Live Demo**: [https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM](https://huggingface.co/spaces/lablab-ai-amd-developer-hackathon/SKT-OM)
+- **Main Model Repo**: [Shrijanagain/TIGER-OM](https://huggingface.co/Shrijanagain/TIGER-OM)
+- **GGUF Quantized Models (Q4_K_M)**: [Shrijanagain/TIGER-GGUF](https://huggingface.co/Shrijanagain/TIGER-GGUF)
+- **GitHub Repository (RAG + ADK)**: [https://github.com/SHRIJANAGAIN/SKT-AMD-FILES](https://github.com/SHRIJANAGAIN/SKT-AMD-FILES)
+---
+## How It Works
+```mermaid
+graph TD
+    A[User Query] --> B[Think Mode]
+    B --> C[Decomposition & Planning]
+    C --> D[Plugin Router]
+    C --> E[SKT RAG Retrieval]
+    D --> F[Execute Plugins]
+    E --> G[Context Processing]
+    F & G --> H[Verification Loop]
+    H --> I[LangGraph Synthesis]
+    I --> J[Final Response]
 ```
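The flow in the diagram can be sketched in plain Python to show how state moves between stages. This is an illustrative sketch only: it does not use the LangGraph API, and every function body is a hypothetical placeholder for the real planner, plugins, retriever, and verifier.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class State:
    """Shared state passed from stage to stage (hypothetical schema)."""
    query: str
    steps: List[str] = field(default_factory=list)
    tool_results: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    verified: bool = False
    answer: str = ""


def plan(state: State) -> State:
    # Think Mode + Decomposition & Planning: naive split into sub-tasks.
    state.steps = [s.strip() for s in state.query.split(" and ") if s.strip()]
    return state


def execute_plugins(state: State) -> State:
    # Plugin Router + Execute Plugins: one placeholder result per step.
    state.tool_results = [f"tool-result({s})" for s in state.steps]
    return state


def retrieve(state: State) -> State:
    # SKT RAG Retrieval + Context Processing: one placeholder passage per step.
    state.context = [f"passage-about({s})" for s in state.steps]
    return state


def verify(state: State) -> State:
    # Verification Loop: placeholder check that every step has evidence.
    n = len(state.steps)
    state.verified = n > 0 and len(state.tool_results) == n and len(state.context) == n
    return state


def synthesize(state: State) -> State:
    # Synthesis: combine tool results and retrieved context into an answer.
    evidence = state.tool_results + state.context
    state.answer = f"Answer to {state.query!r} using {len(evidence)} pieces of evidence"
    return state


def run_pipeline(query: str, max_retries: int = 2) -> State:
    state = plan(State(query))
    for _ in range(max_retries + 1):
        state = verify(retrieve(execute_plugins(state)))
        if state.verified:
            break
    return synthesize(state)


print(run_pipeline("compare ROCm and CUDA").answer)
```

In a graph framework each function would become a node and the retry loop a conditional edge from the verification node back to the execution nodes.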
+---
+## 🛠️ Technologies Used
+- **LLM**: 13B TIGER-OM (Q4_K_M GGUF)
+- **RAG Framework**: SKT RAG + ADK Kit
+- **Agent Framework**: LangGraph
+- **GPU Stack**: ROCm 7.0 + AMD ADK Kit
+- **Inference**: vLLM (FP16) + llama.cpp (GGUF Q4_K_M)
+- **Hardware**: AMD MI300X
+- **Cloud**: AMD Developer Cloud
+---
+## 🚀 Quick Start - GGUF Q4_K_M
 ```bash
+# Using llama.cpp
+./llama-cli \
+  -m tiger-om-q4_k_m.gguf \
+  -p "Your complex query here..." \
+  -n 1024 \
+  -t 8 \
+  --temp 0.7
 ```
+**Python Example (llama-cpp-python)**
+```python
+from llama_cpp import Llama
+llm = Llama(
+    model_path="tiger-om-q4_k_m.gguf",
+    n_gpu_layers=-1,      # Offload all layers to the GPU
+    n_ctx=8192,
+    verbose=False
+)
+response = llm.create_chat_completion(
+    messages=[{"role": "user", "content": "Explain..."}],
+    temperature=0.7,
+    max_tokens=1024
+)
+print(response['choices'][0]['message']['content'])
 ```
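When using the GGUF model on its own, retrieved passages have to be placed in the prompt by the caller. The helper below is a minimal sketch of one way to do that; `build_rag_messages` and its prompt wording are hypothetical and not part of this repository. It puts everything in a single user message so it does not depend on the chat template supporting a system role.

```python
from typing import Dict, List


def build_rag_messages(question: str, passages: List[str]) -> List[Dict[str, str]]:
    """Pack retrieved passages and the question into one user message."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    content = (
        "Answer using only the numbered context passages and cite them as [n]. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}"
    )
    return [{"role": "user", "content": content}]


messages = build_rag_messages(
    "Which quantization does this repo ship?",
    ["The TIGER-GGUF repo provides Q4_K_M GGUF files."],
)
print(messages[0]["content"])

# With an `llm` object created as in the example above, the call would be:
# response = llm.create_chat_completion(messages=messages, temperature=0.7, max_tokens=1024)
```

Numbering the passages gives the model a simple way to cite its sources, which also makes answers easier to check against the retrieved text.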
+---
+## 📁 Repository Structure
+- `/skt_ai_labs` — Core ADK + RAG integration
+- `/plugins` — Plugin system
+- `/agents` — LangGraph workflows
+- `/examples` — Ready-to-use examples
+- `/docs` — Architecture & guides
+---
+## 🏆 Hackathon Information
+- **Event**: AMD Developer Hackathon 2026
+- **Trained on**: AMD Developer Cloud ($100 credits)
+- **Built in Public**: Regular technical updates shared
+- **Goal**: Showcase agentic AI on the AMD ROCm ecosystem
+---
+## 📄 License
+*MIT*