Spaces: Lablab-ai/amd-gradio-workshop-demo

stevekimoi, Claude Sonnet 4.6 committed 3 days ago

Commit 1f14b87

1 Parent(s): 6f28b5c

Add AMD Gradio demo app with vLLM chat interface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (4)

.gitignore ADDED

+venv/
+.env
+__pycache__/
+*.pyc
+.DS_Store

README.md CHANGED

@@ -1,15 +1,29 @@
 ---
-title: Amd Gradio Workshop Demo
-emoji: 🌖
-colorFrom: indigo
-colorTo: blue
 sdk: gradio
-sdk_version: 6.14.0
-python_version: '3.13'
 app_file: app.py
 pinned: false
 license: mit
-short_description: This is just a workshop demo
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: AMD HuggingFace Workshop Demo
+emoji: 🚀
+colorFrom: red
+colorTo: yellow
 sdk: gradio
+sdk_version: 4.44.1
 app_file: app.py
 pinned: false
 license: mit
+tags:
+  - amd
+  - amd-hackathon-2026
+  - vllm
+  - gradio
 ---
+# AMD MI300X AI Demo
+A Gradio chat interface connected to a vLLM endpoint running on AMD MI300X GPU.
+## Setup
+Add these as Space secrets (Settings → Variables and secrets):
+| Secret | Value |
+|--------|-------|
+| `VLLM_BASE_URL` | Your AMD vLLM endpoint, e.g. `http://your-ip:8000/v1` |
+| `MODEL_NAME` | Model ID loaded by vLLM, e.g. `Qwen/Qwen2.5-1.5B-Instruct` |

app.py ADDED

+import os
+import gradio as gr
+from openai import OpenAI
+from dotenv import load_dotenv
+load_dotenv()
+VLLM_BASE_URL = os.environ.get("VLLM_BASE_URL", "http://localhost:8000/v1")
+MODEL_NAME = os.environ.get("MODEL_NAME", "meta-llama/Llama-3.1-8B-Instruct")
+client = OpenAI(base_url=VLLM_BASE_URL, api_key="not-required")
+def chat(message, history):
+    messages = [{"role": "system", "content": "You are a helpful assistant."}]
+    for item in history:
+        if isinstance(item, dict):
+            messages.append({"role": item["role"], "content": item["content"]})
+        else:
+            messages.append({"role": "user", "content": item[0]})
+            if item[1]:
+                messages.append({"role": "assistant", "content": item[1]})
+    messages.append({"role": "user", "content": message})
+    stream = client.chat.completions.create(
+        model=MODEL_NAME,
+        messages=messages,
+        stream=True,
+    )
+    partial = ""
+    for chunk in stream:
+        delta = chunk.choices[0].delta.content
+        if delta:
+            partial += delta
+            yield partial
+demo = gr.ChatInterface(
+    fn=chat,
+    title="AMD MI300X AI Demo",
+    description="Chat with an LLM running on AMD MI300X GPU via vLLM.",
+    examples=["Explain what AMD MI300X is.", "Write a Python hello world."],
+    cache_examples=False,
+)
+if __name__ == "__main__":
+    demo.launch()
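
The `chat` function above does two things that can be checked without a running vLLM server: it converts Gradio history (either role/content dicts or `(user, assistant)` pairs) into OpenAI-style messages, and it yields a growing string as streamed deltas arrive. The sketch below pulls those two steps into pure functions so they can be tested in isolation. `build_messages` and `accumulate` are hypothetical helper names, not part of the commit.

```python
from typing import Iterable, Iterator, Optional, Sequence, Union

# A history entry is either a {"role": ..., "content": ...} dict
# or a (user_text, assistant_text) pair.
HistoryItem = Union[dict, Sequence[Optional[str]]]

SYSTEM_PROMPT = "You are a helpful assistant."


def build_messages(message: str, history: Iterable[HistoryItem]) -> list:
    """Convert chat history plus the new user message into chat messages."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for item in history:
        if isinstance(item, dict):
            messages.append({"role": item["role"], "content": item["content"]})
        else:
            user_text, assistant_text = item[0], item[1]
            messages.append({"role": "user", "content": user_text})
            # A pair may have no assistant reply yet; skip it in that case.
            if assistant_text:
                messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": message})
    return messages


def accumulate(deltas: Iterable[Optional[str]]) -> Iterator[str]:
    """Yield the running text after each non-empty streamed delta."""
    partial = ""
    for delta in deltas:
        if delta:
            partial += delta
            yield partial
```

With these helpers, `chat` would reduce to building the messages, opening the stream, and passing each `chunk.choices[0].delta.content` through `accumulate`.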

requirements.txt ADDED

@@ -0,0 +1 @@
+openai>=1.0.0
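
`app.py` imports `load_dotenv` from `dotenv` (distributed on PyPI as `python-dotenv`), while this `requirements.txt` lists only `openai`. Whether the Space runtime already provides `python-dotenv` is not shown in the commit. One option is to add it to `requirements.txt`. Another is to treat it as optional, as in the sketch below, which is an assumption about how the import could be guarded rather than what the commit does.

```python
import os

try:
    # python-dotenv is only needed for local development with a .env file.
    from dotenv import load_dotenv
except ImportError:
    def load_dotenv(*args, **kwargs):
        """Fallback no-op used when python-dotenv is not installed."""
        return False

# On a Space, secrets arrive as environment variables, so a missing
# .env file (or a missing python-dotenv package) is harmless here.
load_dotenv()

VLLM_BASE_URL = os.environ.get("VLLM_BASE_URL", "http://localhost:8000/v1")
MODEL_NAME = os.environ.get("MODEL_NAME", "meta-llama/Llama-3.1-8B-Instruct")
```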