---
title: Qwen API
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - qwen
  - uncensored
  - llama-cpp
  - gguf
suggested_hardware: a10g-small
---

# Qwen3.5-9B Uncensored API Interface

An API interface for [HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive](https://huggingface.co/HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive).

## Features

- 9B parameters with a 262K context window
- Fully uncensored (0/465 refusals)
- Multimodal capable (text, image, video)
- Supports 201 languages
- Q4_K_M quantization via llama.cpp (a serving sketch follows this list)
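
Since the Space serves a Q4_K_M GGUF build via llama.cpp, here is a minimal sketch of what the serving side could look like. This is not the Space's actual `app.py`; the model filename and generation wiring are assumptions, using the llama-cpp-python bindings.

```python
# Hypothetical serving-side sketch -- NOT the Space's actual app.py.
# Assumes llama-cpp-python with a CUDA build and a local Q4_K_M GGUF file.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3.5-9b-uncensored.Q4_K_M.gguf",  # assumed filename
    n_ctx=262144,     # matches the advertised 262K context window
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Your question here"},
    ],
    temperature=0.7,
    top_p=0.8,
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```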

## API Usage

### Python

```python
from gradio_client import Client

# Connect to the hosted Space
client = Client("Ngixdev/qwen-api")

result = client.predict(
    prompt="Your question here",
    system_prompt="You are a helpful assistant",
    temperature=0.7,
    top_p=0.8,
    max_tokens=1024,
    api_name="/api_generate"
)
print(result)
```
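
For longer generations, a non-blocking call may be more convenient. `gradio_client` also exposes `submit`, which returns a job handle; the endpoint and arguments below simply mirror the call above:

```python
from gradio_client import Client

client = Client("Ngixdev/qwen-api")

# Non-blocking variant: submit the job, then collect the result later.
job = client.submit(
    prompt="Your question here",
    system_prompt="You are a helpful assistant",
    temperature=0.7,
    top_p=0.8,
    max_tokens=1024,
    api_name="/api_generate",
)
print(job.result())  # blocks only here, once the output is actually needed
```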

### cURL

```bash
curl -X POST https://ngixdev-qwen-api.hf.space/api/api_generate \
    -H "Content-Type: application/json" \
    -d '{
        "data": [
            "Your question here",
            "You are a helpful assistant",
            0.7,
            0.8,
            1024
        ]
    }'
```
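
The same HTTP endpoint can be called from Python without the Gradio client. This sketch assumes the response follows the standard Gradio REST shape, with outputs returned under a `data` key:

```python
import requests

resp = requests.post(
    "https://ngixdev-qwen-api.hf.space/api/api_generate",
    json={
        "data": [
            "Your question here",
            "You are a helpful assistant",
            0.7,
            0.8,
            1024,
        ]
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["data"][0])  # assumes the first output slot holds the generated text
```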

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| prompt | string | required | User prompt/question |
| system_prompt | string | `""` | System instruction |
| temperature | float | 0.7 | Sampling temperature (0.0-2.0) |
| top_p | float | 0.8 | Nucleus sampling (0.0-1.0) |
| max_tokens | int | 1024 | Maximum tokens to generate |
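
As a quick illustration of the knobs above, lowering `temperature` and tightening `top_p` pushes the model toward more deterministic output. The values here are examples, not Space defaults:

```python
from gradio_client import Client

client = Client("Ngixdev/qwen-api")

# Example: near-deterministic decoding via low temperature and a tight nucleus.
result = client.predict(
    prompt="Your question here",
    system_prompt="You are a helpful assistant",
    temperature=0.1,  # much less random than the 0.7 default
    top_p=0.5,        # tighter than the 0.8 default
    max_tokens=256,
    api_name="/api_generate",
)
print(result)
```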