Upload 12 files

- README.md +61 -70
- app/main.py +344 -0
- static/style.css +3 -4
README.md
CHANGED

@@ -1,100 +1,91 @@

 ---
-title: NVIDIA NIM
 sdk: docker
 app_port: 7860
 pinned: false
 ---

-# NVIDIA NIM
-
-A publicly usable gateway from NVIDIA NIM to the OpenAI-compatible `/v1/responses` API.
-
-It stores no user NIM API Key locally. Callers carry their own NIM Key in a request header; the gateway only handles protocol conversion, performance optimization, aggregate statistics, and the official model catalog display.
-
-## Key capabilities
-
-- Converts the official NVIDIA `POST /v1/chat/completions` into an OpenAI-style `POST /v1/responses`
-- Supports tool calling / function calling
-- Supports feeding `function_call_output` back in
-- Supports conversation continuation via `previous_response_id`
-- Authenticates and forwards `/v1/responses` and `/v1/responses/{response_id}` upstream with the caller's own NIM Key
-- `/v1/models` returns the synced result of the official NVIDIA `/v1/models` directly, keeping the OpenAI-style structure
-- `/` is a white-themed model health page that shows the models in MODEL_LIST as a 10-minute success-rate matrix
-- `/models` is a standalone white-themed official model list page with filtering by provider
-- Provider cards have a fixed height so they do not grow too long when a provider has many models
-- Uses a shared HTTP connection pool, SQLite WAL, and async threaded database writes to improve forwarding performance under high concurrency
-
-- `X-API-Key: <your NIM Key>`
-
-- `GET /models`
-- `GET /api/catalog`
-
-- `GET /`
-- `GET /
 - `GET /api/dashboard`
 - `GET /api/catalog`
-
-- `POST /v1/responses`
-- `GET /v1/responses/{response_id}`
-- `GET /v1/models`
-
-## Environment variables
-
-- `MAX_UPSTREAM_CONNECTIONS`: maximum shared-pool connections, default `512`
-- `MAX_KEEPALIVE_CONNECTIONS`: maximum shared-pool keep-alive connections, default `128`
-- `DATABASE_PATH`: default `./data.sqlite3`
-
-##
-
-- Verified protocol conversion, official model sync, user-Key auth, `previous_response_id`, tool calls, the health-page data API, the model-page data API, and both standalone page routes via `scripts/local_smoke_test.py`.
-
-- Made real calls to the official NVIDIA model catalog and an actual model via `scripts/live_e2e_validation.py` with the provided test NIM Key.
-- Measured result: `live_gateway_ok`, with `OK` returned via `z-ai/glm5`.
-
-4. Once started, it is publicly usable right away
-
-##

+---
+title: NVIDIA NIM Gateway
 sdk: docker
 app_port: 7860
 pinned: false
 ---

+# NVIDIA NIM Gateway
+
+A publicly usable NVIDIA NIM gateway that is compatible with both the OpenAI Responses API and the Anthropic Claude Messages API.
+
+The gateway stores no user NIM Key; callers carry their own Key in a request header, and the gateway forwards upstream with that Key.
+
+## Endpoints
+
+OpenAI-compatible:
+
+- `POST /v1/responses`
+- `POST /responses`
+- `GET /v1/responses/{response_id}`
+- `GET /responses/{response_id}`
+- `GET /v1/models`
+- `GET /models`
+
+Anthropic Claude-compatible:
+
+- `POST /v1/messages`
+- `POST /messages`
+
+## Claude compatibility
+
+- Supports the Anthropic Messages API fields `messages`, `system`, `max_tokens`, `tools`, and `tool_choice`
+- Supports Claude-style `tool_use` and `tool_result`
+- Supports the Anthropic-defined tools used by Claude Code, including:
+  - `bash_20250124`
+  - `text_editor_20250728`
+- Supports Claude-style SSE streaming events:
+  - `message_start`
+  - `content_block_start`
+  - `content_block_delta`
+  - `content_block_stop`
+  - `message_delta`
+  - `message_stop`
+
+## Pages
+
+- `GET /`: model health page
+- `GET /model_list`: official model list page
+
+## Page data APIs
+
 - `GET /api/dashboard`
 - `GET /api/catalog`

+## Implementation notes
+
+- The model catalog comes from the official NVIDIA endpoint `https://integrate.api.nvidia.com/v1/models`
+- The official model catalog is synced periodically in the background
+- The health page shows a per-model success-rate matrix
+- Provider cards have a fixed height so they do not grow too long
+- A shared HTTP connection pool, SQLite WAL, and async threaded database writes improve performance under high concurrency
+
+## Environment variables
+
+- `NVIDIA_API_BASE`: default `https://integrate.api.nvidia.com/v1`
+- `MODEL_LIST`: the models shown on the health page
+- `APP_TIMEZONE`: default `Asia/Shanghai`
+- `MODEL_SYNC_INTERVAL_MINUTES`: official model catalog sync interval, default `30`
+- `PUBLIC_HISTORY_BUCKETS`: number of 10-minute buckets shown in the public history, default `22`
+- `REQUEST_TIMEOUT_SECONDS`: upstream request timeout, default `90`
+- `MAX_UPSTREAM_CONNECTIONS`: maximum shared-pool connections, default `512`
+- `MAX_KEEPALIVE_CONNECTIONS`: maximum shared-pool keep-alive connections, default `128`
+- `DATABASE_PATH`: default `./data.sqlite3`
+
+## Validation
+
+Two rounds of validation were performed:
+
+1. Mock end-to-end
+   - Verified OpenAI Responses, Claude Messages, `tool_use`/`tool_result`, streaming, user-Key auth, and the page routes via `scripts/local_smoke_test.py`
+
+2. Live end-to-end
+   - Made real calls to the official NVIDIA model catalog and an actual model via `scripts/live_e2e_validation.py` with the provided test NIM Key
+   - Measured result: `live_gateway_ok`, with `OK` returned via `z-ai/glm5`
+
+## Deploying to a Hugging Face Space
+
+1. Create a Hugging Face Space with the SDK set to `Docker`
+2. Push the contents of the `hf_space` directory to the root of the Space repo
+3. Wait for the build to finish
+4. Once started, it is publicly usable right away
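Per the README, callers authenticate with their own NIM Key via the `X-API-Key` request header. A minimal client sketch under that assumption (the base URL and key below are placeholders; the request is only built here, not sent):

```python
def build_messages_request(base_url: str, nim_key: str, model: str, prompt: str):
    # Shape of a Claude-style call to the gateway's /v1/messages endpoint.
    # The gateway forwards upstream with the key supplied in X-API-Key.
    url = f"{base_url}/v1/messages"
    headers = {"X-API-Key": nim_key, "Content-Type": "application/json"}
    payload = {
        "model": model,
        "max_tokens": 128,  # required by the endpoint's validation
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, payload


url, headers, payload = build_messages_request(
    "https://example-space.hf.space",  # placeholder Space URL
    "nvapi-...",                       # placeholder NIM Key
    "z-ai/glm5",
    "Say OK.",
)
print(url)
```

Sending the built request with any HTTP client (and `"stream": true` for SSE) follows the usual Anthropic Messages conventions.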
app/main.py
CHANGED

@@ -619,6 +619,302 @@ def chat_completion_to_response(body: dict[str, Any], upstream_json: dict[str, Any]) -> dict[str, Any]:
}

def anthropic_text_from_blocks(blocks: list[dict[str, Any]] | str | None) -> str:
    if blocks is None:
        return ""
    if isinstance(blocks, str):
        return blocks
    if not isinstance(blocks, list):
        return json_dumps(blocks)
    parts: list[str] = []
    for block in blocks:
        if not isinstance(block, dict):
            parts.append(str(block))
            continue
        if block.get("type") == "text":
            text_value = block.get("text")
            if text_value:
                parts.append(str(text_value))
    return "\n".join(parts).strip()


def anthropic_tool_result_to_text(content: Any) -> str:
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        text_value = anthropic_text_from_blocks(content)
        return text_value if text_value else json_dumps(content)
    if isinstance(content, dict):
        if content.get("type") == "text":
            return str(content.get("text", ""))
        return json_dumps(content)
    if content is None:
        return ""
    return str(content)


def anthropic_system_to_text(system: Any) -> str:
    if isinstance(system, str):
        return system
    if isinstance(system, list):
        return anthropic_text_from_blocks(system)
    return ""

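A standalone illustration of the block flattening above (the helper is re-stated here so the snippet runs on its own; the real module's `json_dumps` is assumed to be a `json.dumps` wrapper):

```python
import json


def text_from_blocks(blocks):
    # Mirror of anthropic_text_from_blocks above: keep only "text" blocks,
    # join them with newlines, and fall back to JSON for unknown shapes.
    if blocks is None:
        return ""
    if isinstance(blocks, str):
        return blocks
    if not isinstance(blocks, list):
        return json.dumps(blocks, ensure_ascii=False)
    parts = []
    for block in blocks:
        if not isinstance(block, dict):
            parts.append(str(block))
            continue
        if block.get("type") == "text" and block.get("text"):
            parts.append(str(block["text"]))
    return "\n".join(parts).strip()


blocks = [
    {"type": "text", "text": "Hello"},
    {"type": "tool_use", "id": "toolu_1", "name": "bash", "input": {}},
    {"type": "text", "text": "world"},
]
print(text_from_blocks(blocks))  # non-text blocks are dropped
```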
def anthropic_defined_tool_to_chat_tool(tool: dict[str, Any]) -> dict[str, Any]:
    tool_type = str(tool.get("type") or "")
    name = tool.get("name") or ("bash" if tool_type.startswith("bash_") else "str_replace_based_edit_tool")
    if tool_type.startswith("bash_"):
        return {
            "type": "function",
            "function": {
                "name": name,
                "description": "Run shell commands in a persistent bash session. Use command for execution, and restart=true to reset the shell session.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "command": {"type": "string", "description": "The bash command to run."},
                        "restart": {"type": "boolean", "description": "Set to true to restart the bash session."},
                    },
                    "additionalProperties": False,
                },
            },
        }
    if tool_type.startswith("text_editor_"):
        return {
            "type": "function",
            "function": {
                "name": name,
                "description": "View and edit text files. Supported commands are view, str_replace, create, and insert.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "command": {
                            "type": "string",
                            "enum": ["view", "str_replace", "create", "insert"],
                            "description": "The text editor command to execute.",
                        },
                        "path": {"type": "string", "description": "Path to the target file or directory."},
                        "view_range": {
                            "type": "array",
                            "items": {"type": "integer"},
                            "minItems": 2,
                            "maxItems": 2,
                            "description": "Optional line range when using view.",
                        },
                        "old_str": {"type": "string", "description": "Text to replace when using str_replace."},
                        "new_str": {"type": "string", "description": "Replacement text when using str_replace."},
                        "file_text": {"type": "string", "description": "Content to write when using create."},
                        "insert_line": {"type": "integer", "description": "Line index after which to insert text."},
                        "insert_text": {"type": "string", "description": "Text to insert when using insert."},
                    },
                    "required": ["command", "path"],
                    "additionalProperties": False,
                },
            },
        }
    raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=f"Unsupported Claude tool type: {tool_type}.")


def anthropic_tools_to_chat_tools(tools: list[dict[str, Any]] | None) -> list[dict[str, Any]]:
    normalized: list[dict[str, Any]] = []
    for tool in tools or []:
        if not isinstance(tool, dict):
            continue
        tool_type = tool.get("type")
        if isinstance(tool_type, str) and (tool_type.startswith("bash_") or tool_type.startswith("text_editor_")):
            normalized.append(anthropic_defined_tool_to_chat_tool(tool))
            continue
        name = tool.get("name")
        if not name:
            continue
        normalized.append(
            {
                "type": "function",
                "function": {
                    "name": name,
                    "description": tool.get("description"),
                    "parameters": tool.get("input_schema") or {"type": "object", "properties": {}},
                },
            }
        )
    return normalized


def anthropic_tool_choice_to_chat(tool_choice: dict[str, Any] | None) -> Any:
    if not tool_choice:
        return None
    choice_type = tool_choice.get("type")
    if choice_type == "auto":
        return "auto"
    if choice_type == "any":
        return "required"
    if choice_type == "none":
        return "none"
    if choice_type == "tool":
        return {"type": "function", "function": {"name": tool_choice.get("name")}}
    return None

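The `tool_choice` translation above is a small fixed mapping; a self-contained sketch of the same table:

```python
def tool_choice_to_chat(tool_choice):
    # Mirrors anthropic_tool_choice_to_chat above: Anthropic's "any" becomes
    # OpenAI's "required"; pinning a specific tool becomes a function reference.
    if not tool_choice:
        return None
    mapping = {"auto": "auto", "any": "required", "none": "none"}
    choice_type = tool_choice.get("type")
    if choice_type in mapping:
        return mapping[choice_type]
    if choice_type == "tool":
        return {"type": "function", "function": {"name": tool_choice.get("name")}}
    return None


print(tool_choice_to_chat({"type": "any"}))  # required
print(tool_choice_to_chat({"type": "tool", "name": "bash"}))
```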
def anthropic_messages_to_chat_messages(body: dict[str, Any]) -> list[dict[str, Any]]:
    messages: list[dict[str, Any]] = []
    system_text = anthropic_system_to_text(body.get("system"))
    if system_text:
        messages.append({"role": "system", "content": system_text})

    for message in body.get("messages") or []:
        role = message.get("role", "user")
        content = message.get("content")
        if isinstance(content, str):
            messages.append({"role": role, "content": content})
            continue
        if not isinstance(content, list):
            continue

        text_parts: list[str] = []
        tool_calls: list[dict[str, Any]] = []
        tool_results: list[dict[str, Any]] = []
        for block in content:
            if not isinstance(block, dict):
                continue
            block_type = block.get("type")
            if block_type == "text":
                text_value = block.get("text")
                if text_value:
                    text_parts.append(str(text_value))
            elif block_type == "tool_use" and role == "assistant":
                tool_input = block.get("input") if isinstance(block.get("input"), dict) else {}
                tool_calls.append(
                    {
                        "id": block.get("id") or f"toolu_{uuid.uuid4().hex[:24]}",
                        "type": "function",
                        "function": {
                            "name": block.get("name"),
                            "arguments": json_dumps(tool_input),
                        },
                    }
                )
            elif block_type == "tool_result" and role == "user":
                result_text = anthropic_tool_result_to_text(block.get("content"))
                if block.get("is_error"):
                    result_text = f"[tool_error]\n{result_text}"
                tool_results.append(
                    {
                        "role": "tool",
                        "tool_call_id": block.get("tool_use_id"),
                        "content": result_text,
                    }
                )
        if role == "assistant":
            if tool_calls:
                messages.append({"role": "assistant", "content": "\n".join(text_parts), "tool_calls": tool_calls})
            elif text_parts:
                messages.append({"role": "assistant", "content": "\n".join(text_parts)})
        elif role == "user":
            messages.extend(tool_results)
            if text_parts:
                messages.append({"role": "user", "content": "\n".join(text_parts)})
    return messages


def build_chat_payload_from_anthropic(body: dict[str, Any]) -> dict[str, Any]:
    payload: dict[str, Any] = {
        "model": body.get("model"),
        "messages": anthropic_messages_to_chat_messages(body),
        "max_tokens": body.get("max_tokens"),
    }
    if body.get("temperature") is not None:
        payload["temperature"] = body.get("temperature")
    if body.get("top_p") is not None:
        payload["top_p"] = body.get("top_p")
    if body.get("stop_sequences"):
        payload["stop"] = body.get("stop_sequences")
    tools = anthropic_tools_to_chat_tools(body.get("tools"))
    if tools:
        payload["tools"] = tools
    tool_choice = anthropic_tool_choice_to_chat(body.get("tool_choice"))
    if tool_choice is not None:
        payload["tool_choice"] = tool_choice
    return payload

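For orientation, the converters above are meant to turn a Claude-style request into flat chat-completions messages. The shapes below are hand-written for illustration (they are not produced by calling the gateway code):

```python
# A Claude Messages request with a tool round-trip.
anthropic_request = {
    "model": "z-ai/glm5",
    "max_tokens": 256,
    "system": "You are terse.",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "List files."}]},
        {"role": "assistant", "content": [
            {"type": "tool_use", "id": "toolu_abc", "name": "bash", "input": {"command": "ls"}},
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "toolu_abc", "content": "a.txt"},
        ]},
    ],
}

# The flat chat-completions form the converter aims for: system text first,
# tool_use becomes an assistant tool_calls entry, tool_result becomes a
# role="tool" message keyed by the same id.
chat_messages = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "List files."},
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "toolu_abc", "type": "function",
         "function": {"name": "bash", "arguments": "{\"command\": \"ls\"}"}},
    ]},
    {"role": "tool", "tool_call_id": "toolu_abc", "content": "a.txt"},
]
print([m["role"] for m in chat_messages])
```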
def anthropic_stop_reason(finish_reason: str | None, has_tool_use: bool) -> str:
    if has_tool_use or finish_reason in {"tool_calls", "tool_call"}:
        return "tool_use"
    if finish_reason == "length":
        return "max_tokens"
    if finish_reason == "stop_sequence":
        return "stop_sequence"
    return "end_turn"


def chat_completion_to_anthropic_message(body: dict[str, Any], upstream_json: dict[str, Any]) -> dict[str, Any]:
    upstream_message, finish_reason = extract_upstream_message(upstream_json)
    assistant_text, tool_calls = extract_text_and_tool_calls(upstream_message)
    content_blocks: list[dict[str, Any]] = []
    if assistant_text:
        content_blocks.append({"type": "text", "text": assistant_text})
    for tool_call in tool_calls:
        arguments = tool_call.get("arguments") or "{}"
        try:
            parsed_input = json.loads(arguments)
        except Exception:
            parsed_input = {"raw": arguments}
        content_blocks.append(
            {
                "type": "tool_use",
                "id": tool_call["id"],
                "name": tool_call.get("name"),
                "input": parsed_input,
            }
        )
    usage = upstream_json.get("usage") or {}
    return {
        "id": f"msg_{uuid.uuid4().hex}",
        "type": "message",
        "role": "assistant",
        "model": body.get("model"),
        "content": content_blocks,
        "stop_reason": anthropic_stop_reason(finish_reason, bool(tool_calls)),
        "stop_sequence": None,
        "usage": {
            "input_tokens": usage.get("prompt_tokens"),
            "output_tokens": usage.get("completion_tokens"),
        },
    }

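The stop-reason mapping above can be exercised standalone (restated here so the snippet is runnable):

```python
def stop_reason(finish_reason, has_tool_use):
    # Mirrors anthropic_stop_reason above: tool use wins over the raw
    # finish reason, "length" maps to Anthropic's "max_tokens".
    if has_tool_use or finish_reason in {"tool_calls", "tool_call"}:
        return "tool_use"
    if finish_reason == "length":
        return "max_tokens"
    if finish_reason == "stop_sequence":
        return "stop_sequence"
    return "end_turn"


print(stop_reason("length", False))  # max_tokens
print(stop_reason("stop", True))     # tool_use
```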
def anthropic_message_start_payload(message: dict[str, Any]) -> dict[str, Any]:
    usage = message.get("usage") or {}
    return {
        "type": "message_start",
        "message": {
            "id": message.get("id"),
            "type": "message",
            "role": "assistant",
            "model": message.get("model"),
            "content": [],
            "stop_reason": None,
            "stop_sequence": None,
            "usage": {
                "input_tokens": usage.get("input_tokens"),
                "output_tokens": 0,
            },
        },
    }


def anthropic_message_delta_payload(message: dict[str, Any]) -> dict[str, Any]:
    return {
        "type": "message_delta",
        "delta": {
            "stop_reason": message.get("stop_reason"),
            "stop_sequence": message.get("stop_sequence"),
        },
        "usage": message.get("usage") or {},
    }


def store_success_record(api_key_hash: str, model_id: str, request_body: dict[str, Any], input_items: list[dict[str, Any]], response_payload: dict[str, Any], latency_ms: float) -> None:
    conn = get_db_connection()
    try:

@@ -947,6 +1243,54 @@ async def get_response(response_id: str, api_key: str = Depends(extract_user_api_key)):
    return await fetch_response_record(response_id, api_key)

@app.post("/v1/messages")
async def create_messages_v1(request: Request, api_key: str = Depends(extract_user_api_key)):
    return await create_messages_impl(request, api_key)


@app.post("/messages")
async def create_messages(request: Request, api_key: str = Depends(extract_user_api_key)):
    return await create_messages_impl(request, api_key)


async def create_messages_impl(request: Request, api_key: str):
    body = await request.json()
    if not isinstance(body, dict):
        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Request body must be a JSON object.")
    if not body.get("model"):
        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Missing model field.")
    if body.get("max_tokens") is None:
        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Missing max_tokens field.")
    if not body.get("messages"):
        raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Missing messages field.")

    chat_payload = build_chat_payload_from_anthropic(body)
    try:
        upstream_json, _latency_ms = await post_nvidia_chat_completion(api_key, chat_payload)
    except HTTPException as exc:
        await run_db(store_failure_metric, body.get("model"), exc.detail)
        raise exc

    anthropic_message = chat_completion_to_anthropic_message(body, upstream_json)

    if body.get("stream"):
        async def event_stream() -> Any:
            yield f"event: message_start\ndata: {json_dumps(anthropic_message_start_payload(anthropic_message))}\n\n"
            for index, block in enumerate(anthropic_message.get("content") or []):
                yield f"event: content_block_start\ndata: {json_dumps({'type': 'content_block_start', 'index': index, 'content_block': block})}\n\n"
                if block.get("type") == "text":
                    yield f"event: content_block_delta\ndata: {json_dumps({'type': 'content_block_delta', 'index': index, 'delta': {'type': 'text_delta', 'text': block.get('text', '')}})}\n\n"
                elif block.get("type") == "tool_use":
                    partial_json = json_dumps(block.get("input") or {})
                    yield f"event: content_block_delta\ndata: {json_dumps({'type': 'content_block_delta', 'index': index, 'delta': {'type': 'input_json_delta', 'partial_json': partial_json}})}\n\n"
                yield f"event: content_block_stop\ndata: {json_dumps({'type': 'content_block_stop', 'index': index})}\n\n"
            yield f"event: message_delta\ndata: {json_dumps(anthropic_message_delta_payload(anthropic_message))}\n\n"
            yield "event: message_stop\ndata: {\"type\":\"message_stop\"}\n\n"
        return StreamingResponse(event_stream(), media_type="text/event-stream")

    return anthropic_message


@app.post("/v1/responses")
async def create_response_v1(request: Request, api_key: str = Depends(extract_user_api_key)):
    return await create_response_impl(request, api_key)
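The streaming branch above replays the completed message as Anthropic-style SSE frames. A minimal standalone sketch of that framing (payloads simplified to type markers only):

```python
import json


def sse_frames(message):
    # Emit the same event sequence create_messages_impl streams:
    # message_start, per-block start/delta/stop, message_delta, message_stop.
    frames = [f"event: message_start\ndata: {json.dumps({'type': 'message_start'})}\n\n"]
    for index, block in enumerate(message.get("content") or []):
        frames.append(f"event: content_block_start\ndata: {json.dumps({'type': 'content_block_start', 'index': index})}\n\n")
        if block.get("type") == "text":
            delta = {"type": "text_delta", "text": block.get("text", "")}
        else:
            delta = {"type": "input_json_delta", "partial_json": json.dumps(block.get("input") or {})}
        frames.append(f"event: content_block_delta\ndata: {json.dumps({'type': 'content_block_delta', 'index': index, 'delta': delta})}\n\n")
        frames.append(f"event: content_block_stop\ndata: {json.dumps({'type': 'content_block_stop', 'index': index})}\n\n")
    frames.append(f"event: message_delta\ndata: {json.dumps({'type': 'message_delta'})}\n\n")
    frames.append('event: message_stop\ndata: {"type":"message_stop"}\n\n')
    return frames


frames = sse_frames({"content": [{"type": "text", "text": "hi"}]})
print(len(frames))  # 6 frames for a single text block
```

Because the upstream call is non-streaming, the whole message is already known when streaming starts; each block arrives as one start/delta/stop triple rather than incremental deltas.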
static/style.css
CHANGED

@@ -468,15 +468,14 @@ body {
 .provider-model-chip {
   display: flex;
   align-items: center;
-  min-height:
-
-  padding: 0 14px;
+  min-height: 48px;
+  padding: 11px 14px;
   border-radius: 16px;
   background: var(--surface-soft);
   border: 1px solid var(--line);
   font-family: var(--font-display);
   font-size: 13px;
-  line-height: 1;
+  line-height: 1.35;
   color: var(--text);
   white-space: nowrap;
   overflow: hidden;