Commit 9c1527e by cacodex (verified) · 1 parent: 2f1c18e

Upload 13 files

.env.example CHANGED
@@ -1,10 +1,11 @@
  PASSWORD=change-me
  SESSION_SECRET=change-me-too
- GATEWAY_API_KEY=
+ PASS_APIKEY=change-me-api-key
  NVIDIA_API_BASE=https://integrate.api.nvidia.com/v1
  NVIDIA_NIM_API_KEY=
  HEALTHCHECK_INTERVAL_MINUTES=60
  HEALTHCHECK_PROMPT=请只回复 OK。
  PUBLIC_HISTORY_HOURS=48
+ MAX_UPSTREAM_CONNECTIONS=256
+ MAX_KEEPALIVE_CONNECTIONS=64
  DATABASE_PATH=./data.sqlite3
-
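
As a rough illustration of how the two new pool settings might be consumed, the sketch below reads them with the same defaults as `.env.example`. The `Settings` dataclass and `load_settings` helper are hypothetical names; the `app/main.py` diff is not rendered on this page, so the real loading code may look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the gateway's connection settings."""
    pass_apikey: str
    max_upstream_connections: int
    max_keepalive_connections: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables listed in .env.example, falling back to its defaults."""
    max_conn = int(env.get("MAX_UPSTREAM_CONNECTIONS", "256"))
    max_keepalive = int(env.get("MAX_KEEPALIVE_CONNECTIONS", "64"))
    if max_conn < 1 or max_keepalive < 0:
        raise ValueError("connection limits must not be negative or zero")
    # Assumption of this sketch: idle keep-alive slots are capped by the total limit.
    max_keepalive = min(max_keepalive, max_conn)
    return Settings(env.get("PASS_APIKEY", ""), max_conn, max_keepalive)
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise in a test without touching the process environment.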
 
README.md CHANGED
@@ -1,127 +1,136 @@
- ---
- title: NVIDIA NIM Response Gateway
- sdk: docker
- app_port: 7860
- pinned: false
- ---
-
- # NVIDIA NIM Response Gateway
-
- This project is a FastAPI-based compatibility layer that converts NVIDIA's official endpoint:
-
- `https://integrate.api.nvidia.com/v1/chat/completions`
-
- into an OpenAI-style `/v1/responses` endpoint. It also ships with a public health dashboard and a Chinese-language admin console.
-
- ## Supported features
-
- - `POST /v1/responses`
- - `GET /v1/models`
- - `GET /v1/responses/{response_id}`
- - tool calling / function calling conversion
- - Conversion of `function_call_output` items fed back to the model
- - Conversation continuation via `previous_response_id`
- - Model management
- - NVIDIA NIM key management
- - Hourly health checks shown on a public status page
- - Docker-based deployment to a Hugging Face Space
-
- ## Preset models
-
- On first startup, the following models are registered automatically:
-
- - `z-ai/glm5`
- - `minimaxai/minimax-m2.5`
- - `moonshotai/kimi-k2.5`
- - `deepseek-ai/deepseek-v3.2`
- - `google/gemma-4-31b-it`
- - `qwen/qwen3.5-397b-a17b`
-
- You can also add, remove, and test models from the admin console.
-
- ## Pages and endpoints
-
- Public pages:
-
- - `GET /`: model health dashboard
- - `GET /api/health/public`: public health data
-
- Compatibility endpoints:
-
- - `POST /v1/responses`
- - `GET /v1/models`
- - `GET /v1/responses/{response_id}`
-
- Admin pages:
-
- - `GET /admin`
- - `POST /admin/api/login`
- - `GET /admin/api/overview`
- - `GET/POST/DELETE /admin/api/models...`
- - `GET/POST/DELETE /admin/api/keys...`
- - `GET /admin/api/healthchecks`
- - `POST /admin/api/healthchecks/run`
- - `GET/PUT /admin/api/settings`
-
- ## Environment variables
-
- - `PASSWORD`: admin login password, required
- - `SESSION_SECRET`: signing key for admin sessions, optional; falls back to `PASSWORD` by default
- - `GATEWAY_API_KEY`: set this to add an extra layer of Bearer protection to `/v1/models` and `/v1/responses`
- - `NVIDIA_API_BASE`: defaults to `https://integrate.api.nvidia.com/v1`
- - `NVIDIA_NIM_API_KEY`: optional; imported automatically as the default key on first startup
- - `HEALTHCHECK_INTERVAL_MINUTES`: defaults to `60`
- - `HEALTHCHECK_PROMPT`: defaults to `请只回复 OK。` ("Reply with OK only.")
- - `PUBLIC_HISTORY_HOURS`: defaults to `48`
- - `DATABASE_PATH`: defaults to `./data.sqlite3`
-
- See `.env.example` for a sample configuration.
-
- ## Running locally
-
- Install the runtime dependencies:
-
- ```bash
- pip install -r requirements.txt
- ```
-
- For local integration testing and the smoke test:
-
- ```bash
- pip install -r requirements-dev.txt
- python scripts/local_smoke_test.py
- ```
-
- Start the service:
-
- ```bash
- uvicorn app.main:app --host 0.0.0.0 --port 7860
- ```
-
- ## Deploying to a Hugging Face Space
-
- This repository already includes the deployment files for a Docker Space.
-
- 1. Create a new Hugging Face Space and choose `Docker` as the SDK
- 2. Upload the contents of the `hf_space` directory as the Space root
- 3. In Space Secrets, configure at least `PASSWORD` and one NVIDIA NIM key
- 4. Open `/admin`, confirm the key works, and run one health check
-
- ## Local verification
-
- I have verified the following paths with the local smoke test:
-
- - The Chinese home page and Chinese admin page are returned correctly
- - HTML response headers include `charset=utf-8`
- - `/v1/responses` text reply conversion works
- - tool call / function call conversion works
- - `function_call_output` is fed back into the upstream message format correctly
- - `previous_response_id` context stitching works
- - Admin login, manual health checks, and public health page sync work
-
- ## References
-
- - OpenAI Responses API: https://platform.openai.com/docs/guides/responses-vs-chat-completions
- - OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
- - NVIDIA Build: https://build.nvidia.com/
+ ---
+ title: NVIDIA NIM Response Gateway
+ sdk: docker
+ app_port: 7860
+ pinned: false
+ ---
+
+ # NVIDIA NIM Response Gateway
+
+ This project is a FastAPI-based compatibility layer that converts NVIDIA's official endpoint:
+
+ `https://integrate.api.nvidia.com/v1/chat/completions`
+
+ into an OpenAI-style `/v1/responses` endpoint. It also ships with a public health dashboard and a Chinese-language admin console.
+
+ ## Supported features
+
+ - `POST /v1/responses`
+ - `GET /v1/models`
+ - `GET /v1/responses/{response_id}`
+ - tool calling / function calling conversion
+ - Conversion of `function_call_output` items fed back to the model
+ - Conversation continuation via `previous_response_id`
+ - `PASS_APIKEY` authentication protecting `/v1/responses`
+ - Round-robin distribution across multiple NVIDIA NIM keys
+ - Shared HTTP connection pool for high-concurrency forwarding
+ - Model management
+ - NVIDIA NIM key management
+ - One-click testing of all models from the admin console
+ - Hourly health checks shown on a public status page
+ - Docker-based deployment to a Hugging Face Space
+
+ ## Preset models
+
+ On first startup, the following models are registered automatically:
+
+ - `z-ai/glm5`
+ - `minimaxai/minimax-m2.5`
+ - `moonshotai/kimi-k2.5`
+ - `deepseek-ai/deepseek-v3.2`
+ - `google/gemma-4-31b-it`
+ - `qwen/qwen3.5-397b-a17b`
+
+ You can also add, remove, and test models from the admin console.
+
+ ## Pages and endpoints
+
+ Public pages:
+
+ - `GET /`: model health dashboard
+ - `GET /api/health/public`: public health data
+
+ Compatibility endpoints:
+
+ - `POST /v1/responses`
+ - `GET /v1/models`
+ - `GET /v1/responses/{response_id}`
+
+ Admin pages:
+
+ - `GET /admin`
+ - `POST /admin/api/login`
+ - `GET /admin/api/overview`
+ - `GET/POST/DELETE /admin/api/models...`
+ - `GET/POST/DELETE /admin/api/keys...`
+ - `GET /admin/api/healthchecks`
+ - `POST /admin/api/healthchecks/run`
+ - `GET/PUT /admin/api/settings`
+
+ ## Environment variables
+
+ - `PASSWORD`: admin login password, required
+ - `SESSION_SECRET`: signing key for admin sessions, optional; falls back to `PASSWORD` by default
+ - `PASS_APIKEY`: the authentication key external callers use for `/v1/responses`; accepted via `Authorization: Bearer ...` or `X-API-Key`
+ - `NVIDIA_API_BASE`: defaults to `https://integrate.api.nvidia.com/v1`
+ - `NVIDIA_NIM_API_KEY`: optional; imported automatically as the default key on first startup
+ - `HEALTHCHECK_INTERVAL_MINUTES`: defaults to `60`
+ - `HEALTHCHECK_PROMPT`: defaults to `请只回复 OK。` ("Reply with OK only.")
+ - `PUBLIC_HISTORY_HOURS`: defaults to `48`
+ - `MAX_UPSTREAM_CONNECTIONS`: defaults to `256`
+ - `MAX_KEEPALIVE_CONNECTIONS`: defaults to `64`
+ - `DATABASE_PATH`: defaults to `./data.sqlite3`
+
+ See `.env.example` for a sample configuration.
+
+ ## Running locally
+
+ Install the runtime dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ For local integration testing and the smoke test:
+
+ ```bash
+ pip install -r requirements-dev.txt
+ python scripts/local_smoke_test.py
+ ```
+
+ Start the service:
+
+ ```bash
+ uvicorn app.main:app --host 0.0.0.0 --port 7860
+ ```
+
+ ## Deploying to a Hugging Face Space
+
+ This repository already includes the deployment files for a Docker Space.
+
+ 1. Create a new Hugging Face Space and choose `Docker` as the SDK
+ 2. Upload the contents of the `hf_space` directory as the Space root
+ 3. In Space Secrets, configure at least `PASSWORD`, `PASS_APIKEY`, and one NVIDIA NIM key
+ 4. Open `/admin`, confirm the key works, and run one health check
+
+ ## Local verification
+
+ I have verified the following paths with the local smoke test:
+
+ - The Chinese home page and Chinese admin page are returned correctly
+ - HTML response headers include `charset=utf-8`
+ - `/v1/responses` authentication works
+ - `/v1/responses` text reply conversion works
+ - tool call / function call conversion works
+ - `function_call_output` is fed back into the upstream message format correctly
+ - `previous_response_id` context stitching works
+ - Round-robin distribution across multiple NIM keys works
+ - Concurrent request forwarding works
+ - Admin login, manual health checks, and public health page sync work
+
+ ## References
+
+ - OpenAI Responses API: https://platform.openai.com/docs/guides/responses-vs-chat-completions
+ - OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
+ - NVIDIA Build: https://build.nvidia.com/
  - NVIDIA NIM API documentation: https://docs.api.nvidia.com/
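
The README says `PASS_APIKEY` is accepted through either `Authorization: Bearer ...` or `X-API-Key`. A minimal sketch of such a check follows. The function names are hypothetical, and rejecting every request when no key is configured is an assumption of this sketch; the actual gateway code is not visible in this diff.

```python
import hmac
from typing import Mapping, Optional


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key from `Authorization: Bearer ...` or `X-API-Key`, if present."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    api_key = lowered.get("x-api-key", "").strip()
    return api_key or None


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Compare the presented key with PASS_APIKEY in constant time."""
    presented = extract_api_key(headers)
    # Assumption: an unset PASS_APIKEY rejects everything rather than opening the endpoint.
    if not expected_key or presented is None:
        return False
    return hmac.compare_digest(presented.encode(), expected_key.encode())
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how many leading characters of a guessed key were correct.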
app/__pycache__/main.cpython-313.pyc CHANGED
Binary files a/app/__pycache__/main.cpython-313.pyc and b/app/__pycache__/main.cpython-313.pyc differ
 
app/main.py CHANGED
The diff for this file is too large to render. See raw diff
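
Since the `app/main.py` diff is not rendered, the round-robin key distribution mentioned in the README cannot be inspected here. One plausible shape for it, assuming keys are simply handed out in order, is sketched below; `KeyRotator` is a hypothetical name and not necessarily how the project does it.

```python
import itertools
import threading
from typing import Iterator, Sequence


class KeyRotator:
    """Hand out API keys in round-robin order; safe to call from several threads."""

    def __init__(self, keys: Sequence[str]) -> None:
        if not keys:
            raise ValueError("at least one NVIDIA NIM key is required")
        # Copy the keys so later changes to the caller's list do not affect rotation.
        self._cycle: Iterator[str] = itertools.cycle(list(keys))
        self._lock = threading.Lock()

    def next_key(self) -> str:
        # The lock keeps concurrent callers from advancing the iterator at the same time.
        with self._lock:
            return next(self._cycle)


rotator = KeyRotator(["key-a", "key-b"])
print([rotator.next_key() for _ in range(3)])  # ['key-a', 'key-b', 'key-a']
```

A rotator like this spreads requests evenly but does not skip keys that are failing; combining it with the health check results would be a separate design decision.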
 
static/admin.html CHANGED
@@ -68,8 +68,11 @@
  <span class="section-tag">目录配置</span>
  <h2>模型管理</h2>
  </div>
- <p class="status-text">添加、删除、连通性测试,以及使用与巡检统计。</p>
+ <div class="inline-actions panel-actions">
+   <button class="secondary-btn" id="test-all-models" type="button">测试全部模型</button>
+ </div>
  </div>
+ <p class="status-text">添加、删除、连通性测试,以及使用与巡检统计。</p>
  <div class="section-grid compact-grid">
  <div class="metric-card">
  <h3>模型总数</h3>
@@ -106,8 +109,11 @@
  <span class="section-tag">凭据配置</span>
  <h2>NVIDIA NIM Key 管理</h2>
  </div>
- <p class="status-text">统一维护可用 Key,并统计请求和巡检使用情况。</p>
+ <div class="inline-actions panel-actions">
+   <button class="secondary-btn" id="test-all-keys" type="button">测试全部 Key</button>
+ </div>
  </div>
+ <p class="status-text">统一维护可用 Key,并统计请求和巡检使用情况。</p>
  <div class="form-grid compact-grid">
  <input id="key-label" placeholder="Key 名称,例如 主生产 Key" />
  <input id="key-value" placeholder="输入 NVIDIA NIM Key" />
@@ -134,7 +140,7 @@
  <span class="section-tag">健康巡检</span>
  <h2>巡检记录</h2>
  </div>
- <button id="run-healthcheck" type="button">立即执行巡检</button>
+ <button id="run-healthcheck" type="button">立即巡检全部模型</button>
  </div>
  <p class="status-text">手动触发的巡检结果会立刻写入数据库,并同步更新到公开健康页。</p>
  <div class="section-grid" id="health-grid"></div>
@@ -148,15 +154,15 @@
  </div>
  <p class="status-text">设置巡检开关、时间间隔、公开页保留时长和巡检提示词。</p>
  </div>
- <div class="form-grid">
- <label class="checkbox-row">
+ <div class="form-grid settings-grid">
+ <label class="checkbox-row field-span-full">
  <input id="healthcheck-enabled" type="checkbox" />
  <span>启用定时健康巡检</span>
  </label>
  <input id="healthcheck-interval" type="number" min="5" step="5" placeholder="巡检间隔,单位分钟" />
  <input id="public-history-hours" type="number" min="1" step="1" placeholder="公开页保留时长,单位小时" />
  <textarea id="healthcheck-prompt" placeholder="用于健康巡检的提示词"></textarea>
- <div class="inline-actions">
+ <div class="inline-actions settings-actions field-span-full">
  <button id="settings-save" type="button">保存设置</button>
  <button class="secondary-btn" id="refresh-now" type="button">重新加载面板</button>
  </div>
static/admin.js CHANGED
@@ -12,6 +12,9 @@ const healthGrid = document.getElementById("health-grid");
  const modelCount = document.getElementById("model-count");
  const modelHealthy = document.getElementById("model-healthy");
  const settingsStatus = document.getElementById("settings-status");
+ const testAllModelsBtn = document.getElementById("test-all-models");
+ const testAllKeysBtn = document.getElementById("test-all-keys");
+ const runHealthcheckBtn = document.getElementById("run-healthcheck");
  
  const state = {
    token: sessionStorage.getItem("nim_token"),
@@ -51,7 +54,10 @@ sidebarButtons.forEach((button) => button.addEventListener("click", () => showPa
  async function apiRequest(endpoint, opts = {}) {
    const headers = { "Content-Type": "application/json", Accept: "application/json" };
    if (state.token) headers.Authorization = `Bearer ${state.token}`;
-   const response = await fetch(`/admin/api/${endpoint}`, { ...opts, headers: { ...headers, ...(opts.headers || {}) } });
+   const response = await fetch(`/admin/api/${endpoint}`, {
+     ...opts,
+     headers: { ...headers, ...(opts.headers || {}) },
+   });
    if (!response.ok) {
      const payload = await response.json().catch(() => ({}));
      throw new Error(payload.message || payload.detail || payload.error?.message || "请求失败");
@@ -192,6 +198,24 @@ async function loadAll() {
    await Promise.all([renderOverview(), renderModels(), renderKeys(), renderHealth(), renderSettings()]);
  }
  
+ async function runAllModelChecks() {
+   const payload = await apiRequest("healthchecks/run", { method: "POST", body: JSON.stringify({}) });
+   const items = payload.items || [];
+   const success = items.filter((item) => item.status === "healthy").length;
+   alert(`已完成全部模型巡检,共 ${items.length} 个模型,其中 ${success} 个正常。`);
+   showPanel("health");
+   await loadAll();
+ }
+
+ async function runAllKeyChecks() {
+   const payload = await apiRequest("keys/test-all", { method: "POST", body: JSON.stringify({}) });
+   const items = payload.items || [];
+   const success = items.filter((item) => item.status === "healthy").length;
+   alert(`已完成全部 Key 测试,共 ${items.length} 个 Key,其中 ${success} 个正常。`);
+   showPanel("keys");
+   await loadAll();
+ }
+
  async function testModel(modelId) {
    const payload = await apiRequest(`models/${encodeURIComponent(modelId)}/test`, { method: "POST", body: JSON.stringify({}) });
    alert(`${payload.display_name || payload.model} 当前状态:${STATUS_LABELS[payload.status] || payload.status}`);
@@ -259,10 +283,9 @@ document.getElementById("key-add")?.addEventListener("click", async () => {
    await renderKeys();
  });
  
- document.getElementById("run-healthcheck")?.addEventListener("click", async () => {
-   await apiRequest("healthchecks/run", { method: "POST", body: JSON.stringify({}) });
-   await loadAll();
- });
+ testAllModelsBtn?.addEventListener("click", runAllModelChecks);
+ testAllKeysBtn?.addEventListener("click", runAllKeyChecks);
+ runHealthcheckBtn?.addEventListener("click", runAllModelChecks);
  
  document.getElementById("settings-save")?.addEventListener("click", async () => {
    try {
@@ -318,4 +341,3 @@ window.addEventListener("DOMContentLoaded", async () => {
      loginOverlay.classList.remove("hidden");
    }
  });
-
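
The new admin handlers expect `healthchecks/run` and `keys/test-all` to return an `items` list whose entries carry a `status` field, and they count the entries equal to `"healthy"`. The sketch below shows how a server-side bulk check of that shape could be written with bounded concurrency. The `check_all` and `summarize` helpers, the `probe` callback, the `name` field, and the `"unhealthy"` status value are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def check_all(
    names: List[str],
    probe: Callable[[str], Awaitable[bool]],
    max_concurrency: int = 8,
) -> Dict[str, list]:
    """Probe every name concurrently, never running more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def check_one(name: str) -> dict:
        async with semaphore:
            try:
                healthy = await probe(name)
            except Exception:
                healthy = False  # a failed probe counts as unhealthy
        return {"name": name, "status": "healthy" if healthy else "unhealthy"}

    # gather() returns results in the same order as the input names.
    items = await asyncio.gather(*(check_one(name) for name in names))
    return {"items": list(items)}


def summarize(payload: dict) -> str:
    """Mirror the admin panel's count of items whose status is "healthy"."""
    items = payload.get("items") or []
    healthy = sum(1 for item in items if item.get("status") == "healthy")
    return f"{healthy}/{len(items)} healthy"
```

Catching exceptions per item means one failing model or key does not abort the whole run, which matches the summary-style alert the front end displays.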
 
static/style.css CHANGED
@@ -720,3 +720,56 @@ button {
      grid-template-columns: 1fr;
    }
  }
+ .panel-actions {
+   align-items: center;
+ }
+
+ .settings-grid {
+   align-items: start;
+ }
+
+ .field-span-full {
+   grid-column: 1 / -1;
+ }
+
+ .settings-grid .checkbox-row {
+   min-height: 58px;
+   padding: 14px 16px;
+   border-radius: 16px;
+   border: 1px solid rgba(255, 255, 255, 0.1);
+   background: rgba(255, 255, 255, 0.045);
+ }
+
+ .settings-grid .checkbox-row input {
+   width: 18px;
+   min-width: 18px;
+   height: 18px;
+   padding: 0;
+   margin: 0;
+   border-radius: 6px;
+   background: rgba(255, 255, 255, 0.02);
+   box-shadow: none;
+ }
+
+ .settings-actions {
+   justify-content: flex-start;
+   align-items: center;
+ }
+
+ .form-grid > button {
+   min-height: 54px;
+   justify-self: start;
+ }
+
+ @media (max-width: 720px) {
+   .settings-actions {
+     flex-direction: column;
+     align-items: stretch;
+   }
+
+   .settings-actions button,
+   .panel-actions button,
+   .form-grid > button {
+     width: 100%;
+   }
+ }