Fix: Restore YAML frontmatter for Gradio SDK configuration
README.md
CHANGED
|
@@ -1,7 +1,17 @@
|
|
| 1 |
---
|
| 2 |
tags:
|
| 3 |
- ml-intern
|
| 4 |
---
|
| 5 |
# 🏥 Insurance App User Behavior Analysis Model Training Platform v3.0
|
| 6 |
|
| 7 |
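An **industrial-grade platform for analyzing insurance user behavior**, built on recent research papers. It supports **7 functional modules**: demo mode, CSV upload, product recommendation, anomaly detection, model management, survival analysis, and help documentation.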
|
|
@@ -39,20 +49,6 @@ tags:
|
|
| 39 |
- **Output**: purchase probability + attention-weight visualization
|
| 40 |
- **Insight**: the user-interest representation changes dynamically with the candidate product
|
| 41 |
|
| 42 |
-
```
|
| 43 |
-
User history: [event_1, product_1], [event_2, product_2], ...
|
| 44 |
-
↓ Embedding
|
| 45 |
-
Event embedding (D/2) + product embedding (D/2) → behavior embedding (D)
|
| 46 |
-
↓
|
| 47 |
-
Candidate product embedding ───┐
|
| 48 |
-
↓
|
| 49 |
-
[c, b, c-b, c*b] → Attention MLP → weights α
|
| 50 |
-
↓
|
| 51 |
-
Weighted sum → interest vector (D)
|
| 52 |
-
↓
|
| 53 |
-
[user, interest, candidate, interaction, statistical features] → MLP → purchase probability
|
| 54 |
-
```
|
| 55 |
-
|
| 56 |
### 3. Anomalous Behavior Detection (TabBERT)
|
| 57 |
**Method**: TabularBERT + Focal Loss
|
| 58 |
- **Core**: hierarchical Transformer + imbalanced-data handling
|
|
@@ -125,14 +121,6 @@ pip install -r requirements.txt
|
|
| 125 |
python app.py
|
| 126 |
```
|
| 127 |
|
| 128 |
-
### Running with Docker
|
| 129 |
-
|
| 130 |
-
```bash
|
| 131 |
-
docker run -p 7860:7860 --platform=linux/amd64 \
|
| 132 |
-
-e HF_TOKEN="your_token" \
|
| 133 |
-
registry.hf.space/stephanwu-insurance-app-behavior:latest
|
| 134 |
-
```
|
| 135 |
-
|
| 136 |
---
|
| 137 |
|
| 138 |
## ⚠️ Handling Imbalanced Data
|
|
@@ -161,73 +149,6 @@ Stephanwu/insurance-app-behavior/
|
|
| 161 |
|
| 162 |
---
|
| 163 |
|
| 164 |
-
## 🧠 Model Architecture in Detail
|
| 165 |
-
|
| 166 |
-
### DIN (Deep Interest Network)
|
| 167 |
-
```python
|
| 168 |
-
# LocalActivationUnit core
|
| 169 |
-
candidate_emb = embed(candidate_product) # (B, D)
|
| 170 |
-
behavior_emb = embed(events) + embed(products) # (B, L, D)
|
| 171 |
-
|
| 172 |
-
# 4-way interaction features
|
| 173 |
-
interaction = concat([
|
| 174 |
-
candidate_emb, # candidate product
|
| 175 |
-
behavior_emb, # historical behavior
|
| 176 |
-
candidate - behavior, # difference
|
| 177 |
-
candidate * behavior, # element-wise product
|
| 178 |
-
]) # (B, L, 4D)
|
| 179 |
-
|
| 180 |
-
# Attention weights
|
| 181 |
-
attention_weights = MLP(interaction) # (B, L)
|
| 182 |
-
attention_weights = softmax(attention_weights)
|
| 183 |
-
|
| 184 |
-
# Weighted interest
|
| 185 |
-
interest = sum(behavior_emb * attention_weights) # (B, D)
|
| 186 |
-
|
| 187 |
-
# Prediction
|
| 188 |
-
logits = MLP(concat([user, interest, candidate, interaction, stats]))
|
| 189 |
-
```
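The removed DIN snippet is pseudocode. As a rough, dependency-free illustration of the same local activation idea, the sketch below builds the four-way features for each history item, scores them, softmax-normalizes the scores as the snippet does, and pools the behavior embeddings. `local_activation`, `toy_score`, and the toy vectors are hypothetical; `toy_score` only stands in for the attention MLP, which is not shown in this diff.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_activation(candidate, behaviors, score_fn):
    """Attention-pool behavior embeddings with respect to one candidate."""
    scores = []
    for b in behaviors:
        # Four-way interaction: [c, b, c - b, c * b], length 4D
        features = (
            candidate
            + b
            + [c - x for c, x in zip(candidate, b)]
            + [c * x for c, x in zip(candidate, b)]
        )
        scores.append(score_fn(features))
    weights = softmax(scores)
    dim = len(candidate)
    # Weighted sum of behavior embeddings gives the interest vector (length D)
    interest = [
        sum(w * b[d] for w, b in zip(weights, behaviors)) for d in range(dim)
    ]
    return weights, interest


def toy_score(features):
    # Hypothetical stand-in for the attention MLP:
    # sums the element-wise product part of the features.
    dim = len(features) // 4
    return sum(features[3 * dim:])


candidate = [1.0, 0.0]
behaviors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, interest = local_activation(candidate, behaviors, toy_score)
```

With this toy scorer, history items that align with the candidate receive larger weights, which is the behavior the "Insight" bullet describes.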
|
| 190 |
-
|
| 191 |
-
### TabBERT (simplified)
|
| 192 |
-
```python
|
| 193 |
-
# Hierarchical Transformer
|
| 194 |
-
input_features = [claim_amount, claim_type, days_since_policy, ...]
|
| 195 |
-
↓
|
| 196 |
-
Linear Projection: d_model (128)
|
| 197 |
-
↓
|
| 198 |
-
┌────────────────────────┐
|
| 199 |
-
│ Transformer × 4 │ # simulates the field level + sequence level
|
| 200 |
-
│ LayerNorm + Dropout │
|
| 201 |
-
└────────────────────────┘
|
| 202 |
-
↓
|
| 203 |
-
Global Average Pooling
|
| 204 |
-
↓
|
| 205 |
-
MLP: 128 → 256 → 64 → 1
|
| 206 |
-
↓
|
| 207 |
-
Focal Loss (handles the 1:4 class imbalance)
|
| 208 |
-
```
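The TabBERT diagram ends in a focal loss but does not show it. A minimal sketch of the binary focal loss for a single prediction follows, assuming the commonly used defaults `gamma=2.0` and `alpha=0.25`; these values are assumptions and are not stated in the README.

```python
import math


def focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction; prob is P(label == 1)."""
    # p_t is the probability assigned to the true class
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0)
    p_t = min(max(p_t, 1e-12), 1.0)
    # (1 - p_t) ** gamma down-weights well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident, correct prediction
hard = focal_loss(0.30, 1)  # poorly classified positive
```

Setting `gamma=0` reduces the expression to alpha-weighted cross-entropy, which makes the effect of the modulating factor easy to compare.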
|
| 209 |
-
|
| 210 |
-
### DeepSurv (Neural Cox-PH)
|
| 211 |
-
```python
|
| 212 |
-
# Cox partial likelihood loss
|
| 213 |
-
def cox_ph_loss(pred, time, event):
|
| 214 |
-
# Sort by time descending
|
| 215 |
-
pred_sorted = pred[argsort(time, descending=True)]
|
| 216 |
-
event_sorted = event[argsort(time, descending=True)]
|
| 217 |
-
|
| 218 |
-
# logcumsumexp for numerical stability
|
| 219 |
-
log_cumsum_h = logcumsumexp(pred_sorted)
|
| 220 |
-
|
| 221 |
-
# Only event samples contribute
|
| 222 |
-
loss = -sum(event * (pred - log_cumsum_h)) / sum(event)
|
| 223 |
-
return loss
|
| 224 |
-
|
| 225 |
-
# Survival probability
|
| 226 |
-
S(t | x) = exp(-H_0(t) * exp(pred(x)))
|
| 227 |
-
```
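For reference, a plain-Python sketch of the Cox partial-likelihood loss described in the removed snippet. Unlike the snippet's final line, it uses the sorted predictions and event flags consistently when forming the loss. It assumes no tied event times, and the body is illustrative rather than the project's actual code.

```python
import math


def cox_ph_loss(pred, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events."""
    # Sort by time descending so each prefix is the risk set of its last item
    order = sorted(range(len(time)), key=lambda i: time[i], reverse=True)
    pred_sorted = [pred[i] for i in order]
    event_sorted = [event[i] for i in order]

    # Running log-sum-exp of the risk scores, for numerical stability
    log_cumsum = []
    running = -math.inf
    for p in pred_sorted:
        peak = max(running, p)
        running = peak + math.log(math.exp(running - peak) + math.exp(p - peak))
        log_cumsum.append(running)

    n_events = sum(event_sorted)
    if n_events == 0:
        return 0.0  # no observed events, nothing contributes

    # Only samples with an observed event contribute
    total = sum(
        e * (p - l) for e, p, l in zip(event_sorted, pred_sorted, log_cumsum)
    )
    return -total / n_events


loss = cox_ph_loss([0.0, 0.0, 0.0], [3.0, 2.0, 1.0], [1, 1, 1])
```

With equal risk scores and three observed events, the risk sets have sizes 1, 2, and 3, so the loss is `(log 1 + log 2 + log 3) / 3`.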
|
| 228 |
-
|
| 229 |
-
---
|
| 230 |
-
|
| 231 |
## 📚 References
|
| 232 |
|
| 233 |
| Paper | Application | arXiv | Venue |
|
|
|
|
| 1 |
---
|
| 2 |
+
sdk: gradio
|
| 3 |
+
title: Insurance App Behavior
|
| 4 |
+
emoji: 🏥
|
| 5 |
+
colorFrom: purple
|
| 6 |
+
colorTo: gray
|
| 7 |
+
sdk_version: 6.14.0
|
| 8 |
+
python_version: '3.13'
|
| 9 |
+
app_file: app.py
|
| 10 |
+
pinned: false
|
| 11 |
tags:
|
| 12 |
- ml-intern
|
| 13 |
---
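One way to sanity-check a restored frontmatter block like the one above is a small script that reads the leading `---` fenced section and confirms that the keys added in this commit are present. The helper `read_frontmatter` is hypothetical and handles only flat `key: value` lines, not full YAML.

```python
def read_frontmatter(text):
    """Return the flat key/value pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # file does not start with a frontmatter fence
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing fence reached
        # Skip list items and indented lines; keep top-level scalars only
        if ":" in line and not line.startswith((" ", "-")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip().strip("'\"")
    return {}  # closing fence never found


readme = """---
sdk: gradio
sdk_version: 6.14.0
python_version: '3.13'
app_file: app.py
pinned: false
tags:
- ml-intern
---
# README body
"""

fields = read_frontmatter(readme)
missing = [key for key in ("sdk", "app_file") if key not in fields]
```

An empty `missing` list means both keys were found; a README whose first line is not `---` yields an empty dictionary instead.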
|
| 14 |
+
|
| 15 |
# 🏥 Insurance App User Behavior Analysis Model Training Platform v3.0
|
| 16 |
|
| 17 |
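An **industrial-grade platform for analyzing insurance user behavior**, built on recent research papers. It supports **7 functional modules**: demo mode, CSV upload, product recommendation, anomaly detection, model management, survival analysis, and help documentation.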
|
|
|
|
| 49 |
- **Output**: purchase probability + attention-weight visualization
|
| 50 |
- **Insight**: the user-interest representation changes dynamically with the candidate product
|
| 51 |
|
| 52 |
### 3. Anomalous Behavior Detection (TabBERT)
|
| 53 |
**Method**: TabularBERT + Focal Loss
|
| 54 |
- **Core**: hierarchical Transformer + imbalanced-data handling
|
|
|
|
| 121 |
python app.py
|
| 122 |
```
|
| 123 |
|
| 124 |
---
|
| 125 |
|
| 126 |
## ⚠️ Handling Imbalanced Data
|
|
|
|
| 149 |
|
| 150 |
---
|
| 151 |
|
| 152 |
## 📚 References
|
| 153 |
|
| 154 |
| Paper | Application | arXiv | Venue |
|