Upload folder using huggingface_hub
Browse files

- README.md +70 -69
- config.json +28 -21
- model.pt +2 -2
- model.py +35 -286
- norm_stats.npz +3 -0
README.md
CHANGED
---
tags:
- finance
- order-book
- institutional-trading
- level-2
- A-share
- LOB
- pytorch
license: mit
---

# LOBPatternNet V3 - 主力下单模式识别模型

## 模型简介 / Overview

基于A股Level-2十档委托单(LOB)数据,利用深度学习自动识别主力(机构)的下单模式。

Detects institutional trading patterns from Level-2 order book data (10-level bid/ask).
## 架构 / Architecture

```
Input: (batch, 100, 40) - 100 consecutive LOB snapshots
Each snapshot: [ask_p₁, ask_s₁, bid_p₁, bid_s₁, ..., ask_p₁₀, ask_s₁₀, bid_p₁₀, bid_s₁₀]
  ↓ BilinearNorm (adaptive normalization)
  ↓ Spatial CNN (cross-level patterns)
  ↓ Temporal CNN (multi-scale time features)
  ↓ Transformer Attention (temporal dependencies)
  ↓ 3-class Classification
```

Parameters: 85,803
## 输出 / Output Classes

| ID | 中文 | English |
|----|------|---------|
| 0 | 主力买入 | Institutional Buying |
| 1 | 中性/散户 | Neutral / Retail |
| 2 | 主力卖出 | Institutional Selling |

## 性能 / Performance

| Metric | Value |
|--------|-------|
| Test Accuracy | 0.1579 |
| Test F1 (Macro) | 0.1634 |
| Test F1 (Weighted) | 0.0725 |
| 主力买入 Precision | 0.1306 |
| 主力买入 Recall | 0.4739 |
| 主力卖出 Precision | 0.1876 |
| 主力卖出 Recall | 0.5947 |
## 使用方法 / Usage

```python
import torch
import numpy as np
from model import LOBPatternNetV3

# Load model
model = LOBPatternNetV3(num_classes=3, d_model=64, nhead=4, dropout=0.4)
model.load_state_dict(torch.load("model.pt", weights_only=True))
model.eval()

# Load normalization stats
stats = np.load("norm_stats.npz")
means, stds = stats["means"], stats["stds"]

# Prepare input: 100 consecutive Level-2 snapshots, shape (100, 40)
# Each snapshot: [ask_price_1, ask_size_1, bid_price_1, bid_size_1, ...]
# Preprocessing (steps 1-3 are sketched below this block):
# 1. Replace sentinel values (abs > 1e9) with 0
# 2. Normalize prices to basis points relative to mid-price
# 3. Log-transform sizes with log1p
# 4. Z-score normalize using means/stds
raw_data = ...  # your (100, 40) LOB snapshot array after steps 1-3
normalized = (raw_data - means) / stds
x = torch.from_numpy(normalized).unsqueeze(0).float()

with torch.no_grad():
    logits = model(x)
    probs = torch.softmax(logits, dim=1)
    pred = logits.argmax(dim=1).item()

labels = ["主力买入 (Institutional Buy)", "中性 (Neutral)", "主力卖出 (Institutional Sell)"]
print(f"预测: {labels[pred]}, 置信度: {probs[0, pred]:.1%}")
```
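Steps 1–3 above run before the z-score. Below is a minimal sketch of that preprocessing, assuming the interleaved `[ask_p, ask_s, bid_p, bid_s] × 10` column layout from the architecture diagram; `preprocess_snapshots` is an illustrative helper, not a file in this repo, so verify it against the actual training pipeline:

```python
import numpy as np

def preprocess_snapshots(raw: np.ndarray) -> np.ndarray:
    """Steps 1-3: sentinel removal, prices to basis points, log1p on sizes."""
    x = raw.astype(np.float64)                # (T, 40) copy
    x[np.abs(x) > 1e9] = 0.0                  # 1. zero out sentinel values

    price_cols = np.arange(40) % 2 == 0      # even columns hold prices
    size_cols = ~price_cols                  # odd columns hold sizes

    mid = (x[:, 0] + x[:, 2]) / 2.0          # mid-price from best ask/bid
    # 2. prices -> basis points relative to the mid-price
    x[:, price_cols] = (x[:, price_cols] - mid[:, None]) / mid[:, None] * 1e4
    # 3. compress heavy-tailed sizes
    x[:, size_cols] = np.log1p(x[:, size_cols])
    return x
```

The result is then z-scored with the saved `means`/`stds` as in the usage block above.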
## 训练细节 / Training Details

- **Dataset**: [LeonardoBerti/TRADES-LOB](https://huggingface.co/datasets/LeonardoBerti/TRADES-LOB) (265K order events, 10-level LOB)
- **Label Construction**: Order Flow Imbalance (OFI) + Large Order Ratio + Cancellation Rate
- **Loss**: Focal Loss (γ=2.0) + Label Smoothing (0.1) + Class Weighting (see the sketch after this list)
- **Regularization**: Dropout 0.4, Weight Decay 5e-4, Mixup Augmentation (α=0.3)
- **Optimizer**: AdamW, lr=3e-4, Cosine Annealing with Warm Restarts
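A minimal sketch of the loss described above (focal loss with label smoothing and optional per-class weights) under the stated γ=2.0 and smoothing 0.1; the exact reduction and weight values used in training are not given in the repo, so treat this as illustrative:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, class_weights=None, gamma=2.0, smoothing=0.1):
    """Focal loss with label smoothing; a sketch, not the training code."""
    num_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()

    # Smoothed one-hot targets: 1 - smoothing on the true class,
    # smoothing / (C - 1) spread over the other classes.
    with torch.no_grad():
        true_dist = torch.full_like(log_probs, smoothing / (num_classes - 1))
        true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - smoothing)

    focal = (1.0 - probs) ** gamma            # down-weight easy examples
    loss = -(true_dist * focal * log_probs)
    if class_weights is not None:             # optional per-class re-weighting
        loss = loss * class_weights.unsqueeze(0)
    return loss.sum(dim=1).mean()
```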
## 参考 / References

- DeepLOB: Zhang et al., IEEE Trans. Signal Processing 2019 (arXiv:1808.03668)
- TLOB: Berti & Kasneci, 2025 (arXiv:2502.15757)
## 声明 / Disclaimer

本模型仅供研究学习使用,不构成任何投资建议。股市有风险,入市需谨慎。

This model is for research purposes only and does not constitute investment advice. The stock market is risky; invest with caution.
config.json
CHANGED
{
  "model_type": "LOBPatternNetV3",
  "architecture": "CNN (Spatial) + CNN (Temporal) + Transformer Attention",
  "num_levels": 10,
  "seq_len": 100,
  "num_classes": 3,
  "d_model": 64,
  "nhead": 4,
  "dropout": 0.4,
  "total_parameters": 85803,
  "class_names": [
    "主力买入 (Buy)",
    "中性 (Neutral)",
    "主力卖出 (Sell)"
  ],
  "class_names_zh": [
    "主力买入",
    "中性/散户",
    "主力卖出"
  ],
  "test_accuracy": 0.15789473684210525,
  "test_f1_macro": 0.16335941375062,
  "test_f1_weighted": 0.07250430112144952,
  "test_precision": [
    0.13064361191162344,
    0.0,
    0.18763102725366876
  ],
  "test_recall": [
    0.4738675958188153,
    0.0,
    0.5946843853820598
  ],
  "training_dataset": "LeonardoBerti/TRADES-LOB",
  "normalization": "z-score (means/stds in norm_stats.npz)",
  "label_construction": {
    "method": "OFI + large_order_ratio + cancellation_rate",
    "window": 50,
    "ofi_threshold": 0.15,
    "large_order_percentile": 85,
    "score_percentile": 80
  }
}
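The `label_construction` block records how training labels were derived from the raw order flow. The config gives only the ingredients and thresholds, so the sketch below is a hypothetical reading of them: per-window aggregates are combined into a signed institutional-activity score, and only the strongest windows (score_percentile 80) leave the neutral class:

```python
import numpy as np

def label_windows(ofi, large_ratio, cancel_rate,
                  ofi_threshold=0.15, score_percentile=80):
    """Hypothetical labeler using config.json's label_construction constants.

    Inputs are aggregates over each 50-snapshot window, all shape (N,).
    Returns 0 = institutional buy, 1 = neutral/retail, 2 = institutional sell.
    """
    # Signed activity score; this particular combination is an assumption.
    score = ofi * large_ratio * (1.0 - cancel_rate)

    cut = np.percentile(np.abs(score), score_percentile)
    labels = np.full(score.shape, 1, dtype=np.int64)      # default: neutral
    labels[(score >= cut) & (ofi > ofi_threshold)] = 0    # strong buy pressure
    labels[(score <= -cut) & (ofi < -ofi_threshold)] = 2  # strong sell pressure
    return labels
```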
model.pt
CHANGED
version https://git-lfs.github.com/spec/v1
oid sha256:b8cba3876b1f6e97f0a5c424e4313e38fd8a83e5f5cb5550d2a0d55bd1d56feb
size 366176
model.py
CHANGED
"""LOBPatternNet V3 - for loading saved model weights."""
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 2 |
import torch
|
| 3 |
import torch.nn as nn
|
|
|
|
|
|
|
|
|
|
| 4 |
|
| 5 |
class BilinearNorm(nn.Module):
|
|
|
|
| 6 |
def __init__(self, num_features):
|
| 7 |
super().__init__()
|
| 8 |
self.gamma = nn.Parameter(torch.ones(1, 1, num_features))
|
| 9 |
self.beta = nn.Parameter(torch.zeros(1, 1, num_features))
|
| 10 |
self.gate = nn.Parameter(torch.ones(1, 1, num_features))
|
|
|
|
| 11 |
def forward(self, x):
|
|
|
|
| 12 |
mean = x.mean(dim=1, keepdim=True)
|
| 13 |
std = x.std(dim=1, keepdim=True) + 1e-8
|
| 14 |
x_norm = (x - mean) / std
|
| 15 |
gate = torch.sigmoid(self.gate)
|
| 16 |
return gate * (self.gamma * x_norm + self.beta) + (1 - gate) * x
|
| 17 |
|
| 18 |
+
class LOBPatternNetV3(nn.Module):
|
| 19 |
+
def __init__(self, num_classes=3, d_model=64, nhead=4, dropout=0.4):
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 20 |
super().__init__()
|
| 21 |
+
self.norm = BilinearNorm(40)
|
| 22 |
+
self.spatial = nn.Sequential(
|
| 23 |
+
nn.Conv2d(1, 16, kernel_size=(1, 2), stride=(1, 2)),
|
| 24 |
+
nn.BatchNorm2d(16), nn.ReLU(), nn.Dropout2d(dropout * 0.5),
|
| 25 |
+
nn.Conv2d(16, 16, kernel_size=(1, 2), stride=(1, 2)),
|
| 26 |
+
nn.BatchNorm2d(16), nn.ReLU(), nn.Dropout2d(dropout * 0.5),
|
| 27 |
+
nn.Conv2d(16, 16, kernel_size=(1, 10)),
|
| 28 |
+
nn.BatchNorm2d(16), nn.ReLU(),
|
| 29 |
+
)
|
| 30 |
+
self.temporal = nn.Sequential(
|
| 31 |
+
nn.Conv1d(16, 32, kernel_size=3, padding=1),
|
| 32 |
+
nn.BatchNorm1d(32), nn.ReLU(), nn.Dropout(dropout),
|
| 33 |
+
nn.Conv1d(32, 32, kernel_size=5, padding=2),
|
| 34 |
+
nn.BatchNorm1d(32), nn.ReLU(), nn.Dropout(dropout),
|
| 35 |
+
nn.Conv1d(32, d_model, kernel_size=3, padding=1),
|
| 36 |
+
nn.BatchNorm1d(d_model), nn.ReLU(), nn.Dropout(dropout),
|
| 37 |
+
)
|
| 38 |
+
encoder_layer = nn.TransformerEncoderLayer(
|
| 39 |
+
d_model=d_model, nhead=nhead, dim_feedforward=d_model*2,
|
| 40 |
+
dropout=dropout, batch_first=True, activation="gelu"
|
| 41 |
+
)
|
| 42 |
+
self.attention = nn.TransformerEncoder(encoder_layer, num_layers=2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 43 |
self.classifier = nn.Sequential(
|
| 44 |
+
nn.LayerNorm(d_model),
|
|
|
|
| 45 |
nn.Dropout(dropout),
|
| 46 |
+
nn.Linear(d_model, 32),
|
| 47 |
+
nn.GELU(),
|
| 48 |
+
nn.Dropout(dropout),
|
| 49 |
+
nn.Linear(32, num_classes)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 50 |
)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 51 |
def forward(self, x):
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 52 |
x = self.norm(x)
|
| 53 |
x = x.unsqueeze(1)
|
| 54 |
+
x = self.spatial(x)
|
| 55 |
x = x.squeeze(-1)
|
| 56 |
+
x = self.temporal(x)
|
|
|
|
| 57 |
x = x.permute(0, 2, 1)
|
| 58 |
+
x = self.attention(x)
|
| 59 |
+
x = x.mean(dim=1)
|
| 60 |
+
return self.classifier(x)
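
# Quick smoke test (a sketch mirroring the __main__ check the previous
# model.py shipped with; not needed for inference):
if __name__ == "__main__":
    model = LOBPatternNetV3()
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Trainable parameters: {n_params:,}")  # config.json lists 85,803

    x = torch.randn(4, 100, 40)  # 4 sequences of 100 snapshots x 40 features
    out = model(x)
    print(f"Input {tuple(x.shape)} -> logits {tuple(out.shape)}")  # -> (4, 3)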
norm_stats.npz
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:758b1a926ffbca5b299e000f68a6c7b66b4f448ca61d280515ee7de71a398718
size 824
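A quick way to inspect the added stats file; the `means`/`stds` keys come from the README usage snippet, and the `(40,)` shape is an assumption based on the 40-feature layout:

```python
import numpy as np

stats = np.load("norm_stats.npz")
print(stats.files)                                # expect ['means', 'stds']
print(stats["means"].shape, stats["stds"].shape)  # presumably (40,) each
```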