---
license: apache-2.0
tags:
- video-classification
- driver-behavior
pipeline_tag: video-classification
---

# SiftFormer 1.5

## Classes

| Index | Class |
|-------|-------|
| 0 | 정상 (normal) |
| 1 | 졸음 (drowsy) |
| 2 | 주의분산 (distracted) |
| 3 | 폭행 (violence) |

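
The class table can be turned into a lookup for decoding predictions. The following is a minimal sketch, not the model's own API: it assumes the model returns four scores per clip ordered by class index, and `ID2LABEL`, `decode_prediction`, and the example scores are hypothetical names and values used only for illustration.

```python
# Index-to-label mapping copied from the Classes table.
ID2LABEL = {
    0: "정상 (normal)",
    1: "졸음 (drowsy)",
    2: "주의분산 (distracted)",
    3: "폭행 (violence)",
}


def decode_prediction(logits):
    """Return (index, label) for the highest-scoring class.

    Assumption: `logits` holds one score per class for a single clip,
    ordered by class index as in the table.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(logits)}")
    # Plain-Python argmax over the class scores.
    index = max(range(len(logits)), key=lambda i: logits[i])
    return index, ID2LABEL[index]


example_logits = [0.1, 2.3, -0.5, 0.4]  # made-up scores for illustration
print(decode_prediction(example_logits))
```
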
## Input

- Shape: `[B, T, C, H, W]`
- `T = 30` frames
- `H = W = 224`
- `C = 3` (RGB)
- ImageNet normalization (mean `[0.485, 0.456, 0.406]`, std `[0.229, 0.224, 0.225]`)
- dtype: `float32` or `bfloat16`
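
As an illustration of the input layout, here is a hedged NumPy sketch that turns a clip of RGB frames into a `[1, 30, 3, 224, 224]` `float32` batch. It assumes the frames are already sampled to 30 and resized to 224x224, and that pixel values are scaled to `[0, 1]` before normalization (the usual ImageNet convention, not something this card states). `preprocess_clip` is a hypothetical helper, not part of the model's code.

```python
import numpy as np

# Values taken from the Input section.
NUM_FRAMES = 30
FRAME_SIZE = 224
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_clip(frames: np.ndarray) -> np.ndarray:
    """Convert uint8 RGB frames [T, H, W, C] into a float32 batch [1, T, C, H, W].

    Assumptions: frames are already sampled to 30 frames and resized to
    224x224, and pixels are scaled to [0, 1] before normalization.
    """
    expected = (NUM_FRAMES, FRAME_SIZE, FRAME_SIZE, 3)
    if frames.shape != expected:
        raise ValueError(f"expected frames of shape {expected}, got {frames.shape}")
    clip = frames.astype(np.float32) / 255.0      # scale to [0, 1]
    clip = (clip - IMAGENET_MEAN) / IMAGENET_STD  # per-channel, broadcast over last axis
    clip = clip.transpose(0, 3, 1, 2)             # [T, H, W, C] -> [T, C, H, W]
    return np.ascontiguousarray(clip[np.newaxis])  # add batch dim -> [1, T, C, H, W]


# Dummy all-black clip, used only to check the output shape and dtype.
dummy = np.zeros((NUM_FRAMES, FRAME_SIZE, FRAME_SIZE, 3), dtype=np.uint8)
batch = preprocess_clip(dummy)
print(batch.shape, batch.dtype)
```

If the model runs in PyTorch, the resulting array can be wrapped with `torch.from_numpy(batch)` and cast to `bfloat16` afterwards where that dtype is wanted.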