| =============================================================================================== |
| Layer (type:depth-idx) Output Shape Param # |
| =============================================================================================== |
| FIMHawkes -- 256 |
| ├─SineTimeEncoding: 1-1 [6, 1, 100, 256] -- |
| │ └─Linear: 2-1 [6, 1, 100, 1] 2 |
| │ └─Sequential: 2-2 [6, 1, 100, 255] -- |
| │ │ └─Linear: 3-1 [6, 1, 100, 255] 510 |
| │ │ └─SinActivation: 3-2 [6, 1, 100, 255] -- |
| ├─SineTimeEncoding: 1-2 [6, 1, 100, 256] -- |
| │ └─Linear: 2-3 [6, 1, 100, 1] 2 |
| │ └─Sequential: 2-4 [6, 1, 100, 255] -- |
| │ │ └─Linear: 3-3 [6, 1, 100, 255] 510 |
| │ │ └─SinActivation: 3-4 [6, 1, 100, 255] -- |
| ├─Linear: 1-3 [600, 256] 5,888 |
| ├─LayerNorm: 1-4 [6, 1, 100, 256] 512 |
| ├─SineTimeEncoding: 1-5 [6, 1999, 100, 256] (recursive) |
| │ └─Linear: 2-5 [6, 1999, 100, 1] (recursive) |
| │ └─Sequential: 2-6 [6, 1999, 100, 255] (recursive) |
| │ │ └─Linear: 3-5 [6, 1999, 100, 255] (recursive) |
| │ │ └─SinActivation: 3-6 [6, 1999, 100, 255] -- |
| ├─SineTimeEncoding: 1-6 [6, 1999, 100, 256] (recursive) |
| │ └─Linear: 2-7 [6, 1999, 100, 1] (recursive) |
| │ └─Sequential: 2-8 [6, 1999, 100, 255] (recursive) |
| │ │ └─Linear: 3-7 [6, 1999, 100, 255] (recursive) |
| │ │ └─SinActivation: 3-8 [6, 1999, 100, 255] -- |
| ├─Linear: 1-7 [1199400, 256] (recursive) |
| ├─LayerNorm: 1-8 [6, 1999, 100, 256] (recursive) |
| ├─TransformerEncoder: 1-9 [11994, 100, 256] -- |
| │ └─ModuleList: 2-9 -- -- |
| │ │ └─TransformerEncoderLayer: 3-9 [11994, 100, 256] 1,315,072 |
| │ │ └─TransformerEncoderLayer: 3-10 [11994, 100, 256] 1,315,072 |
| │ │ └─TransformerEncoderLayer: 3-11 [11994, 100, 256] 1,315,072 |
| │ │ └─TransformerEncoderLayer: 3-12 [11994, 100, 256] 1,315,072 |
| ├─AttentionOperator: 1-10 [11994, 1, 256] -- |
| │ └─ModuleList: 2-10 -- -- |
| │ │ └─ResidualAttentionLayer: 3-13 [11994, 1, 256] 1,315,072 |
| ├─TransformerEncoder: 1-11 [6, 1999, 256] -- |
| │ └─ModuleList: 2-11 -- -- |
| │ │ └─TransformerEncoderLayer: 3-14 [6, 1999, 256] 1,315,072 |
| │ │ └─TransformerEncoderLayer: 3-15 [6, 1999, 256] 1,315,072 |
| ├─TransformerDecoder: 1-12 [6, 100, 256] -- |
| │ └─ModuleList: 2-12 -- -- |
| │ │ └─TransformerDecoderLayer: 3-16 [6, 100, 256] 1,578,752 |
| │ │ └─TransformerDecoderLayer: 3-17 [6, 100, 256] 1,578,752 |
| │ │ └─TransformerDecoderLayer: 3-18 [6, 100, 256] 1,578,752 |
| │ │ └─TransformerDecoderLayer: 3-19 [6, 100, 256] 1,578,752 |
| ├─Linear: 1-13 [1, 256] 5,888 |
| ├─MLP: 1-14 [600, 1] -- |
| │ └─Sequential: 2-13 [600, 1] -- |
| │ │ └─Linear: 3-20 [600, 256] 131,328 |
| │ │ └─GELU: 3-21 [600, 256] -- |
| │ │ └─Dropout: 3-22 [600, 256] -- |
| │ │ └─Linear: 3-23 [600, 256] 65,792 |
| │ │ └─GELU: 3-24 [600, 256] -- |
| │ │ └─Dropout: 3-25 [600, 256] -- |
| │ │ └─Linear: 3-26 [600, 1] 257 |
| ├─MLP: 1-15 [600, 1] -- |
| │ └─Sequential: 2-14 [600, 1] -- |
| │ │ └─Linear: 3-27 [600, 256] 131,328 |
| │ │ └─GELU: 3-28 [600, 256] -- |
| │ │ └─Dropout: 3-29 [600, 256] -- |
| │ │ └─Linear: 3-30 [600, 256] 65,792 |
| │ │ └─GELU: 3-31 [600, 256] -- |
| │ │ └─Dropout: 3-32 [600, 256] -- |
| │ │ └─Linear: 3-33 [600, 1] 257 |
| ├─MLP: 1-16 [600, 1] -- |
| │ └─Sequential: 2-15 [600, 1] -- |
| │ │ └─Linear: 3-34 [600, 256] 131,328 |
| │ │ └─GELU: 3-35 [600, 256] -- |
| │ │ └─Dropout: 3-36 [600, 256] -- |
| │ │ └─Linear: 3-37 [600, 256] 65,792 |
| │ │ └─GELU: 3-38 [600, 256] -- |
| │ │ └─Dropout: 3-39 [600, 256] -- |
| │ │ └─Linear: 3-40 [600, 1] 257 |
| =============================================================================================== |
| Total params: 16,126,211 |
| Trainable params: 16,126,211 |
| Non-trainable params: 0 |
| Total mult-adds (G): 70.54 |
| =============================================================================================== |
| Input size (MB): 28.96 |
| Forward/backward pass size (MB): 118787.71 |
| Params size (MB): 48.71 |
| Estimated Total Size (MB): 118865.38 |
| =============================================================================================== |
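The per-layer parameter counts in the summary can be reproduced by hand, which is a useful sanity check on the inferred layer sizes. The sketch below is pure arithmetic and rests on assumptions the summary does not state: the transformer layers follow the standard `torch.nn` layout (packed q/k/v projection with bias, two-layer feed-forward, LayerNorm with weight and bias), the feed-forward width is 2048, the two top-level `Linear` layers take 22 input features, and each `MLP` takes 512 input features. These values are chosen because they match the reported counts; they are not read from the model code. The `AttentionOperator` count is copied from the summary, since the internals of `ResidualAttentionLayer` are not shown. Rows marked `(recursive)` are reused modules and are counted once.

```python
# Hand-computed parameter counts, assuming standard torch.nn layer layouts.
# D_FF, IN_FEATURES and MLP_IN are inferred from the reported counts (assumptions).

D_MODEL = 256      # from the output shapes in the summary
D_FF = 2048        # assumed feed-forward width
IN_FEATURES = 22   # assumed input width of Linear 1-3 and Linear 1-13
MLP_IN = 512       # assumed input width of each MLP head


def linear_params(n_in, n_out):
    """Weight matrix plus bias vector."""
    return n_in * n_out + n_out


def layer_norm_params(d):
    """Elementwise scale and shift."""
    return 2 * d


def mha_params(d):
    """Packed q/k/v in-projection plus output projection, both with bias."""
    return linear_params(d, 3 * d) + linear_params(d, d)


def ffn_params(d, d_ff):
    return linear_params(d, d_ff) + linear_params(d_ff, d)


def encoder_layer_params(d, d_ff):
    # self-attention + feed-forward + 2 LayerNorms
    return mha_params(d) + ffn_params(d, d_ff) + 2 * layer_norm_params(d)


def decoder_layer_params(d, d_ff):
    # self-attention + cross-attention + feed-forward + 3 LayerNorms
    return 2 * mha_params(d) + ffn_params(d, d_ff) + 3 * layer_norm_params(d)


def sine_time_encoding_params(d):
    # Linear(1, 1) plus Linear(1, d - 1), as listed under SineTimeEncoding
    return linear_params(1, 1) + linear_params(1, d - 1)


def mlp_params(n_in, d):
    return linear_params(n_in, d) + linear_params(d, d) + linear_params(d, 1)


enc = encoder_layer_params(D_MODEL, D_FF)
dec = decoder_layer_params(D_MODEL, D_FF)

total = (
    256                                        # parameters held directly by FIMHawkes
    + 2 * sine_time_encoding_params(D_MODEL)   # SineTimeEncoding 1-1, 1-2
    + 2 * linear_params(IN_FEATURES, D_MODEL)  # Linear 1-3, 1-13
    + layer_norm_params(D_MODEL)               # LayerNorm 1-4
    + 4 * enc                                  # TransformerEncoder 1-9
    + 1_315_072                                # AttentionOperator 1-10 (copied from summary)
    + 2 * enc                                  # TransformerEncoder 1-11
    + 4 * dec                                  # TransformerDecoder 1-12
    + 3 * mlp_params(MLP_IN, D_MODEL)          # MLP 1-14, 1-15, 1-16
)

print(f"encoder layer: {enc:,}")   # 1,315,072
print(f"decoder layer: {dec:,}")   # 1,578,752
print(f"total:         {total:,}") # 16,126,211
```

Under these assumptions the computed values agree with the reported per-layer counts and with the total of 16,126,211 parameters.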