Commit 4ebe334 (verified), committed by Andrewsab · 1 parent: da8bee0

Voice Scribe mirror gigaam from Andrewsab/gigaam-v3-e2e-rnnt-ov@dff16933a640

README.md ADDED
@@ -0,0 +1,82 @@
+ ---
+ license: mit
+ language: ru
+ library_name: openvino
+ tags:
+ - speech-recognition
+ - russian
+ - openvino
+ - rnn-t
+ - conformer
+ - gigaam
+ base_model: ai-sage/GigaAM-v3
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # GigaAM-v3 e2e_rnnt (OpenVINO IR, pre-converted)
+
+ OpenVINO IR port of [ai-sage/GigaAM-v3](https://huggingface.co/ai-sage/GigaAM-v3), revision `e2e_rnnt`: Sber's SOTA Russian ASR model (220M parameters, Conformer + RNN-T with end-to-end punctuation and capitalization).
+
+ Conversion was done with:
+
+ ```python
+ from transformers import AutoModel
+ import torch
+ model = AutoModel.from_pretrained("ai-sage/GigaAM-v3", revision="e2e_rnnt", trust_remote_code=True)
+ model.to_onnx(dir_path="onnx", dtype=torch.float16)
+ # then:
+ import openvino as ov
+ for f in ["encoder", "decoder", "joint"]:
+     m = ov.convert_model(f"onnx/v3_e2e_rnnt_{f}.onnx")
+     ov.save_model(m, f"v3_e2e_rnnt_{f}.xml")
+ ```
+
+ ## Files
+
+ | File | Purpose | Size |
+ |------|---------|------|
+ | `v3_e2e_rnnt_encoder.xml/.bin` | Conformer encoder (main cost) | ~425 MB FP16 |
+ | `v3_e2e_rnnt_decoder.xml/.bin` | RNN-T decoder (prediction network) | ~2 MB |
+ | `v3_e2e_rnnt_joint.xml/.bin` | Joint network | ~1.3 MB |
+ | `tokenizer.model` | SentencePiece vocabulary (1024 subwords) | 250 KB |
+ | `config.json` | Original model config (for reference) | 2 KB |
+
+ ## Device compatibility (Intel hardware)
+
+ Verified on Intel Core Ultra 9 285H (OpenVINO 2025.4.1):
+
+ | Device | Encoder | Decoder | Joint | Usable? |
+ |--------|---------|---------|-------|---------|
+ | CPU | ✅ | ✅ | ✅ | Yes (~34× RTFx on a 10 s chunk) |
+ | GPU.0 (Arc Xe2 iGPU) | ✅ | ✅ | ✅ | **Yes (~520× RTFx on encoder alone)** |
+ | NPU | ❌ (dynamic shapes) | ✅ | ❌ (dynamic shapes) | Partial only |
+
+ **Recommended device: Intel Arc iGPU (GPU.0).** It is the fastest option and does not compete with NVIDIA for VRAM.
+
+ NPU compilation fails for the encoder and joint because the exported ONNX has dynamic input shapes (upper bounds of `9223372036854775807`, the int64 maximum). A re-export with static shapes for 10 s chunks would likely unlock NPU.
+
+ ## Usage (Python, pure OpenVINO)
+
+ ```python
+ import openvino as ov
+ core = ov.Core()
+ encoder = core.compile_model("v3_e2e_rnnt_encoder.xml", "GPU.0")
+ decoder = core.compile_model("v3_e2e_rnnt_decoder.xml", "GPU.0")
+ joint = core.compile_model("v3_e2e_rnnt_joint.xml", "GPU.0")
+
+ # Preprocess: audio 16 kHz mono -> log-mel (64 bins, 20 ms win, 10 ms hop)
+ # Encoder: features -> encoder outputs
+ # Decoder + Joint: RNN-T greedy decode loop -> token IDs
+ # SentencePieceProcessor(tokenizer.model).decode(ids) -> text
+ ```
+
+ A reference Python backend is available in the [Voice Scribe](https://github.com/andrewsabn/voice-scribe) project (MIT license).
+
+ ## Credits
+
+ - Original model: [Sber / ai-sage/GigaAM-v3](https://huggingface.co/ai-sage/GigaAM-v3) (MIT)
+ - OpenVINO conversion: [Voice Scribe project](https://github.com/andrewsabn/voice-scribe)
+
+ ## License
+
+ MIT (matches upstream ai-sage/GigaAM-v3).
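
Reviewer note: the README's device table implies a preference order (iGPU first, CPU as fallback, NPU skipped because the encoder and joint reportedly fail to compile there). A minimal sketch of that selection follows. The helper `pick_device` and the `PREFERRED_DEVICES` tuple are hypothetical names introduced here for illustration; they are not part of the README or of OpenVINO, and the ordering is an interpretation of the table rather than something the author specifies.

```python
from typing import Sequence

# Assumed preference order, read off the README's device table:
# the Arc iGPU is reported fastest, CPU works everywhere, and NPU is
# left out because only the decoder is reported to compile on it.
PREFERRED_DEVICES = ("GPU.0", "GPU", "CPU")


def pick_device(available: Sequence[str],
                preferred: Sequence[str] = PREFERRED_DEVICES) -> str:
    """Return the first preferred device name present in `available`."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError(
        f"none of {list(preferred)} found in {list(available)}"
    )


if __name__ == "__main__":
    # Hard-coded lists keep the sketch runnable without OpenVINO installed.
    print(pick_device(["CPU", "GPU.0", "GPU.1", "NPU"]))
    print(pick_device(["CPU", "NPU"]))
```

With OpenVINO installed, the list would presumably come from `ov.Core().available_devices`. Plain `GPU` is included in the tuple because systems with a single GPU may report the device without an index.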
UPSTREAM_SOURCE.md ADDED
@@ -0,0 +1,45 @@
+ # Voice Scribe Model Mirror
+
+ This repository is a Voice Scribe distribution mirror. The model artifacts are
+ copied from the upstream repository and the source revision below is pinned.
+
+ | Field | Value |
+ | --- | --- |
+ | Layout key | `gigaam` |
+ | Target directory in installer | `gigaam-v3-e2e-rnnt-ov` |
+ | Upstream repo | `Andrewsab/gigaam-v3-e2e-rnnt-ov` |
+ | Upstream revision | `dff16933a6407dbec85df5f3af8ed8d8a14d9e01` |
+ | Upstream resolved SHA | `dff16933a6407dbec85df5f3af8ed8d8a14d9e01` |
+ | Mirror created | `2026-04-23T22:39:28Z` |
+ | Description | GigaAM v3 e2e RNN-T Intel Arc/CPU OpenVINO layout. |
+ | License metadata | `{"license": "mit", "license_files": [], "license_tags": ["license:mit"]}` |
+
+ ## Installer Contract
+
+ This mirror corresponds to `parakeet/installer/wrapper/model_catalog.py`.
+ Required files for installer validation:
+
+ ```json
+ [
+   "config.json",
+   "tokenizer.model",
+   "v3_e2e_rnnt_encoder.xml",
+   "v3_e2e_rnnt_encoder.bin",
+   "v3_e2e_rnnt_decoder.xml",
+   "v3_e2e_rnnt_decoder.bin",
+   "v3_e2e_rnnt_joint.xml",
+   "v3_e2e_rnnt_joint.bin"
+ ]
+ ```
+
+ Allowed installer subset patterns:
+
+ ```json
+ []
+ ```
+
+ ## Redistribution Note
+
+ Do not make this repository public unless the upstream license and model card
+ allow redistribution for the intended use. Private mirrors are for operational
+ distribution convenience and reproducible installs.
+ distribution convenience and reproducible installs.
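
Reviewer note: the "Installer Contract" section lists the files an installer must find. A minimal sketch of such a presence check is shown below. It is not the code in `parakeet/installer/wrapper/model_catalog.py`, which is not part of this diff; the function name `missing_files` is hypothetical and the check only tests that each file exists.

```python
from pathlib import Path
from typing import List

# File list copied from the "Required files" JSON block in UPSTREAM_SOURCE.md.
REQUIRED_FILES = [
    "config.json",
    "tokenizer.model",
    "v3_e2e_rnnt_encoder.xml",
    "v3_e2e_rnnt_encoder.bin",
    "v3_e2e_rnnt_decoder.xml",
    "v3_e2e_rnnt_decoder.bin",
    "v3_e2e_rnnt_joint.xml",
    "v3_e2e_rnnt_joint.bin",
]


def missing_files(model_dir: str) -> List[str]:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumed target directory name, taken from the table in UPSTREAM_SOURCE.md.
    absent = missing_files("gigaam-v3-e2e-rnnt-ov")
    print("missing:", absent if absent else "nothing")
```

An empty result means the directory satisfies the listed contract; a real installer would likely also verify sizes or hashes.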
config.json ADDED
@@ -0,0 +1,67 @@
+ {
+   "model_type": "gigaam",
+   "auto_map": {
+     "AutoConfig": "modeling_gigaam.GigaAMConfig",
+     "AutoModel": "modeling_gigaam.GigaAMModel"
+   },
+   "cfg": {
+     "model": {
+       "cfg": {
+         "model_class": "rnnt",
+         "sample_rate": 16000,
+         "preprocessor": {
+           "_target_": "modeling_gigaam.FeatureExtractor",
+           "sample_rate": 16000,
+           "features": 64,
+           "win_length": 320,
+           "hop_length": 160,
+           "mel_scale": "htk",
+           "n_fft": 320,
+           "mel_norm": null,
+           "center": false
+         },
+         "encoder": {
+           "_target_": "modeling_gigaam.ConformerEncoder",
+           "feat_in": 64,
+           "n_layers": 16,
+           "d_model": 768,
+           "subsampling_factor": 4,
+           "ff_expansion_factor": 4,
+           "self_attention_model": "rotary",
+           "pos_emb_max_len": 5000,
+           "n_heads": 16,
+           "conv_kernel_size": 5,
+           "flash_attn": false,
+           "subs_kernel_size": 5,
+           "subsampling": "conv1d",
+           "conv_norm_type": "layer_norm"
+         },
+         "head": {
+           "_target_": "modeling_gigaam.RNNTHead",
+           "decoder": {
+             "pred_hidden": 320,
+             "pred_rnn_layers": 1,
+             "num_classes": 1025
+           },
+           "joint": {
+             "enc_hidden": 768,
+             "pred_hidden": 320,
+             "joint_hidden": 320,
+             "num_classes": 1025
+           }
+         },
+         "decoding": {
+           "_target_": "modeling_gigaam.RNNTGreedyDecoding",
+           "vocabulary": null,
+           "model_path": "tokenizer.model"
+         },
+         "model_name": "v3_e2e_rnnt",
+         "hashes": {
+           "model": "72e2a9b5c7caad963b2bbfd2f298c252",
+           "tokenizer": "3b3bf8370e882885d79731592fc99f98"
+         }
+       },
+       "_target_": "modeling_gigaam.GigaAMASR"
+     }
+   }
+ }
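
Reviewer note: the `preprocessor` block fixes the framing arithmetic that any external feature extractor has to reproduce. The sketch below works it out from the values in `config.json` (`sample_rate`, `win_length`, `hop_length`, `center: false`, `subsampling_factor`). The frame-count formula is the usual one for an unpadded STFT. The encoder-frame figure is only an estimate, since the exact padding of the `conv1d` subsampling is not visible in this diff; the function names are hypothetical.

```python
import math

# Values copied from config.json in this commit.
SAMPLE_RATE = 16_000
WIN_LENGTH = 320      # samples; equals n_fft
HOP_LENGTH = 160      # samples
SUBSAMPLING = 4       # encoder subsampling_factor


def window_ms() -> float:
    """Analysis window length in milliseconds."""
    return 1000.0 * WIN_LENGTH / SAMPLE_RATE


def hop_ms() -> float:
    """Hop between successive frames in milliseconds."""
    return 1000.0 * HOP_LENGTH / SAMPLE_RATE


def feature_frames(num_samples: int) -> int:
    """Log-mel frames for an unpadded STFT (center=false)."""
    if num_samples < WIN_LENGTH:
        return 0
    return 1 + (num_samples - WIN_LENGTH) // HOP_LENGTH


def encoder_frames_estimate(num_samples: int) -> int:
    """Rough encoder length: feature frames divided by the subsampling factor.

    Assumption: ceil division; the real conv1d padding may shift this by one.
    """
    return math.ceil(feature_frames(num_samples) / SUBSAMPLING)


if __name__ == "__main__":
    ten_seconds = 10 * SAMPLE_RATE
    print(window_ms(), hop_ms())
    print(feature_frames(ten_seconds), encoder_frames_estimate(ten_seconds))
```

The 20 ms window and 10 ms hop match the values quoted in the README's usage comment, and a 10 s chunk yields 999 feature frames under this formula.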
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:828c12c991019eef952a960661f25a92d6ad279591e2ea466b4aeddf1d20a18a
+ size 255336
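
Reviewer note: the binary artifacts in this commit are stored as Git LFS pointers (a `version` line, a `sha256` oid, and a byte size). A downloaded file can be checked against its pointer with a few lines of standard-library Python. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical; the pointer text is the one shown above for `tokenizer.model`.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the tokenizer.model entry in this diff.
TOKENIZER_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:828c12c991019eef952a960661f25a92d6ad279591e2ea466b4aeddf1d20a18a
size 255336
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into hash algorithm, hex digest and byte size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: str, pointer: dict) -> bool:
    """True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so the ~442 MB encoder weights are not
        # loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(TOKENIZER_POINTER))
```

Calling `matches_pointer("tokenizer.model", parse_lfs_pointer(TOKENIZER_POINTER))` on a local copy would then confirm that the download is intact.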
v3_e2e_rnnt_decoder.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2c2ba6cc1e7e1263ed220a98151ea3f1c9423be776d8b3a63e59284112746a07
+ size 2297032
v3_e2e_rnnt_decoder.xml ADDED
@@ -0,0 +1,623 @@
+ <?xml version="1.0"?>
+ <net name="main_graph" version="11">
+ <layers>
+ <layer id="2" name="x" type="Parameter" version="opset1">
+ <data shape="1,1" element_type="i64" />
+ <rt_info>
+ <attribute name="old_api_map_element_type" version="0" value="i32" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64" names="x">
+ <dim>1</dim>
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="1" name="h.1" type="Parameter" version="opset1">
+ <data shape="1,1,320" element_type="f32" />
+ <output>
+ <port id="0" precision="FP32" names="h.1">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="0" name="c.1" type="Parameter" version="opset1">
+ <data shape="1,1,320" element_type="f32" />
+ <output>
+ <port id="0" precision="FP32" names="c.1">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="3" name="embed.weight_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1025, 320" offset="0" size="656000" />
+ <output>
+ <port id="0" precision="FP16" names="embed.weight">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="4" name="embed.weight" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="5" name="Constant_8" type="Const" version="opset1">
+ <data element_type="i64" shape="" offset="656000" size="8" />
+ <output>
+ <port id="0" precision="I64" />
+ </output>
+ </layer>
+ <layer id="6" name="/embed/Gather" type="Gather" version="opset8">
+ <data batch_dims="0" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ <dim>1</dim>
+ </port>
+ <port id="2" precision="I64" />
+ </input>
+ <output>
+ <port id="3" precision="FP32" names="/embed/Gather_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="7" name="Constant_41" type="Const" version="opset1">
+ <data element_type="i64" shape="3" offset="656008" size="24" />
+ <output>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="8" name="Transpose_42" type="Transpose" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>3</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="9" name="Constant_43" type="Const" version="opset1">
+ <data element_type="i64" shape="3" offset="656008" size="24" />
+ <output>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="10" name="Transpose_44" type="Transpose" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>3</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="11" name="ShapeOf_2208" type="ShapeOf" version="opset3">
+ <data output_type="i64" />
+ <input>
+ <port id="0" precision="I64">
+ <dim>1</dim>
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I64">
+ <dim>2</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="12" name="ShapeOf_2207" type="ShapeOf" version="opset3">
+ <data output_type="i64" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="I64">
+ <dim>2</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="13" name="Constant_2209" type="Const" version="opset1">
+ <data element_type="i64" shape="1" offset="656032" size="8" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="14" name="Constant_2206" type="Const" version="opset1">
+ <data element_type="i64" shape="" offset="656000" size="8" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64" />
+ </output>
+ </layer>
+ <layer id="15" name="Gather_2210" type="Gather" version="opset1">
+ <input>
+ <port id="0" precision="I64">
+ <dim>2</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ <port id="2" precision="I64" />
+ </input>
+ <output>
+ <port id="3" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="16" name="ShapeOf_21" type="Concat" version="opset1">
+ <data axis="0" />
+ <input>
+ <port id="0" precision="I64">
+ <dim>2</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="I64">
+ <dim>3</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="17" name="Constant_25" type="Const" version="opset1">
+ <data element_type="i32" shape="1" offset="656040" size="4" />
+ <output>
+ <port id="0" precision="I32">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="18" name="Constant_22" type="Const" version="opset1">
+ <data element_type="i32" shape="1" offset="656044" size="4" />
+ <output>
+ <port id="0" precision="I32">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="19" name="Gather_26" type="Gather" version="opset8">
+ <data batch_dims="0" />
+ <input>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ <port id="1" precision="I32">
+ <dim>1</dim>
+ </port>
+ <port id="2" precision="I32">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="3" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="20" name="Constant_2100" type="Const" version="opset1">
+ <data element_type="i64" shape="1" offset="656000" size="8" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="21" name="Constant_2101" type="Const" version="opset1">
+ <data element_type="i64" shape="" offset="656000" size="8" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64" />
+ </output>
+ </layer>
+ <layer id="22" name="Gather_2102" type="Gather" version="opset8">
+ <data batch_dims="0" />
+ <input>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ <port id="2" precision="I64" />
+ </input>
+ <output>
+ <port id="3" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="23" name="Broadcast_38" type="Broadcast" version="opset3">
+ <data mode="numpy" />
+ <input>
+ <port id="0" precision="I64">
+ <dim>1</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="24" name="Concat_17_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1, 1280, 320" offset="656048" size="819200" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="25" name="Concat_17" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="26" name="Concat_20_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1, 1280, 320" offset="1475248" size="819200" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="27" name="Concat_20" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="28" name="Concat_37_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1, 1280" offset="2294448" size="2560" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="29" name="Concat_37" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1280</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="30" name="LSTMSequence_52" type="LSTMSequence" version="opset5">
+ <data direction="forward" hidden_size="320" activations="sigmoid, tanh, tanh" activations_alpha="" activations_beta="" clip="0" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="2" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="3" precision="I64">
+ <dim>1</dim>
+ </port>
+ <port id="4" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ <port id="5" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ <dim>320</dim>
+ </port>
+ <port id="6" precision="FP32">
+ <dim>1</dim>
+ <dim>1280</dim>
+ </port>
+ </input>
+ <output>
+ <port id="7" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="8" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="9" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="31" name="Constant_1949" type="Const" version="opset1">
+ <data element_type="i64" shape="3" offset="2297008" size="24" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="32" name="c" type="Reshape" version="opset1">
+ <data special_zero="true" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>3</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="c">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="34" name="Constant_1951" type="Const" version="opset1">
+ <data element_type="i64" shape="3" offset="2297008" size="24" />
+ <rt_info>
+ <attribute name="precise" version="0" />
+ </rt_info>
+ <output>
+ <port id="0" precision="I64">
+ <dim>3</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="35" name="h" type="Reshape" version="opset1">
+ <data special_zero="true" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>3</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="h">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="37" name="Constant_1942" type="Const" version="opset1">
+ <data element_type="i64" shape="1" offset="656032" size="8" />
+ <output>
+ <port id="0" precision="I64">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="38" name="dec" type="Squeeze" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="dec">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="39" name="dec/sink_port_0" type="Result" version="opset1" output_names="dec">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ </layer>
+ <layer id="36" name="h/sink_port_0" type="Result" version="opset1" output_names="h">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ </layer>
+ <layer id="33" name="c/sink_port_0" type="Result" version="opset1" output_names="c">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ </layer>
+ </layers>
+ <edges>
+ <edge from-layer="0" from-port="0" to-layer="10" to-port="0" />
+ <edge from-layer="1" from-port="0" to-layer="8" to-port="0" />
+ <edge from-layer="2" from-port="0" to-layer="6" to-port="1" />
+ <edge from-layer="2" from-port="0" to-layer="11" to-port="0" />
+ <edge from-layer="3" from-port="0" to-layer="4" to-port="0" />
+ <edge from-layer="4" from-port="1" to-layer="6" to-port="0" />
+ <edge from-layer="4" from-port="1" to-layer="12" to-port="0" />
+ <edge from-layer="5" from-port="0" to-layer="6" to-port="2" />
+ <edge from-layer="6" from-port="3" to-layer="30" to-port="0" />
+ <edge from-layer="7" from-port="0" to-layer="8" to-port="1" />
+ <edge from-layer="8" from-port="2" to-layer="30" to-port="1" />
+ <edge from-layer="9" from-port="0" to-layer="10" to-port="1" />
+ <edge from-layer="10" from-port="2" to-layer="30" to-port="2" />
+ <edge from-layer="11" from-port="1" to-layer="16" to-port="0" />
+ <edge from-layer="12" from-port="1" to-layer="15" to-port="0" />
+ <edge from-layer="13" from-port="0" to-layer="15" to-port="1" />
+ <edge from-layer="14" from-port="0" to-layer="15" to-port="2" />
+ <edge from-layer="15" from-port="3" to-layer="16" to-port="1" />
+ <edge from-layer="16" from-port="2" to-layer="19" to-port="0" />
+ <edge from-layer="16" from-port="2" to-layer="22" to-port="0" />
+ <edge from-layer="17" from-port="0" to-layer="19" to-port="1" />
+ <edge from-layer="18" from-port="0" to-layer="19" to-port="2" />
+ <edge from-layer="19" from-port="3" to-layer="23" to-port="0" />
+ <edge from-layer="20" from-port="0" to-layer="22" to-port="1" />
+ <edge from-layer="21" from-port="0" to-layer="22" to-port="2" />
+ <edge from-layer="22" from-port="3" to-layer="23" to-port="1" />
+ <edge from-layer="23" from-port="2" to-layer="30" to-port="3" />
+ <edge from-layer="24" from-port="0" to-layer="25" to-port="0" />
+ <edge from-layer="25" from-port="1" to-layer="30" to-port="4" />
+ <edge from-layer="26" from-port="0" to-layer="27" to-port="0" />
+ <edge from-layer="27" from-port="1" to-layer="30" to-port="5" />
+ <edge from-layer="28" from-port="0" to-layer="29" to-port="0" />
+ <edge from-layer="29" from-port="1" to-layer="30" to-port="6" />
+ <edge from-layer="30" from-port="9" to-layer="32" to-port="0" />
+ <edge from-layer="30" from-port="8" to-layer="35" to-port="0" />
+ <edge from-layer="30" from-port="7" to-layer="38" to-port="0" />
+ <edge from-layer="31" from-port="0" to-layer="32" to-port="1" />
+ <edge from-layer="32" from-port="2" to-layer="33" to-port="0" />
+ <edge from-layer="34" from-port="0" to-layer="35" to-port="1" />
+ <edge from-layer="35" from-port="2" to-layer="36" to-port="0" />
+ <edge from-layer="37" from-port="0" to-layer="38" to-port="1" />
+ <edge from-layer="38" from-port="2" to-layer="39" to-port="0" />
+ </edges>
+ <rt_info>
+ <Runtime_version value="2025.4.1-20426-82bbf0292c5-releases/2025/4" />
+ <conversion_parameters>
+ <input_model value="DIR\v3_e2e_rnnt_decoder.onnx" />
+ <is_python_object value="False" />
+ </conversion_parameters>
+ </rt_info>
+ </net>
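
Reviewer note: the decoder IR above takes a token `x` of shape `[1,1]` plus LSTM state `h.1`/`c.1` of shape `[1,1,320]` and returns `dec`, `h`, `c`, while the joint IR in this commit declares inputs `enc` `[1,768,1]` and `dec` `[1,320,1]`. That is enough to outline the greedy RNN-T loop the README mentions. The sketch below is not the Voice Scribe backend. It keeps all tensor handling behind two caller-supplied callables (`decoder_step`, `joint_step`, both hypothetical names) so it runs without OpenVINO, and it assumes that the blank symbol is the last class (`1024` of the 1025 classes in `config.json`) and that at most 10 symbols are emitted per encoder frame.

```python
from typing import Any, Callable, List, Sequence, Tuple

BLANK_ID = 1024             # assumption: blank is the last of num_classes=1025
MAX_SYMBOLS_PER_FRAME = 10  # assumption: safety cap, not taken from the source

DecoderStep = Callable[[int, Any], Tuple[Any, Any]]   # (token, state) -> (dec, new_state)
JointStep = Callable[[int, Any], Sequence[float]]     # (frame index, dec) -> flat logits


def greedy_decode(num_frames: int,
                  decoder_step: DecoderStep,
                  joint_step: JointStep,
                  init_state: Any,
                  blank_id: int = BLANK_ID,
                  max_symbols: int = MAX_SYMBOLS_PER_FRAME) -> List[int]:
    """Greedy RNN-T search: emit argmax tokens until blank, then advance a frame."""
    tokens: List[int] = []
    prev_token = blank_id          # the prediction network starts from blank
    state = init_state             # e.g. zero-filled (h, c) of shape [1,1,320]
    for frame in range(num_frames):
        for _ in range(max_symbols):
            dec_out, new_state = decoder_step(prev_token, state)
            logits = joint_step(frame, dec_out)
            best = max(range(len(logits)), key=logits.__getitem__)
            if best == blank_id:
                break              # blank: keep the old state, go to next frame
            tokens.append(best)
            prev_token, state = best, new_state  # commit state only on emission
    return tokens


if __name__ == "__main__":
    # Toy stand-ins: 5 classes with blank=4; emit token 2 once on frame 1.
    def toy_decoder(token: int, state: int) -> Tuple[int, int]:
        return state, state + 1

    def toy_joint(frame: int, emitted: int) -> List[float]:
        hot = 2 if (frame == 1 and emitted == 0) else 4
        return [1.0 if i == hot else 0.0 for i in range(5)]

    print(greedy_decode(3, toy_decoder, toy_joint, init_state=0, blank_id=4))
```

When wiring this to the compiled models, `decoder_step` would feed an int64 `[1,1]` token with the `h`/`c` state, and `joint_step` would pass one encoder frame shaped `[1,768,1]` together with `dec` transposed from `[1,1,320]` to `[1,320,1]`. The encoder output layout `[1,768,T]` is an assumption, since the encoder IR is not rendered in this diff.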
v3_e2e_rnnt_encoder.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c1611ee60bb00ab6ff2323279b5943754e20ef9b324c9ed7e92cc83924838894
+ size 442323770
v3_e2e_rnnt_encoder.xml ADDED
The diff for this file is too large to render. See raw diff
 
v3_e2e_rnnt_joint.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2249d6b447c773706f6a1211637db0646662711f9ce19738510c006a01b10dc4
+ size 1355666
v3_e2e_rnnt_joint.xml ADDED
@@ -0,0 +1,497 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ <?xml version="1.0"?>
2
+ <net name="main_graph" version="11">
3
+ <layers>
4
+ <layer id="1" name="enc" type="Parameter" version="opset1">
5
+ <data shape="1,768,1" element_type="f32" />
6
+ <output>
7
+ <port id="0" precision="FP32" names="enc">
8
+ <dim>1</dim>
9
+ <dim>768</dim>
10
+ <dim>1</dim>
11
+ </port>
12
+ </output>
13
+ </layer>
14
+ <layer id="0" name="dec" type="Parameter" version="opset1">
15
+ <data shape="1,320,1" element_type="f32" />
16
+ <output>
17
+ <port id="0" precision="FP32" names="dec">
18
+ <dim>1</dim>
19
+ <dim>320</dim>
20
+ <dim>1</dim>
21
+ </port>
22
+ </output>
23
+ </layer>
24
+ <layer id="2" name="Constant_40626_compressed" type="Const" version="opset1">
25
+ <data element_type="f16" shape="1, 1, 1, 1025" offset="0" size="2050" />
26
+ <output>
27
+ <port id="0" precision="FP16">
28
+ <dim>1</dim>
29
+ <dim>1</dim>
30
+ <dim>1</dim>
31
+ <dim>1025</dim>
32
+ </port>
33
+ </output>
34
+ </layer>
35
+ <layer id="3" name="Constant_40626" type="Convert" version="opset1">
36
+ <data destination_type="f32" />
37
+ <rt_info>
38
+ <attribute name="decompression" version="0" />
39
+ </rt_info>
40
+ <input>
41
+ <port id="0" precision="FP16">
42
+ <dim>1</dim>
43
+ <dim>1</dim>
44
+ <dim>1</dim>
45
+ <dim>1025</dim>
46
+ </port>
47
+ </input>
48
+ <output>
49
+ <port id="1" precision="FP32">
50
+ <dim>1</dim>
51
+ <dim>1</dim>
52
+ <dim>1</dim>
53
+ <dim>1025</dim>
54
+ </port>
55
+ </output>
56
+ </layer>
57
+ <layer id="4" name="Constant_40624_compressed" type="Const" version="opset1">
58
+ <data element_type="f16" shape="1, 1, 320" offset="2050" size="640" />
59
+ <output>
60
+ <port id="0" precision="FP16">
61
+ <dim>1</dim>
62
+ <dim>1</dim>
63
+ <dim>320</dim>
64
+ </port>
65
+ </output>
66
+ </layer>
67
+ <layer id="5" name="Constant_40624" type="Convert" version="opset1">
68
+ <data destination_type="f32" />
69
+ <rt_info>
70
+ <attribute name="decompression" version="0" />
71
+ </rt_info>
72
+ <input>
73
+ <port id="0" precision="FP16">
74
+ <dim>1</dim>
75
+ <dim>1</dim>
76
+ <dim>320</dim>
77
+ </port>
78
+ </input>
79
+ <output>
80
+ <port id="1" precision="FP32">
81
+ <dim>1</dim>
82
+ <dim>1</dim>
83
+ <dim>320</dim>
84
+ </port>
85
+ </output>
86
+ </layer>
87
+ <layer id="6" name="Transpose_40471_compressed" type="Const" version="opset1">
88
+ <data element_type="f16" shape="320, 768" offset="2690" size="491520" />
89
+ <output>
90
+ <port id="0" precision="FP16">
91
+ <dim>320</dim>
92
+ <dim>768</dim>
93
+ </port>
94
+ </output>
95
+ </layer>
96
+ <layer id="7" name="Transpose_40471" type="Convert" version="opset1">
97
+ <data destination_type="f32" />
98
+ <rt_info>
99
+ <attribute name="decompression" version="0" />
100
+ </rt_info>
101
+ <input>
102
+ <port id="0" precision="FP16">
103
+ <dim>320</dim>
104
+ <dim>768</dim>
105
+ </port>
106
+ </input>
107
+ <output>
108
+ <port id="1" precision="FP32">
109
+ <dim>320</dim>
110
+ <dim>768</dim>
111
+ </port>
112
+ </output>
113
+ </layer>
+ <layer id="8" name="/enc/MatMul" type="MatMul" version="opset1">
+ <data transpose_a="true" transpose_b="true" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>768</dim>
+ <dim>1</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>320</dim>
+ <dim>768</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/enc/MatMul_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="9" name="/enc/Add" type="Add" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/enc/Add_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="10" name="/Constant" type="Const" version="opset1">
+ <data element_type="i64" shape="1" offset="494210" size="8" />
+ <output>
+ <port id="0" precision="I64" names="/Constant_output_0">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="11" name="/Unsqueeze" type="Unsqueeze" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/Unsqueeze_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="12" name="Constant_40625_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1, 1, 320" offset="494218" size="640" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="13" name="Constant_40625" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="14" name="Transpose_40478_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="320, 320" offset="494858" size="204800" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>320</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="15" name="Transpose_40478" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>320</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>320</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="16" name="/pred/MatMul" type="MatMul" version="opset1">
+ <data transpose_a="true" transpose_b="true" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>320</dim>
+ <dim>1</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>320</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/pred/MatMul_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="17" name="/pred/Add" type="Add" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/pred/Add_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="18" name="/Constant_1" type="Const" version="opset1">
+ <data element_type="i64" shape="1" offset="699658" size="8" />
+ <output>
+ <port id="0" precision="I64" names="/Constant_1_output_0">
+ <dim>1</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="19" name="/Unsqueeze_1" type="Unsqueeze" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="I64">
+ <dim>1</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/Unsqueeze_1_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="20" name="/Add" type="Add" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/Add_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="21" name="/joint_net/joint_net.0/Relu" type="ReLU" version="opset1">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32" names="/joint_net/joint_net.0/Relu_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="22" name="Transpose_40487_compressed" type="Const" version="opset1">
+ <data element_type="f16" shape="1025, 320" offset="699666" size="656000" />
+ <output>
+ <port id="0" precision="FP16">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="23" name="Transpose_40487" type="Convert" version="opset1">
+ <data destination_type="f32" />
+ <rt_info>
+ <attribute name="decompression" version="0" />
+ </rt_info>
+ <input>
+ <port id="0" precision="FP16">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="24" name="/joint_net/joint_net.1/MatMul" type="MatMul" version="opset1">
+ <data transpose_a="false" transpose_b="true" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>320</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1025</dim>
+ <dim>320</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/joint_net/joint_net.1/MatMul_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="25" name="/joint_net/joint_net.1/Add" type="Add" version="opset1">
+ <data auto_broadcast="numpy" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ <port id="1" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </input>
+ <output>
+ <port id="2" precision="FP32" names="/joint_net/joint_net.1/Add_output_0">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="26" name="joint" type="LogSoftmax" version="opset5">
+ <data axis="-1" />
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </input>
+ <output>
+ <port id="1" precision="FP32" names="joint">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </output>
+ </layer>
+ <layer id="27" name="joint/sink_port_0" type="Result" version="opset1" output_names="joint">
+ <input>
+ <port id="0" precision="FP32">
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1</dim>
+ <dim>1025</dim>
+ </port>
+ </input>
+ </layer>
+ </layers>
+ <edges>
+ <edge from-layer="0" from-port="0" to-layer="16" to-port="0" />
+ <edge from-layer="1" from-port="0" to-layer="8" to-port="0" />
+ <edge from-layer="2" from-port="0" to-layer="3" to-port="0" />
+ <edge from-layer="3" from-port="1" to-layer="25" to-port="0" />
+ <edge from-layer="4" from-port="0" to-layer="5" to-port="0" />
+ <edge from-layer="5" from-port="1" to-layer="9" to-port="0" />
+ <edge from-layer="6" from-port="0" to-layer="7" to-port="0" />
+ <edge from-layer="7" from-port="1" to-layer="8" to-port="1" />
+ <edge from-layer="8" from-port="2" to-layer="9" to-port="1" />
+ <edge from-layer="9" from-port="2" to-layer="11" to-port="0" />
+ <edge from-layer="10" from-port="0" to-layer="11" to-port="1" />
+ <edge from-layer="11" from-port="2" to-layer="20" to-port="0" />
+ <edge from-layer="12" from-port="0" to-layer="13" to-port="0" />
+ <edge from-layer="13" from-port="1" to-layer="17" to-port="0" />
+ <edge from-layer="14" from-port="0" to-layer="15" to-port="0" />
+ <edge from-layer="15" from-port="1" to-layer="16" to-port="1" />
+ <edge from-layer="16" from-port="2" to-layer="17" to-port="1" />
+ <edge from-layer="17" from-port="2" to-layer="19" to-port="0" />
+ <edge from-layer="18" from-port="0" to-layer="19" to-port="1" />
+ <edge from-layer="19" from-port="2" to-layer="20" to-port="1" />
+ <edge from-layer="20" from-port="2" to-layer="21" to-port="0" />
+ <edge from-layer="21" from-port="1" to-layer="24" to-port="0" />
+ <edge from-layer="22" from-port="0" to-layer="23" to-port="0" />
+ <edge from-layer="23" from-port="1" to-layer="24" to-port="1" />
+ <edge from-layer="24" from-port="2" to-layer="25" to-port="1" />
+ <edge from-layer="25" from-port="2" to-layer="26" to-port="0" />
+ <edge from-layer="26" from-port="1" to-layer="27" to-port="0" />
+ </edges>
+ <rt_info>
+ <Runtime_version value="2025.4.1-20426-82bbf0292c5-releases/2025/4" />
+ <conversion_parameters>
+ <input_model value="DIR\v3_e2e_rnnt_joint.onnx" />
+ <is_python_object value="False" />
+ </conversion_parameters>
+ </rt_info>
+ </net>
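
Read end to end, the joint graph in `v3_e2e_rnnt_joint.xml` reduces to two linear projections (768 to 320 for the encoder frame, 320 to 320 for the prediction-network state), an element-wise sum, a ReLU, a 320 to 1025 output projection, and a log-softmax over the last axis. The sketch below restates that dataflow in plain Python. It is an illustration under stated assumptions, not the runtime: the weights are random stand-ins for the FP16 constants stored in the `.bin` file, the singleton batch and time dimensions are dropped, and the names `joint_step`, `random_params`, `enc_frame`, and `pred_state` are hypothetical.

```python
import math
import random

# Dimensions taken from the <dim> entries in the IR above.
ENC_DIM, PRED_DIM, HIDDEN, VOCAB = 768, 320, 320, 1025


def linear(x, weight, bias):
    """y = W x + b, with W stored as (out_features, in_features) like the IR constants."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def log_softmax(logits):
    """Numerically stable log-softmax over a flat list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return [v - log_norm for v in logits]


def joint_step(enc_frame, pred_state, params):
    """One joint evaluation: /enc and /pred projections, Add, ReLU, joint_net.1, LogSoftmax."""
    enc_proj = linear(enc_frame, params["enc_w"], params["enc_b"])
    pred_proj = linear(pred_state, params["pred_w"], params["pred_b"])
    hidden = [max(0.0, e + p) for e, p in zip(enc_proj, pred_proj)]
    return log_softmax(linear(hidden, params["out_w"], params["out_b"]))


def random_params(seed=0):
    """Random stand-ins (assumption) for the weights that live in the .bin file."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.05) for _ in range(cols)] for _ in range(rows)]

    def vector(size):
        return [rng.gauss(0.0, 0.05) for _ in range(size)]

    return {
        "enc_w": matrix(HIDDEN, ENC_DIM), "enc_b": vector(HIDDEN),
        "pred_w": matrix(HIDDEN, PRED_DIM), "pred_b": vector(HIDDEN),
        "out_w": matrix(VOCAB, HIDDEN), "out_b": vector(VOCAB),
    }


if __name__ == "__main__":
    log_probs = joint_step([0.1] * ENC_DIM, [0.2] * PRED_DIM, random_params())
    print(len(log_probs))
```

In a greedy RNN-T decoding loop, the index of the largest of the 1025 log-probabilities would select the next symbol. The 1025 outputs presumably cover the 1024 SentencePiece subwords plus an RNN-T blank, but the graph itself does not say which index is the blank.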
voicescribe-model-layout.json ADDED
@@ -0,0 +1,28 @@
+ {
+ "schema_version": 1,
+ "generated_at": "2026-04-23T22:39:28Z",
+ "layout_key": "gigaam",
+ "target_dir": "gigaam-v3-e2e-rnnt-ov",
+ "upstream_repo": "Andrewsab/gigaam-v3-e2e-rnnt-ov",
+ "upstream_revision": "dff16933a6407dbec85df5f3af8ed8d8a14d9e01",
+ "upstream_sha": "dff16933a6407dbec85df5f3af8ed8d8a14d9e01",
+ "description": "GigaAM v3 e2e RNN-T Intel Arc/CPU OpenVINO layout.",
+ "required_files": [
+ "config.json",
+ "tokenizer.model",
+ "v3_e2e_rnnt_encoder.xml",
+ "v3_e2e_rnnt_encoder.bin",
+ "v3_e2e_rnnt_decoder.xml",
+ "v3_e2e_rnnt_decoder.bin",
+ "v3_e2e_rnnt_joint.xml",
+ "v3_e2e_rnnt_joint.bin"
+ ],
+ "allow_patterns": [],
+ "license_metadata": {
+ "license": "mit",
+ "license_tags": [
+ "license:mit"
+ ],
+ "license_files": []
+ }
+ }
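
The `required_files` list lends itself to a completeness check on a downloaded copy of the model directory. How Voice Scribe actually consumes this file is not shown here, so the following is only a minimal sketch: the helper name `missing_required_files` is hypothetical, and it assumes the layout JSON has been saved to disk with the structure shown above.

```python
import json
from pathlib import Path


def missing_required_files(layout_path, model_dir):
    """Return the entries of `required_files` that are absent from model_dir."""
    layout = json.loads(Path(layout_path).read_text(encoding="utf-8"))
    root = Path(model_dir)
    # Preserve the order of the layout file so reports are stable.
    return [name for name in layout["required_files"] if not (root / name).is_file()]


if __name__ == "__main__":
    model_dir = Path("gigaam-v3-e2e-rnnt-ov")  # the layout's target_dir
    layout_file = model_dir / "voicescribe-model-layout.json"
    if layout_file.is_file():
        missing = missing_required_files(layout_file, model_dir)
        print("missing:", missing if missing else "none")
```

An empty result means every file named in the layout is present. The check does not verify contents or sizes, so a truncated `.bin` would still pass.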