Commit 2341cf2 (verified), committed by Conwerter
1 parent: fa6242a

Add full attribution README for upstream models (GigaAM, 3D-Speaker)

Files changed (1): README.md +96 -0
README.md CHANGED
@@ -1,3 +1,99 @@
  ---
  license: apache-2.0
+ language:
+ - ru
+ tags:
+ - automatic-speech-recognition
+ - speaker-diarization
+ - onnx
+ - russian
+ - asr
+ - gigaam
+ - 3d-speaker
+ - camplus
+ - eres2net
+ - mobile
+ - offline
+ library_name: onnx
  ---
+
+ # ProtocolVoice ASR Models
+
+ ONNX models for offline Russian speech recognition and speaker diarization,
+ packaged for the [ProtocolVoice](https://github.com/protocolvoice) Android app.
+
+ ## Contents
+
+ | File | Size | Purpose | Original source | Original license |
+ |---|---|---|---|---|
+ | `gigaam_v3_e2e_ctc_int8.onnx` | 305 MB | Russian ASR with built-in punctuation | [Sber/SaluteDevices GigaAM](https://github.com/salute-developers/GigaAM) (v3, e2e CTC, int8-quantized) | MIT |
+ | `speaker_embedding_camplus.onnx` | 27 MB | Speaker embedding (CAM++) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
+ | `speaker_embedding.onnx` | 111 MB | Speaker embedding (ERes2Net) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
+ | `speaker_embedding_v2.onnx` | 68 MB | Speaker embedding (ERes2NetV2) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
+ | `manifest.json` | < 1 KB | SHA-256 hashes of all models | this repo | Apache-2.0 |
+
+ ## Important
+
+ These are NOT new models. This repository **redistributes existing models** in ONNX
+ format for convenient mobile delivery. The original authors retain all credit and
+ copyright. We did not train, fine-tune, or modify the model weights.
+
+ **Please cite the original projects, not this redistribution:**
+
+ - **GigaAM-v3** (ASR), by Sber AI, SaluteDevices:
+   https://github.com/salute-developers/GigaAM
+ - **3D-Speaker** (CAM++, ERes2Net, ERes2NetV2), by ModelScope, Alibaba:
+   https://github.com/modelscope/3D-Speaker
+
+ The ONNX conversions and the inference runtime come from [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)
+ (Apache-2.0).
+
+ ## Why this redistribution
+
+ The ProtocolVoice mobile app needs to download these models on first run from a
+ mirror that:
+ - supports files larger than 100 MB without git-lfs limits,
+ - has a fast CDN reachable from Russia,
+ - is the conventional hosting platform for ML models.
+
+ All redistributed files retain their original licenses. This README serves as
+ the required attribution under those licenses.
+
+ ## How to use
+
+ Each model is loaded by [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) on
+ the device. The ProtocolVoice app:
+
+ 1. Downloads each `.onnx` file over HTTPS from
+    `https://huggingface.co/protocolvoice/asr-models/resolve/main/{filename}`,
+ 2. Verifies its SHA-256 hash against `manifest.json`,
+ 3. Loads it via sherpa-onnx for offline inference.
+
+ You can also use these files directly with sherpa-onnx in any project that
+ respects the original licenses.
75
+ ## Verifying integrity
76
+
77
+ ```python
78
+ import hashlib
79
+
80
+ with open("gigaam_v3_e2e_ctc_int8.onnx", "rb") as f:
81
+ print(hashlib.sha256(f.read()).hexdigest())
82
+ # expected: 0aacb41f70f0f5aaac4b45dd430337b9e16b180f22c72af04db8516e7609c3c0
83
+ ```
84
+
85
+ Hashes for all files are in `manifest.json`.
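To check every file at once, the comparison against `manifest.json` could look roughly like the sketch below. The README does not show the manifest's layout, so this assumes a flat JSON object that maps each filename to its hex SHA-256 digest; adjust the parsing if the real file is structured differently. The helper names `sha256_of` and `verify_against_manifest` are hypothetical. Files are hashed in 1 MiB chunks so that the 305 MB ASR model never has to sit fully in memory.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_manifest(model_dir, manifest_name="manifest.json"):
    """Return {filename: True/False} for every entry in the manifest.

    ASSUMPTION: the manifest is a flat JSON object of the form
    {"<filename>": "<sha256 hex digest>"}. The real layout is not
    documented in the README and may differ.
    """
    model_dir = Path(model_dir)
    expected = json.loads((model_dir / manifest_name).read_text(encoding="utf-8"))
    results = {}
    for name, want in expected.items():
        path = model_dir / name
        # A missing file counts as a failed check rather than raising.
        results[name] = path.is_file() and sha256_of(path) == want.lower()
    return results
```

Any entry that maps to `False` indicates a missing or corrupted file that should be downloaded again before loading.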
+
+ ## License
+
+ This repository's metadata, README, and packaging scripts are released under
+ **Apache-2.0**. Each model file remains under its original license (see the
+ table above). By using a model, you accept its original license, not just
+ this repository's.
+
+ ## Removal request
+
+ If you are an author of one of the upstream projects and have any concerns
+ about this redistribution (attribution, hosting, anything else), please open
+ a discussion on this Hugging Face repo or email the maintainers. The files
+ will be amended or removed as requested.