---

license: apache-2.0
language:
  - ru
tags:
  - automatic-speech-recognition
  - speaker-diarization
  - onnx
  - russian
  - asr
  - gigaam
  - 3d-speaker
  - camplus
  - eres2net
  - mobile
  - offline
library_name: onnx
---


# ProtocolVoice ASR Models

ONNX models for offline Russian speech recognition and speaker diarization,
packaged for the [ProtocolVoice](https://github.com/protocolvoice) Android app.

## Contents

| File | Size | Purpose | Original source | Original license |
|---|---|---|---|---|
| `gigaam_v3_e2e_ctc_int8.onnx` | 305 MB | Russian ASR with built-in punctuation | [Sber/SaluteDevices GigaAM](https://github.com/salute-developers/GigaAM) (v3, e2e CTC, int8-quantized) | MIT |
| `speaker_embedding_camplus.onnx` | 27 MB | Speaker embedding (CAM++) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `speaker_embedding.onnx` | 111 MB | Speaker embedding (ERes2Net) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `speaker_embedding_v2.onnx` | 68 MB | Speaker embedding (ERes2NetV2) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
| `manifest.json` | < 1 KB | SHA-256 hashes of all models | this repo | Apache-2.0 |

## Important

These are NOT new models — this repository **redistributes existing models** in ONNX
format for convenient mobile delivery. The original authors retain all credit and
copyright. We did not train, fine-tune, or modify the model weights.

**Please cite the original projects, not this redistribution:**

- **GigaAM-v3** (ASR): Sber AI, SaluteDevices —
  https://github.com/salute-developers/GigaAM
- **3D-Speaker** (CAM++, ERes2Net, ERes2NetV2): ModelScope, Alibaba —
  https://github.com/modelscope/3D-Speaker

The ONNX conversions were prepared with [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx),
which also provides the inference runtime (Apache-2.0).

## Why this redistribution

The ProtocolVoice mobile app needs to download these models on first run from a
mirror that:
- supports files larger than 100 MB without git-lfs limits,
- has a fast CDN reachable from Russia,
- is the conventional hosting platform for ML models.

All redistributed files retain their original licenses. This README serves as
the required attribution under those licenses.

## How to use

Each model is loaded by [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) on
the device. The ProtocolVoice app:

1. Downloads each `.onnx` file over HTTPS from
   `https://huggingface.co/protocolvoice/asr-models/resolve/main/{filename}`,
2. Verifies SHA-256 against `manifest.json`,
3. Loads via sherpa-onnx for offline inference.
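
The first two steps can be sketched in Python as follows. This is an illustration only, not the app's actual code (the app runs on Android). The helper names `model_url` and `download_and_verify` are hypothetical, and the expected digest is passed in by the caller rather than read from `manifest.json`, whose layout is not described in this README.

```python
import hashlib
import os
import urllib.request

# URL prefix taken from step 1 above.
BASE_URL = "https://huggingface.co/protocolvoice/asr-models/resolve/main"


def model_url(filename: str) -> str:
    """Build the download URL for one file in this repository."""
    return f"{BASE_URL}/{filename}"


def download_and_verify(url: str, dest: str, expected_sha256: str,
                        chunk_size: int = 1 << 20) -> None:
    """Stream `url` to disk, hashing on the fly; keep the file only if the hash matches.

    Hypothetical helper: the data goes to `dest + ".part"` first and is renamed
    to `dest` only after the SHA-256 check passes.
    """
    digest = hashlib.sha256()
    tmp = dest + ".part"
    with urllib.request.urlopen(url) as resp, open(tmp, "wb") as out:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            out.write(chunk)
    if digest.hexdigest() != expected_sha256.lower():
        os.remove(tmp)  # never leave an unverified model on disk
        raise ValueError(f"SHA-256 mismatch for {url}")
    os.replace(tmp, dest)  # rename only after verification succeeded
```

Hashing while streaming avoids holding a 300 MB model in memory, and the rename-after-verify step means a partially downloaded or corrupted file never appears under its final name.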

You can also use these files directly with sherpa-onnx in any project that
respects the original licenses.

## Verifying integrity

```python
import hashlib

with open("gigaam_v3_e2e_ctc_int8.onnx", "rb") as f:
    print(hashlib.sha256(f.read()).hexdigest())
# expected: 0aacb41f70f0f5aaac4b45dd430337b9e16b180f22c72af04db8516e7609c3c0
```

Hashes for all files are in `manifest.json`.
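
To check every downloaded file in one pass, something like the sketch below may help. It assumes that `manifest.json` is a flat JSON object mapping each filename to its hex SHA-256 digest; that layout is a guess, so adjust the parsing if the real file differs. The function names `sha256_of_file` and `check_against_manifest` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large models are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_against_manifest(model_dir, manifest_name="manifest.json"):
    """Return {filename: bool}; a missing file or a wrong hash gives False."""
    model_dir = Path(model_dir)
    # ASSUMPTION: the manifest is a flat {"<filename>": "<hex sha256>"} object.
    manifest = json.loads((model_dir / manifest_name).read_text(encoding="utf-8"))
    results = {}
    for name, expected in manifest.items():
        path = model_dir / name
        results[name] = path.is_file() and sha256_of_file(path) == expected.lower()
    return results
```

Calling `check_against_manifest(".")` in the directory that holds the models and `manifest.json` returns a dictionary that can be scanned for any `False` entries.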

## License

This repository's metadata, README, and packaging scripts are released under
**Apache-2.0**. Each model file remains under its original license (see the
table above). By using a model, you accept its original license — not just
this repository's.

## Removal request

If you are an author of one of the upstream projects and have any concerns
about this redistribution (attribution, hosting, anything else), please open
a discussion on this Hugging Face repo or email the maintainers — the files
will be amended or removed as requested.