	---
	license: apache-2.0
	language:
	- ru
	tags:
	- automatic-speech-recognition
	- speaker-diarization
	- onnx
	- russian
	- asr
	- gigaam
	- 3d-speaker
	- camplus
	- eres2net
	- mobile
	- offline
	library_name: onnx
	---

	# ProtocolVoice ASR Models

	ONNX models for offline Russian speech recognition and speaker diarization,
	packaged for the [ProtocolVoice](https://github.com/protocolvoice) Android app.

	## Contents

	| File | Size | Purpose | Original source | Original license |
	|---|---|---|---|---|
	| `gigaam_v3_e2e_ctc_int8.onnx` | 305 MB | Russian ASR with built-in punctuation | [Sber/SaluteDevices GigaAM](https://github.com/salute-developers/GigaAM) (v3, e2e CTC, int8-quantized) | MIT |
	| `speaker_embedding_camplus.onnx` | 27 MB | Speaker embedding (CAM++) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
	| `speaker_embedding.onnx` | 111 MB | Speaker embedding (ERes2Net) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
	| `speaker_embedding_v2.onnx` | 68 MB | Speaker embedding (ERes2NetV2) | [modelscope/3D-Speaker](https://github.com/modelscope/3D-Speaker) | Apache-2.0 |
	| `manifest.json` | < 1 KB | SHA-256 hashes of all models | this repo | Apache-2.0 |

	## Important

	These are NOT new models — this repository redistributes existing models in ONNX
	format for convenient mobile delivery. The original authors retain all credit and
	copyright. We did not train, fine-tune, or modify the model weights.

	Please cite the original projects, not this redistribution:

	- GigaAM-v3 (ASR): Sber AI, SaluteDevices —
	https://github.com/salute-developers/GigaAM
	- 3D-Speaker (CAM++, ERes2Net, ERes2NetV2): ModelScope, Alibaba —
	https://github.com/modelscope/3D-Speaker

	The ONNX conversions were prepared with [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)
	(Apache-2.0), which also serves as the inference runtime.

	## Why this redistribution

	The ProtocolVoice mobile app needs to download these models on first run from a
	mirror that:
	- supports files larger than 100 MB without git-lfs limits,
	- has a fast CDN that is reachable from Russia,
	- is the conventional hosting platform for ML models.

	All redistributed files retain their original licenses. This README serves as
	the required attribution under those licenses.

	## How to use

	Each model is loaded by [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) on
	the device. The ProtocolVoice app:

	1. Downloads each `.onnx` file over HTTPS from
	`https://huggingface.co/protocolvoice/asr-models/resolve/main/{filename}`,
	2. Verifies SHA-256 against `manifest.json`,
	3. Loads via sherpa-onnx for offline inference.

	You can also use these files directly with sherpa-onnx in any project that
	respects the original licenses.
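
The download step above can be sketched in Python roughly as follows. This is a minimal illustration, not the app's actual code: the helper names `model_url` and `download_model`, the `.part` temporary suffix, and the injectable `opener` parameter are all assumptions made for the example. Only the URL pattern comes from this README.

```python
import shutil
import urllib.request
from pathlib import Path

# URL pattern taken from the steps above; everything else is hypothetical.
BASE_URL = "https://huggingface.co/protocolvoice/asr-models/resolve/main"


def model_url(filename: str) -> str:
    """Build the download URL for one file in this repository."""
    return f"{BASE_URL}/{filename}"


def download_model(filename: str, dest_dir: Path,
                   opener=urllib.request.urlopen) -> Path:
    """Stream one file to dest_dir and return its final path.

    The data is written to a temporary '.part' file first and renamed
    only after the transfer finishes, so an interrupted download never
    leaves a truncated file under the final name.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / filename
    partial = target.with_name(target.name + ".part")
    with opener(model_url(filename)) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out, 1 << 20)  # copy in 1 MiB chunks
    partial.replace(target)
    return target
```

Calling `download_model("manifest.json", Path("models"))` would fetch the manifest into a local `models` directory. The downloaded files should still be checked against their SHA-256 hashes before they are loaded.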

	## Verifying integrity

	```python
	import hashlib

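	# Hash in 1 MiB chunks so the 305 MB model is never held in memory at once.
	digest = hashlib.sha256()
	with open("gigaam_v3_e2e_ctc_int8.onnx", "rb") as f:
	    for chunk in iter(lambda: f.read(1 << 20), b""):
	        digest.update(chunk)
	print(digest.hexdigest())
	# expected: 0aacb41f70f0f5aaac4b45dd430337b9e16b180f22c72af04db8516e7609c3c0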
	```

	Hashes for all files are in `manifest.json`.
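
To check every file in one pass, a sketch along these lines may help. It assumes that `manifest.json` is a flat JSON object mapping each filename to its hex SHA-256 digest. The actual layout of the manifest is not described in this README, so adjust the parsing if it differs. The function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_models(model_dir: Path,
                  manifest_name: str = "manifest.json") -> dict[str, bool]:
    """Compare each file listed in the manifest with its expected hash.

    ASSUMPTION: the manifest looks like {"<filename>": "<hex sha256>", ...}.
    A missing file is reported as False rather than raising an error.
    """
    manifest_text = (model_dir / manifest_name).read_text(encoding="utf-8")
    manifest = json.loads(manifest_text)
    results = {}
    for filename, expected in manifest.items():
        path = model_dir / filename
        results[filename] = (
            path.is_file() and sha256_of_file(path) == expected.lower()
        )
    return results
```

`verify_models(Path("models"))` returns a dictionary such as `{"speaker_embedding.onnx": True, ...}`. Any `False` entry indicates a missing or corrupted file that should be downloaded again.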

	## License

	This repository's metadata, README, and packaging scripts are released under
	Apache-2.0. Each model file remains under its original license (see the
	table above). By using a model, you accept its original license — not just
	this repository's.

	## Removal request

	If you are an author of one of the upstream projects and have any concerns
	about this redistribution (attribution, hosting, anything else), please open
	a discussion on this Hugging Face repo or email the maintainers — the files
	will be amended or removed as requested.