Commit 78b7eca by luel (verified) · 1 parent: 9bcd96f

Update README.md

Files changed (1)
  1. README.md +20 -4
README.md CHANGED
@@ -52,22 +52,38 @@ pip install git+https://github.com/SWivid/F5-TTS.git
 Download the checkpoint and run inference:
 
 ```python
+import torch
 from huggingface_hub import hf_hub_download
-from f5_tts.api import F5TTS
+from hydra.utils import get_class
+from omegaconf import OmegaConf
+from f5_tts.infer.utils_infer import infer_process, load_model, load_vocoder, preprocess_ref_audio_text
 
 repo_id = "multilingual-tts/F5-TTS-OpenBible-Urdu"
 ckpt = hf_hub_download(repo_id, "model_last.pt")
 vocab = hf_hub_download(repo_id, "vocab.txt")
 config = hf_hub_download(repo_id, "F5-TTS_OpenBible_Urdu.yaml")
 
-model = F5TTS(ckpt_file=ckpt, vocab_file=vocab, model_cfg=config)
+device = "cuda" if torch.cuda.is_available() else "cpu"
+
+model_cfg = OmegaConf.load(config)
+model_cls = get_class(f"f5_tts.model.{model_cfg.model.backbone}")
+
+vocoder = load_vocoder(vocoder_name="vocos", is_local=False, device=device)
+model = load_model(
+    model_cls, model_cfg.model.arch, ckpt,
+    mel_spec_type="vocos", vocab_file=vocab, use_ema=True, device=device,
+)
 
 # Supply your own clean reference clip — 5–10 s, single speaker and its transcription.
 ref_audio = "/path/to/your-urdu-clip.wav"
 ref_text = "Exact transcription of the clip"
 gen_text = "..." # text to synthesise in Urdu
 
-wav, sr, _ = model.infer(ref_audio=ref_audio, ref_text=ref_text, gen_text=gen_text)
+ref_audio_proc, ref_text_proc = preprocess_ref_audio_text(ref_audio, ref_text)
+wav, sr, _ = infer_process(
+    ref_audio_proc, ref_text_proc, gen_text, model, vocoder,
+    mel_spec_type="vocos", device=device,
+)
 ```
 
 ## Training data
@@ -101,4 +117,4 @@ Evaluated alongside other Open-Bible TTS systems on character/word error rate
 [open-bible-models](https://github.com/davidguzmanr/open-bible-models) repository
 for the evaluation pipeline and the
 [open-bible-surveys](https://github.com/davidguzmanr/open-bible-surveys) repository
-for the human-listening survey methodology.
+for the human-listening survey methodology.
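
The updated snippet ends with `wav` and `sr` in memory and asks for a 5–10 s reference clip, but it does not show how to write the result to disk or check the clip length. The following is a minimal standard-library sketch of both steps and is not part of this commit. The helper names `write_pcm16_wav` and `wav_duration_seconds` are hypothetical, and the sketch assumes that `wav` is a one-dimensional sequence of float samples in roughly [-1.0, 1.0] and that the reference clip is a plain PCM WAV file.

```python
import math
import struct
import wave


def write_pcm16_wav(path, samples, sample_rate):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV."""
    frames = bytearray()
    for x in samples:
        # Clip out-of-range values so they cannot overflow the int16 range.
        clipped = max(-1.0, min(1.0, float(x)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 2 bytes per sample = 16 bit
        f.setframerate(int(sample_rate))
        f.writeframes(bytes(frames))


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as f:
        return f.getnframes() / f.getframerate()


if __name__ == "__main__":
    # Stand-in for model output: one second of a 440 Hz tone at 24 kHz.
    demo_sr = 24000
    demo = [0.5 * math.sin(2 * math.pi * 440 * i / demo_sr) for i in range(demo_sr)]
    write_pcm16_wav("demo.wav", demo, demo_sr)
    print(wav_duration_seconds("demo.wav"))
```

With the README snippet, this would be used as `write_pcm16_wav("out.wav", wav, sr)` after `infer_process`, and `5.0 <= wav_duration_seconds(ref_audio) <= 10.0` before it. The pure-Python loop is slow for long outputs; an audio I/O library would be the more usual choice where one is already installed.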