Releases: idiap/coqui-ai-TTS
v0.27.3
What's Changed
Features
- Return sentence level timestamps for multi-sentence input by @santosh-r24 in #520 (documentation)
Fixes
- Support Pytorch 2.9 by @eginhard in #525
- fix(xtts): update streaming code to transformers>=4.57 by @eginhard in #505
- fix(xtts): add over limit text display by @zkhan122 in #495
- Handle romanization for Fairseq models by @eginhard in #506
- docs(formatters): show expected metadata format, improve error handling by @eginhard in #511 (documentation)
- fix(bark): ensure tensors are on same device when not using reference audio by @eginhard in #512
- fix(audio): create parent folders, improve error message by @eginhard in #513
- fix(docker): correct torch backend selection by @eginhard in #524
- Replace jieba with spacy_pkuseg for Chinese word segmentation by @langfod in #504
New Contributors
- @langfod made their first contribution in #504
- @zkhan122 made their first contribution in #495
- @santosh-r24 made their first contribution in #520
Full Changelog: v0.27.2...v0.27.3
v0.27.2
What's Changed
Fixes
- docs: fix the Double Decoder Consistency link which was not working by @Olexandr88 in #500
New Contributors
- @Olexandr88 made their first contribution in #500
Full Changelog: v0.27.1...v0.27.2
v0.27.1
What's Changed
Fixes
- fix(recipes): wrap scripts with main() functions by @eginhard in #462
- fix(tortoise): remove low vram mode by @eginhard in #464
- fix(bark): allow loading npz voices from the original model by @eginhard in #461
- build: restrict to torch/torchaudio<2.9 by @eginhard in #488
Full Changelog: v0.27.0...v0.27.1
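The torch/torchaudio<2.9 restriction in this release can be checked against a local environment before upgrading. The following is only a sketch: the helper names release_tuple, is_below and check_installed are hypothetical, the comparison looks only at the numeric release part of a version string (local and pre-release suffixes are ignored), and it is not how the package itself enforces the constraint.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release segment, e.g. '2.8.0+cu121' -> (2, 8, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_below(version: str, upper_bound: str) -> bool:
    """Return True if `version` is strictly below `upper_bound` (release segment only)."""
    return release_tuple(version) < release_tuple(upper_bound)


def check_installed(package: str, upper_bound: str) -> str:
    """Describe whether the installed `package` satisfies `<upper_bound`."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    status = "ok" if is_below(installed, upper_bound) else "too new"
    return f"{package} {installed}: {status} for <{upper_bound}"


if __name__ == "__main__":
    # Bound taken from the changelog entry above.
    for name in ("torch", "torchaudio"):
        print(check_installed(name, "2.9"))
```

For real dependency resolution, rely on the installer (pip, uv) and the package metadata rather than a hand-written comparison like this one.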
v0.27.0
What's Changed
Features
- Speaker caching for cloned voices by @eginhard in #438
  - For usage details see https://coqui-tts.readthedocs.io/en/latest/cloning.html
  - ⚠️ The old caching mechanism of Bark and Tortoise has been removed, switch to the new one instead.
- Provide synthesize() method with a common interface for every TTS model by @eginhard in #453
  - ⚠️ Deprecate speaker_id argument of synthesize(), use speaker instead.
  - ⚠️ Deprecate config argument of synthesize(), it can safely be left out.
- Add OpenAI-compatible endpoint to the server by @teddybear082 in #421
  - See https://coqui-tts.readthedocs.io/en/latest/server.html#openai-compatible-endpoint
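As a rough illustration of the OpenAI-compatible endpoint listed above, the sketch below builds a request in the shape of OpenAI's speech API (a JSON body with model, input and voice). Several things here are assumptions rather than facts from these release notes: the server address http://localhost:5002, the path /v1/audio/speech, and the placeholder model and voice values. Check the linked server documentation for the values that apply to your setup. The code only constructs the request and does not send it.

```python
import json
from urllib import request

# Assumed values: adjust to the server documentation and your own setup.
BASE_URL = "http://localhost:5002"   # assumption: address of a locally started server
SPEECH_PATH = "/v1/audio/speech"     # assumption: same path as OpenAI's speech API


def build_speech_request(text: str, voice: str, model: str = "tts-1") -> request.Request:
    """Build (but do not send) a POST request in the OpenAI speech format."""
    payload = {"model": model, "input": text, "voice": voice}
    return request.Request(
        BASE_URL + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "my_speaker" is a placeholder voice name, not one shipped with the project.
req = build_speech_request("Hello world!", voice="my_speaker")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running server (the audio format depends on the server):
# with request.urlopen(req) as resp, open("speech_output", "wb") as out:
#     out.write(resp.read())
```

Because the request follows the OpenAI format, existing OpenAI client code can in principle be pointed at the server by changing its base URL.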
Fixes
- Update coqui-tts-trainer to 0.3.0 to fix numerous training-related bugs by @eginhard in #423
  - For the full list of fixes see https://github.com/idiap/coqui-ai-Trainer/releases/tag/v0.3.0
- fix(configs): add default padding character by @eginhard in #425
- Fix KeyError when speaker_id is empty string in TTS server by @mehulanshumali in #436
- Fix: Update xtts finetuning Colab to support Gradio 5 by @eulphean in #408
- Bring back compatibility with numpy1 by @MarwanMashra in #413
- fix: update XTTS/Tortoise GPT code for HF transformers 4.52+ by @eginhard in #414
- build: lower minimum pytorch version back to 2.1 by @eginhard in #432
- refactor(phonemizer): replace mecab-python3 with fugashi for japanese by @eginhard in #417
- docs: add Docker Compose config by @KishoOoOoOo in #411
New Contributors
- @KishoOoOoOo made their first contribution in #411
- @eulphean made their first contribution in #408
- @MarwanMashra made their first contribution in #413
- @teddybear082 made their first contribution in #421
- @mehulanshumali made their first contribution in #436
Full Changelog: v0.26.2...v0.27.0
v0.26.2
What's Changed
Fixes
- fix(xtts): restrict transformers to <4.52 to avoid corrupted output by @eginhard in #396
- fix(xtts): fix Colab demo package installation by @eginhard in #396
- fix(xtts): provide more helpful error message when reference audio is too short by @eginhard in #396
- fix: don't convert to int to avoid constant value in onnx exports by @eginhard in #390
Full Changelog: v0.26.1...v0.26.2
v0.26.1
What's Changed
Fixes
- Switch to Numpy>=2, Pytorch>=2.3 by @fabiocat93 in #346
- fix(xtts): update colab finetuning notebook by @eginhard in #377
- fix(forward_tts): ensure tensor 'g' is on the same device as 'x' by @btseee in #378
- Remove Spacy dependency by @eginhard in #383
New Contributors
- @fabiocat93 made their first contribution in #346
- @btseee made their first contribution in #378
- @Sleuth56 made their first contribution in #351
Full Changelog: v0.26.0...v0.26.1
v0.26.0
What's Changed
Features
- Added speaker_wav parameter to the server by @shavit in #295
- feat(api): support setting speed by @eginhard in #316
- Added new persian-tts-female-vits model by @DrewThomasson in #332
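The speaker_wav server parameter from the list above might be used roughly as follows. This sketch only builds a request URL and makes several assumptions that are not confirmed by these release notes: that the server listens on http://localhost:5002, that synthesis is exposed at /api/tts, and that speaker_wav and language_id are accepted as query parameters. The function name tts_url is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

SERVER = "http://localhost:5002"  # assumption: address of a locally started server
TTS_PATH = "/api/tts"             # assumption: synthesis endpoint of the server


def tts_url(text: str, speaker_wav: Optional[str] = None,
            language_id: Optional[str] = None) -> str:
    """Build a synthesis URL, including only the parameters that were given."""
    params = {"text": text}
    if speaker_wav:
        # Reference audio for voice cloning (parameter name from the changelog entry).
        params["speaker_wav"] = speaker_wav
    if language_id:
        params["language_id"] = language_id  # assumption: query parameter name
    return f"{SERVER}{TTS_PATH}?{urlencode(params)}"


print(tts_url("Hello world!", speaker_wav="my/cloning/audio.wav", language_id="en"))
```

urlencode takes care of escaping spaces and path separators, so file paths and free text can be passed unchanged.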
Fixes
- docs: clean up server README by @junland in #272
- fix: notify users when wrong coqpit package is installed by @eginhard in #294
- Refactor for compatibility with transformers>=4.47 by @JohnnyStreet and @eginhard in #319
- Support use of --continue_path to resume XTTS training by @eginhard in #270
- Drop Python 3.9 support by @eginhard in #255
Dev
- Switch remaining CLI tests to Python, separate integration tests by @eginhard in #276
- Added paths-ignore in workflows by @DrewThomasson in #334
New Contributors
- @junland made their first contribution in #272
- @DrewThomasson made their first contribution in #332
Full Changelog: v0.25.3...v0.26.0
v0.25.3
v0.25.2
What's Changed
Features
- Add kNN-VC model by @eginhard in #256
- Support all Coqui TTS models in the server by @eginhard in #252
- Allow both Path and strings where possible and add type hints by @eginhard in #210
- feat(manager): print download location when listing models by @eginhard in #213
Fixes
- fix(bark): handle broken paths in config by @eginhard in #253
- fix(openvoice): correctly set utterance length by @eginhard in #260
- fix(bin): log to stdout in cli tools by @eginhard in #217
- fix(vc): support both cpu and cuda by @eginhard in #244
- fix(xtts): voice_dir should remain None if not specified by @eginhard in #224
- Fix num2words call using non-standard lang code by @SkaceKamen in #237
- chore: remove unused callback code by @eginhard in #229
- fix: convert >35 digit English numbers digit-by-digit by @lostways in #240
- Change old docker image url to the one that is relevant to this repo in README.md by @DelovoiDC in #243
- test: switch from nose2 to pytest by @eginhard in #208
- Update plot_embeddings_umap notebook by @eginhard in #221
- Improve documentation by @eginhard in #207
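The fix for English numbers with more than 35 digits (#240) can be pictured with a small sketch: numbers up to the threshold go through a regular number-to-words converter, longer ones are read out digit by digit. This is not the project's implementation; the names spell_digits and expand_number are hypothetical, and the to_words argument stands in for whatever converter is actually used.

```python
from typing import Callable

DIGIT_WORDS = ("zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine")
MAX_DIGITS = 35  # threshold taken from the changelog entry


def spell_digits(number: str) -> str:
    """Read a string of ASCII digits one digit at a time."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"expected only digits 0-9, got {number!r}")
    return " ".join(DIGIT_WORDS[int(ch)] for ch in number)


def expand_number(number: str, to_words: Callable[[str], str]) -> str:
    """Use `to_words` for ordinary numbers and digit-by-digit reading for very long ones."""
    if len(number) > MAX_DIGITS:
        return spell_digits(number)
    return to_words(number)


# A stand-in converter for the demo; a real one would produce "forty-two".
print(expand_number("42", to_words=lambda n: f"<words for {n}>"))
print(expand_number("1234567890" * 4, to_words=lambda n: f"<words for {n}>"))
```

Passing the converter in as an argument keeps the length check independent of any particular number-to-words library.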
New Contributors
- @SkaceKamen made their first contribution in #237
- @lostways made their first contribution in #240
- @DelovoiDC made their first contribution in #243
Full Changelog: v0.25.1...v0.25.2
WavLM-HiFiGAN vocoders from kNN-VC
- HiFiGAN vocoders for WavLM features trained on LibriSpeech100 from https://github.com/bshall/knn-vc (MIT license)