Supertonic-TTS-2 - Text-to-Speech Benchmark
Prerequisites
- Ubuntu 20.04 (x86_64)
- Git
- Python 3.10
- Picovoice Console Account
Usage
- Clone the repository:
- Install the dependencies:
- Download the picoLLM model
Each benchmark requires a picoLLM model to generate responses from the LLM.
The picoLLM model used in the benchmark is llama-3.2-1b-instruct-385 and can be
downloaded from Picovoice Console.
- Run the benchmark:
Supertonic-TTS-2 source and model commit hashes:
- GitHub:
  - Repo: supertone-inc/supertonic
  - Commit hash: 6fc89ea89eb29defb0ff2230b77c5a519acfe2a9
- Hugging Face model download:
  - Repo: Supertone/supertonic-2
  - Commit hash: 75e6727618a02f323c720cba9478152d4bc16ca4
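To reproduce the benchmark against the same revisions, both repositories can be cloned and moved to the commit hashes listed above. The following is a minimal sketch, not part of the benchmark itself: the clone URLs are assumptions derived from the repo names, and `pinned_checkout_commands` and `fetch_pinned` are hypothetical helpers.

```python
import subprocess

# Commit hashes are the ones listed above. The clone URLs are assumptions
# derived from the repo names on github.com and huggingface.co.
PINNED_SOURCES = {
    "github": (
        "https://github.com/supertone-inc/supertonic",
        "6fc89ea89eb29defb0ff2230b77c5a519acfe2a9",
    ),
    "huggingface": (
        "https://huggingface.co/Supertone/supertonic-2",
        "75e6727618a02f323c720cba9478152d4bc16ca4",
    ),
}


def pinned_checkout_commands(url, commit, dest):
    """Return the git commands that clone `url` into `dest` and check out `commit`."""
    return [
        ["git", "clone", url, dest],
        ["git", "-C", dest, "checkout", commit],
    ]


def fetch_pinned(name, dest):
    """Clone one of the pinned sources and move it to its pinned commit."""
    url, commit = PINNED_SOURCES[name]
    for command in pinned_checkout_commands(url, commit, dest):
        subprocess.run(command, check=True)
```

For example, `fetch_pinned("github", "supertonic")` would clone the source repo into `./supertonic`. Hugging Face model repos typically store weights with Git LFS, so `git-lfs` may need to be installed before cloning the model.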
For core hour ratio & latency metrics:
For peak memory metric:
Replace ${SUPERTONIC_REPO_DIR} with the path to Supertonic-TTS-2's repo. E.g. --supertonictts-repo-dir ~/supertonic/.
Replace ${SUPERTONIC_ONNX_DIR} with the path to the directory containing Supertonic-TTS-2's ONNX models. E.g. --supertonictts-onnx-dir ~/supertonic/py/assets/onnx/.
Replace ${SUPERTONIC_VOICE_STYLE_PATH} with the path to a Supertonic-TTS-2 voice style file. E.g. --voice-style-path ~/supertonic/py/assets/voice_styles/M1.json.
Replace ${PICOLLM_MODEL_PATH} with the path to the model you downloaded.
Replace ${PV_ACCESS_KEY} with your AccessKey obtained from Picovoice Console.
Everyone who signs up for Picovoice Console receives a unique AccessKey.
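Because the three Supertonic placeholders in the examples above all point inside the same checkout, they can be derived from the repo directory and checked before a run. This is a minimal sketch under that assumption: the `py/assets` layout is taken from the example paths only, and `resolve_supertonic_paths` and `missing_paths` are hypothetical helpers, not part of the benchmark.

```python
from pathlib import Path


def resolve_supertonic_paths(repo_dir, voice="M1"):
    """Derive the ONNX directory and voice style file from the repo directory.

    The layout (py/assets/onnx and py/assets/voice_styles/<voice>.json) is
    assumed from the example paths above and may differ in other checkouts.
    """
    repo = Path(repo_dir).expanduser()
    assets = repo / "py" / "assets"
    return {
        "SUPERTONIC_REPO_DIR": repo,
        "SUPERTONIC_ONNX_DIR": assets / "onnx",
        "SUPERTONIC_VOICE_STYLE_PATH": assets / "voice_styles" / f"{voice}.json",
    }


def missing_paths(paths):
    """Return the names of placeholders whose path does not exist on disk."""
    return [name for name, path in paths.items() if not path.exists()]
```

Calling `missing_paths(resolve_supertonic_paths("~/supertonic"))` returns an empty list when all three locations exist, which is a quick way to catch a missing model download before starting the benchmark.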