Azure - Speaker Diarization Benchmark
Prerequisites
- Ubuntu 20.04 (x86_64)
- Git
- Python 3.7+
- pip
- Azure Account
Usage
- Clone the repository:
- Install the dependencies:
Set up the dataset as described in the main readme of the repository.
Generate a client library for the Speech to Text REST API, as outlined in the documentation.
Create an Azure Storage account in your Azure account.
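Once the storage account exists, its name and key are the credentials the benchmark asks for. As a rough illustration of how those values are typically combined, the sketch below builds a standard Azure Storage connection string and the public URL of a blob. The helper names `storage_connection_string` and `blob_url`, and the example account, container, and file names, are hypothetical and not part of this repository; the sketch also assumes the default public-cloud endpoint suffix `core.windows.net`.

```python
from urllib.parse import quote


def storage_connection_string(account_name: str, account_key: str) -> str:
    """Build a connection string in the standard Azure Storage format.

    Assumes the public Azure cloud (endpoint suffix core.windows.net).
    """
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"
    )


def blob_url(account_name: str, container_name: str, blob_name: str) -> str:
    """Return the HTTPS URL of a blob; the blob name is percent-encoded."""
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{container_name}/{quote(blob_name)}"
    )


# Hypothetical placeholder values, for illustration only.
conn = storage_connection_string("mystorageaccount", "<storage-key>")
url = blob_url("mystorageaccount", "audio", "sample 1.wav")
print(url)  # https://mystorageaccount.blob.core.windows.net/audio/sample%201.wav
```

Keep the account key out of version control, for example by reading it from an environment variable before passing it to the benchmark.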
Run the benchmark:
Where:
- type is the type of benchmark to run. It can be ACCURACY, CPU, or MEMORY.
- dataset is the name of the dataset to use.
- data-folder is the path to the folder containing the audio files.
- label-folder is the path to the folder containing the ground truth labels.
- engine is the name of the engine to benchmark. It must be AZURE_SPEECH_TO_TEXT.
- azure-storage-account-name is the name of the Azure Storage account to use.
- azure-storage-account-key is the key of the Azure Storage account to use.
- azure-storage-container-name is the name of the Azure Storage container to use.
- azure-subscription-key is the subscription key to use.
- azure-region is the region to use.
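To make the constraints on these parameters concrete, here is a minimal sketch of how they could be declared and validated with Python's `argparse`. It assumes each parameter is passed as a `--name value` flag and that all of them are required; the function name `build_parser`, the dataset name `MY_DATASET`, and the example paths and credentials are hypothetical. The benchmark's actual argument handling may differ.

```python
import argparse

# Allowed values taken from the parameter descriptions above.
BENCHMARK_TYPES = ("ACCURACY", "CPU", "MEMORY")
ENGINES = ("AZURE_SPEECH_TO_TEXT",)


def build_parser() -> argparse.ArgumentParser:
    """Declare the benchmark parameters (flag spelling is an assumption)."""
    parser = argparse.ArgumentParser(description="Speaker diarization benchmark")
    parser.add_argument("--type", choices=BENCHMARK_TYPES, required=True)
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--data-folder", required=True)
    parser.add_argument("--label-folder", required=True)
    parser.add_argument("--engine", choices=ENGINES, required=True)
    # Azure-specific parameters are all plain required strings.
    for name in (
        "azure-storage-account-name",
        "azure-storage-account-key",
        "azure-storage-container-name",
        "azure-subscription-key",
        "azure-region",
    ):
        parser.add_argument(f"--{name}", required=True)
    return parser


# Hypothetical placeholder values, for illustration only.
example_argv = [
    "--type", "ACCURACY",
    "--dataset", "MY_DATASET",
    "--data-folder", "/path/to/audio",
    "--label-folder", "/path/to/labels",
    "--engine", "AZURE_SPEECH_TO_TEXT",
    "--azure-storage-account-name", "mystorageaccount",
    "--azure-storage-account-key", "<storage-key>",
    "--azure-storage-container-name", "audio",
    "--azure-subscription-key", "<subscription-key>",
    "--azure-region", "westus",
]
args = build_parser().parse_args(example_argv)
print(args.type, args.engine, args.azure_region)
```

With `choices`, an unsupported benchmark type or engine is rejected at parse time with a usage message rather than failing later in the run.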