Azure - Speaker Diarization Benchmark

Prerequisites

Ubuntu 20.04 (x86_64)
Git
Python 3.7+
pip
Azure Account

Usage

Clone the repository:

git clone https://github.com/Picovoice/speaker-diarization-benchmark.git

Install the dependencies:

pip3 install -r requirements.txt

Set up the dataset as described in the main readme of the repository.
Generate a client library for the Speech to Text REST API, as outlined in the documentation.
Create an Azure Storage account in your Azure account.
Run the benchmark:

python3 benchmark.py \
    --type {ACCURACY|CPU|MEMORY} \
    --dataset ${DATASET} \
    --data-folder ${DATA_FOLDER} \
    --label-folder ${LABEL_FOLDER} \
    --engine AZURE_SPEECH_TO_TEXT \
    --azure-storage-account-name ${AZURE_STORAGE_ACCOUNT_NAME} \
    --azure-storage-account-key ${AZURE_STORAGE_ACCOUNT_KEY} \
    --azure-storage-container-name ${AZURE_STORAGE_CONTAINER_NAME} \
    --azure-subscription-key ${AZURE_SUBSCRIPTION_KEY} \
    --azure-region ${AZURE_REGION}

Where:

type is the type of benchmark to run. It can be ACCURACY, CPU, or MEMORY.
dataset is the name of the dataset to use.
data-folder is the path to the folder containing the audio files.
label-folder is the path to the folder containing the ground truth labels.
engine is the name of the engine to benchmark. It must be AZURE_SPEECH_TO_TEXT.
azure-storage-account-name is the name of the Azure Storage account to use.
azure-storage-account-key is the key of the Azure Storage account to use.
azure-storage-container-name is the name of the Azure Storage container to use.
azure-subscription-key is the subscription key of the Azure Speech resource to use.
azure-region is the region of that Azure Speech resource.
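
Because the command takes many values, it can help to check that each one is set before launching a run. The following is a minimal sketch, not part of the repository: it assumes the values live in environment variables named after the placeholders in the command above, and the helper name build_command is hypothetical. It only assembles and prints the command; it does not call any Azure service.

```python
import os
import shlex

# Flags copied from the command above. The environment variable names are an
# assumption of this sketch (they mirror the ${...} placeholders).
FLAG_TO_ENV = {
    "--dataset": "DATASET",
    "--data-folder": "DATA_FOLDER",
    "--label-folder": "LABEL_FOLDER",
    "--azure-storage-account-name": "AZURE_STORAGE_ACCOUNT_NAME",
    "--azure-storage-account-key": "AZURE_STORAGE_ACCOUNT_KEY",
    "--azure-storage-container-name": "AZURE_STORAGE_CONTAINER_NAME",
    "--azure-subscription-key": "AZURE_SUBSCRIPTION_KEY",
    "--azure-region": "AZURE_REGION",
}
BENCHMARK_TYPES = ("ACCURACY", "CPU", "MEMORY")


def build_command(benchmark_type, env):
    """Return the argument list for benchmark.py (hypothetical helper).

    Raises ValueError if the type is unknown or a required value is unset.
    """
    if benchmark_type not in BENCHMARK_TYPES:
        raise ValueError(
            f"type must be one of {BENCHMARK_TYPES}, got {benchmark_type!r}"
        )
    missing = [name for name in FLAG_TO_ENV.values() if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    command = [
        "python3", "benchmark.py",
        "--type", benchmark_type,
        "--engine", "AZURE_SPEECH_TO_TEXT",
    ]
    for flag, name in FLAG_TO_ENV.items():
        command += [flag, env[name]]
    return command


if __name__ == "__main__":
    try:
        # Quote each argument so the printed line can be pasted into a shell.
        args = build_command("ACCURACY", os.environ)
        print(" ".join(shlex.quote(arg) for arg in args))
    except ValueError as error:
        print(f"cannot build command: {error}")
```

Returning a list rather than a single string keeps each value as one argument, so paths containing spaces survive if the list is later passed to subprocess.run.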
