Amazon Transcribe - Speaker Diarization Benchmark
- Ubuntu 20.04 (x86_64)
- Python 3.7+
- AWS Account
- Clone the repository:
- Install the dependencies:
Set up the dataset as described in the main readme of the repository.
Create an S3 bucket in your AWS account.
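If you prefer to create the bucket from Python rather than the AWS console, the step above can be sketched with boto3. This is a minimal sketch, not part of the repository: it assumes boto3 is installed and that a named profile is already configured, and the helper names `create_bucket_kwargs` and `create_bucket` are hypothetical.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build the arguments for S3 create_bucket.

    S3 rejects an explicit LocationConstraint for us-east-1, so it is
    only added for other regions.
    """
    kwargs = {"Bucket": bucket_name}
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket(bucket_name, region, profile_name):
    """Create the bucket using the given AWS profile (hypothetical helper)."""
    # Imported here so the pure helper above works without boto3 installed.
    import boto3

    session = boto3.Session(profile_name=profile_name, region_name=region)
    s3 = session.client("s3")
    s3.create_bucket(**create_bucket_kwargs(bucket_name, region))
```

The profile and bucket name passed here are the same values later supplied as aws-profile and aws-s3-bucket-name.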
Run the benchmark:
type is the type of benchmark to run. It can be
dataset is the name of the dataset to use.
data-folder is the path to the folder containing the audio files.
label-folder is the path to the folder containing the ground truth labels.
engine is the name of the engine to benchmark. It must be
aws-profile is the name of the AWS profile to use.
aws-s3-bucket-name is the name of the S3 bucket to use.
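To illustrate how the parameters above fit together, here is a minimal sketch of an argument parser that collects them. It assumes each parameter is passed as a long-form flag of the same name, which this section does not confirm. `build_parser` is a hypothetical helper, not the repository's own code, and since the accepted values for type and engine are not listed here, no choices are enforced.

```python
import argparse

# Parameter names and descriptions as given in this README.
PARAMETERS = {
    "type": "type of benchmark to run",
    "dataset": "name of the dataset to use",
    "data-folder": "path to the folder containing the audio files",
    "label-folder": "path to the folder containing the ground truth labels",
    "engine": "name of the engine to benchmark",
    "aws-profile": "name of the AWS profile to use",
    "aws-s3-bucket-name": "name of the S3 bucket to use",
}


def build_parser():
    """Return a parser requiring every benchmark parameter (assumed flag form)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical parser for the benchmark parameters"
    )
    for name, description in PARAMETERS.items():
        # argparse maps "--data-folder" to the attribute "data_folder".
        parser.add_argument("--" + name, required=True, help=description)
    return parser
```

Calling `build_parser().parse_args()` then yields attributes such as `data_folder` and `aws_s3_bucket_name`, and exits with a usage message if any parameter is missing.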