Speech-to-text, also known as automatic speech recognition (ASR), converts spoken language into written text. It automates transcription, which saves time and improves accessibility across a wide range of applications. Adding speech-to-text to your own application can make it more engaging and easier to use.

In this article, we will walk you through the process of integrating speech-to-text into a Django application using Picovoice's Leopard Speech-to-Text engine.

Note that it is also possible to perform speech-to-text directly in the front-end with Leopard. Check out the Leopard Web Quick Start guide to get started.

1. Prerequisites

Sign up for a free Picovoice Console account. Once you've created an account, copy your AccessKey from the main dashboard.

Also make sure that Python and Django are installed on your machine, and that your version of Django supports your version of Python. You can check both from a terminal by running python --version and python -m django --version.
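As a minimal sketch, this check can also be done from Python itself using only the standard library. The helper name installed_versions is hypothetical and not part of Django or Picovoice:

```python
import sys
from importlib import metadata


def installed_versions():
    """Return the running Python version and the installed Django version, if any."""
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    try:
        versions["django"] = metadata.version("django")
    except metadata.PackageNotFoundError:
        versions["django"] = None  # Django is not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If Django shows up as not installed, install it before continuing with the next step.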

2. Create a Django Project

If you don't already have a Django project, create one by running django-admin startproject myproject in your terminal.

3. Create a Django App

Within your project directory, create a new Django app by running python manage.py startapp myapp.

In your /myproject/settings.py file, add 'myapp' to the INSTALLED_APPS list.
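For illustration, the list might look like this afterwards. This sketch assumes the default entries that django-admin startproject generates; your project may list other apps:

```python
# myproject/settings.py (excerpt)
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "myapp",  # the app created in this step
]
```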

4. Install the Leopard Python SDK

Install pvleopard with pip by running pip install pvleopard.

5. Create a View

Replace the contents of /myapp/views.py with the following code. Make sure to replace ${ACCESS_KEY} with your actual AccessKey.
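What follows is a minimal sketch of such a view, not necessarily the original listing. It assumes the upload arrives in a form field named audio_file and that the template reads a context variable named transcript; both names are assumptions and must match your template. The calls pvleopard.create and process_file come from the Leopard Python SDK.

```python
# myapp/views.py
import os
import tempfile

import pvleopard
from django.shortcuts import render

# Replace ${ACCESS_KEY} with the AccessKey copied from Picovoice Console.
leopard = pvleopard.create(access_key="${ACCESS_KEY}")


def transcribe_audio(request):
    transcript = None
    # "audio_file" is an assumed form field name.
    uploaded = request.FILES.get("audio_file") if request.method == "POST" else None
    if uploaded is not None:
        # process_file expects a path, so write the upload to a temporary file first.
        suffix = os.path.splitext(uploaded.name)[1]
        with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
            for chunk in uploaded.chunks():
                tmp.write(chunk)
        try:
            transcript, words = leopard.process_file(tmp.name)
            # Per-word metadata goes to the terminal running the server.
            for word in words:
                print(word.word, word.start_sec, word.end_sec, word.confidence)
        finally:
            os.remove(tmp.name)
    # "transcript" is an assumed context variable name.
    return render(request, "transcribe_audio.html", {"transcript": transcript})
```

Creating the Leopard instance once at module level avoids reloading the model on every request. In a real project, you would likely read the AccessKey from an environment variable or from settings rather than hard-coding it.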

This view function (transcribe_audio) will receive an audio file sent by a template (we will set this up in the next step), transcribe it using pvleopard, and send the transcript back to the template to be displayed.

Note that Leopard also returns start and end timestamps and a confidence score for every word in the transcript. These are printed in your terminal.
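To make the shape of that per-word metadata concrete, here is a small sketch that formats it for the terminal. The Word namedtuple is a stand-in for the objects Leopard returns (assumed fields: word, start_sec, end_sec, confidence), and the sample values are made up for illustration:

```python
from collections import namedtuple

# Stand-in for Leopard's per-word metadata; the sample values are invented.
Word = namedtuple("Word", ["word", "start_sec", "end_sec", "confidence"])


def format_word(word):
    """Render one word as: text, start time, end time, confidence."""
    return (
        f"{word.word:<12} {word.start_sec:6.2f}s {word.end_sec:6.2f}s  "
        f"{word.confidence:.2f}"
    )


if __name__ == "__main__":
    sample = [Word("hello", 0.10, 0.42, 0.97), Word("world", 0.48, 0.91, 0.88)]
    for w in sample:
        print(format_word(w))
```

A helper like format_word (a hypothetical name) could replace a bare print call in the view if you want aligned columns in the server log.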

Inside the /myapp directory, create a urls.py file and add the following code:
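A minimal sketch of that file, assuming the view should be served at the app's root path. The empty route and the name argument are assumptions:

```python
# myapp/urls.py
from django.urls import path

from . import views

urlpatterns = [
    # Route the app's root URL to the transcribe_audio view.
    path("", views.transcribe_audio, name="transcribe_audio"),
]
```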

In /myproject/urls.py, add the following line to the urlpatterns list:

Make sure to also import include from django.urls.
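Put together, the project-level file might look like the sketch below. It assumes the default admin route generated by startproject and mounts the app at the site root; the mount point is an assumption:

```python
# myproject/urls.py
from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path("admin/", admin.site.urls),
    # Delegate everything else to the app's own URL configuration.
    path("", include("myapp.urls")),
]
```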

6. Create a Template

Inside the /myapp directory, create a /templates directory. Inside it, create a transcribe_audio.html file with the following content in the <body> tag:

This template simply allows you to send an audio file to your view function (transcribe_audio), which will transcribe the audio and return the transcript to be displayed.
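As a hedged sketch of what such a template needs, the snippet below keeps the markup in a Python string and writes it to the templates directory. The field name audio_file and the context variable transcript are assumptions and must match whatever your view reads and passes; write_template is a hypothetical helper. The essentials are a POST form with enctype multipart/form-data, Django's csrf_token tag, and a file input.

```python
from pathlib import Path

# Markup for the <body> of transcribe_audio.html (field names are assumptions).
TEMPLATE_BODY = """\
<form method="post" enctype="multipart/form-data">
  {% csrf_token %}
  <input type="file" name="audio_file" accept="audio/*" required>
  <button type="submit">Transcribe</button>
</form>
{% if transcript %}
  <p>{{ transcript }}</p>
{% endif %}
"""


def write_template(app_dir):
    """Create <app_dir>/templates/transcribe_audio.html and return its path."""
    template_dir = Path(app_dir) / "templates"
    template_dir.mkdir(parents=True, exist_ok=True)
    template_path = template_dir / "transcribe_audio.html"
    template_path.write_text(
        "<!DOCTYPE html>\n<html>\n<body>\n" + TEMPLATE_BODY + "</body>\n</html>\n",
        encoding="utf-8",
    )
    return template_path
```

Calling write_template("myapp") from the project directory creates the file; you can equally paste the markup into the file by hand.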

7. Run Project

Start the development server by running python manage.py runserver from the project directory.

Click the link in your terminal to access the development server in your browser. Finally, upload an audio file to view the transcript!

[Image: Leopard Speech-to-Text with Django]

Further Reading

To learn more about Leopard Speech-to-Text, check out the Leopard Speech-to-Text product page or refer to the Leopard Speech-to-Text Python SDK quick start guide.