The International Phonetic Alphabet (IPA) is an alphabetic system of phonetic notation. Linguists, actors, constructed language creators, foreign language students, teachers and voice AI researchers use the IPA.

The IPA represents the sounds of spoken language, such as phones, phonemes, intonation, and the separation between words and syllables, rather than the letters of an alphabet. A single letter can correspond to several IPA symbols, and different letters can represent the same sound. For example, American English is written with 26 letters but has roughly 44 phonemes, and the vowel “ʊ” occurs in both “put” and “look.”
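The gap between spelling and sound can be illustrated with a short Python sketch. The `WORD_TO_IPA` table and the `words_containing` helper are hypothetical names introduced for this example, and the transcriptions are broad American English transcriptions chosen for illustration, not taken from a specific dictionary.

```python
# Illustrative word-to-IPA table (broad American English transcriptions).
WORD_TO_IPA = {
    "put": "pʊt",
    "look": "lʊk",
    "cat": "kæt",
    "kite": "kaɪt",
    "queen": "kwin",
}


def words_containing(symbol):
    """Return the words whose transcription contains the given IPA symbol."""
    return sorted(word for word, ipa in WORD_TO_IPA.items() if symbol in ipa)


# "u" in "put" and "oo" in "look" are spelled differently but share one vowel.
print(words_containing("ʊ"))

# The sound "k" is spelled with "c", "k", and "q" across these words.
print(words_containing("k"))
```

Searching by symbol instead of by letter groups words by how they sound, which is the view of language that the IPA provides.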

The International Phonetic Alphabet is different from the NATO Phonetic Alphabet, which represents the letters of the alphabet with code words to simplify communication, especially in military, aviation, and maritime settings.

IPA in Automatic Speech Recognition

Using the IPA in speech recognition is common. Most vendors offer self-service interfaces for fine-tuning automatic speech recognition (ASR) models, so that users can adjust models with IPA transcriptions to improve accuracy.

Speech recognition technology matches the pronunciation of spoken words against the words in its model. Models, by default, have thousands of words in their lexicons. Leopard Speech-to-Text, for example, has over 300,000 words. However, generic models do not contain every word, especially industry-specific jargon, proper nouns, or made-up words. If the speech-to-text (STT) engine doesn’t have a word, it tries to generate it automatically, which can cause accuracy problems. Users fine-tune generic models for their use cases with self-service tools by “teaching” the written (letters) and spoken (IPA symbols) versions of words.
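The idea of finding words a generic model lacks and then “teaching” them can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular engine works: `base_lexicon`, `custom_words`, and `find_missing_words` are hypothetical names, and the lexicon is a toy word-to-IPA dictionary with broad American English transcriptions.

```python
import re


def find_missing_words(transcript, lexicon):
    """Return the words in the transcript that the lexicon lacks, in order."""
    words = re.findall(r"[a-z']+", transcript.lower())
    missing = []
    for word in words:
        if word not in lexicon and word not in missing:
            missing.append(word)
    return missing


# Hypothetical generic lexicon: written form -> IPA transcription.
base_lexicon = {"please": "pliz", "call": "kɔl", "the": "ðə", "team": "tim"}

sentence = "Please call the stent team"
print(find_missing_words(sentence, base_lexicon))

# "Teach" the missing jargon by supplying its written and spoken (IPA) forms.
custom_words = {"stent": "stɛnt"}
tuned_lexicon = {**base_lexicon, **custom_words}
print(find_missing_words(sentence, tuned_lexicon))
```

Merging the custom entries into a copy of the lexicon keeps the generic word list untouched, so the same base can be adapted separately for different domains.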

The image below shows the Picovoice Console interface for adding new words. The Console suggests IPA transcriptions, and users can select one of the suggestions or enter their own.

[Image: Picovoice Console]

How to Transcribe Words into IPA

Dictionaries: Most dictionaries show the IPA transcription of a word next to its written form. Google Translate and Google Search also provide IPA transcriptions based on the Oxford Dictionary. Below are some reputable dictionaries for IPA transcriptions:

Apps or Websites: If one needs the IPA transcription of whole sentences or paragraphs, looking up every word in a dictionary is not efficient. In those cases, websites such as tophonetics and easypronunciation can be helpful.

Repositories: To integrate IPA transcriptions into applications, preparing or using a pronunciation database is more efficient. For example, the ARPABET-based CMU Pronouncing Dictionary (CMUdict) is the go-to resource for many open-source projects. ARPABET uses ASCII characters rather than IPA symbols, but a small script can map it to IPA. Moby is another popular project, although it is no longer maintained.
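A small ARPABET-to-IPA script of the kind mentioned above might look like the following Python sketch. It assumes CMUdict-style input (an uppercase word followed by ARPABET phones, with stress digits 0, 1, or 2 on vowels) and uses one common broad mapping. Other conventions exist, for example writing “R” as “r” instead of “ɹ”. The function names are hypothetical, and stress marks are dropped because IPA places them before a syllable, which would require syllabification.

```python
# ARPABET phones as used in CMUdict, mapped to broad IPA equivalents.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Vowels that are usually written differently when unstressed (digit 0).
UNSTRESSED = {"AH": "ə", "ER": "ɚ"}


def arpabet_to_ipa(phones):
    """Convert a space-separated ARPABET string to IPA, ignoring stress."""
    ipa = []
    for token in phones.split():
        base = token.rstrip("012")  # strip the stress digit, if any
        if token.endswith("0") and base in UNSTRESSED:
            ipa.append(UNSTRESSED[base])
        else:
            ipa.append(ARPABET_TO_IPA[base])
    return "".join(ipa)


def parse_cmudict_line(line):
    """Split a CMUdict-style line into (lowercase word, IPA transcription)."""
    word, phones = line.split(maxsplit=1)
    return word.lower(), arpabet_to_ipa(phones)


# Sample lines written in CMUdict style for illustration.
for line in ["PUT  P UH1 T", "LOOK  L UH1 K", "HELLO  HH AH0 L OW1"]:
    print(parse_cmudict_line(line))
```

An unknown phone raises a `KeyError` instead of being skipped silently, which makes malformed dictionary lines easy to spot during import.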

Understanding how English IPA transcription differs from that of other languages, such as French (transcription de l'alphabet phonétique international (API) français), German (Deutschen Aussprache-wörterbuch), or Italian (alfabeto fonetico internazionale per l'italiano), can help developers build multilingual applications.