Choosing the best Natural Language Understanding (NLU) software is difficult. In a nutshell, NLU detects intents and intent details. What counts as “best,” however, depends on each enterprise’s requirements and priorities. After the positive reaction to Picovoice’s open-source NLU benchmark, we decided to publish a buyer’s guide that evaluates the top FOSS (free and open-source software) and paid NLU platforms frequently used for building voice products. The article covers Rasa, Snips, Spokestack, Wit.ai by Facebook, Dialogflow by Google, Lex by Amazon, LUIS by Microsoft, Rhino by Picovoice and Watson by IBM.
To learn more about NLU terminology, please see the articles discussing NLU 101; the differences among NLP, NLU and NLG; NLP and Voice Recognition; and SLU and NLU.
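To make “intents and intent details” concrete, here is a minimal Python sketch of what an NLU engine returns for an utterance. It is a toy illustration only: the `Inference` class, the `infer` function, the `orderDrink` and `turnOff` intents and the regular expressions are all hypothetical and do not reflect the API of any platform covered in this article. Production engines use trained models rather than hand-written patterns.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Inference:
    """What NLU produces: an intent plus its details (often called slots)."""
    intent: Optional[str]
    slots: dict = field(default_factory=dict)


# Hypothetical patterns for a toy coffee-maker domain (assumption for
# illustration; real engines learn this mapping from training data).
PATTERNS = {
    "orderDrink": re.compile(
        r"(?:make|get) me an? (?P<size>small|medium|large) "
        r"(?P<drink>coffee|latte|espresso)"
    ),
    "turnOff": re.compile(r"turn (?:the machine )?off"),
}


def infer(utterance: str) -> Inference:
    """Return the first matching intent and its named groups as slots."""
    text = utterance.lower().strip()
    for intent, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            return Inference(intent, match.groupdict())
    # Nothing matched: the utterance is outside the domain.
    return Inference(None)


if __name__ == "__main__":
    print(infer("Make me a large latte"))
```

Here, `infer("Make me a large latte")` yields the intent `orderDrink` with the details `size=large` and `drink=latte`, while an out-of-domain utterance yields no intent. The platforms below differ mainly in where this step runs (on-device or in the cloud) and whether its input is text or speech.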
Top Free Natural Language Understanding Software
Rasa : Rasa is a free and open-source NLU that allows developers to train, deploy and run products on their own servers. This is an advantage over cloud-based platforms because it saves network time and offers faster experiences. Rasa NLU processes text inputs. Although the core software is free, Rasa offers paid support and consulting services. Thus, enterprises can start building their prototypes for free and purchase services later, as the need arises.
Snips : Since being acquired by Sonos, Snips no longer maintains its software or offers its platform to the public. However, its repo is still available on GitHub and is still used by developers. Its lightweight and efficient software can run even on IoT devices. Since the data doesn’t need to go to the cloud, it’s much faster and completely private. Nevertheless, the repository has been stale for years, so one shouldn’t expect anything from the maintainers, including improvements and technical support.
Wit.ai : Wit.ai is a free platform that, since its acquisition by Facebook, requires a Facebook account. Anyone who doesn’t (want to) have a Facebook account, or who deletes it, cannot use Wit. If an app is open rather than private, its intents, entities and expressions become accessible to the community, but its interactions do not. Wit doesn’t put a limit on the number of requests, but there are some implicit usage limitations. Wit is known mainly for chatbots but also supports voice-controlled applications for IoT. Wit provides no paid support or consulting services.
Spokestack : The Spokestack platform offers wake word, ASR, NLU, and TTS. Similar to Snips, its NLU directly infers intents from speech and does not require cloud connectivity for processing. This makes Spokestack fast and private. Spokestack used to offer additional services and support under its paid tier. However, Spokestack recently archived its repositories. It’s worth checking with the maintainers to understand the SLAs before starting.
Top Paid Natural Language Understanding Software
Dialogflow : After acquiring API.ai, Google started offering text-based chatbot capabilities and speech recognition under the name Dialogflow. Dialogflow records voice data and sends it to Google’s servers for processing. Dialogflow is a convenient choice for current GCP customers, as they already know the infrastructure and have contacts to support them. For new customers, GCP offers credit that can be used anytime within the first year.
Lex : Amazon’s Lex is one of the AWS offerings. Like Dialogflow, Lex offers text and voice capabilities and processes speech recognition and language understanding separately in its cloud. Sending voice data to Amazon’s cloud for processing slows the process and impacts user privacy. During the first year, AWS covers the first couple of thousand requests every month for free.
LUIS : Microsoft’s LUIS (Language Understanding Intelligent Service) is similar to Amazon Lex and Google Dialogflow. It also started as a text-based (chatbot) service and later added voice features. Hence, it requires voice data to be recorded and sent to Microsoft’s servers. It can be easily integrated with other Azure services, such as Azure Bot Services, and purchased together with other Azure subscriptions.
Rhino: Picovoice’s Rhino, similar to Snips and Spokestack, infers intents and intent details directly from speech and runs anywhere, including IoT devices and servers. Thus, it enables faster, more reliable and private experiences. Rhino is voice-based and does not support text-based services (e.g., chatbots). Picovoice offers unlimited voice interactions and charges based on the number of users. Its Free Tier covers up to 3 devices or users. Check out a detailed cost and accuracy comparison for Rhino.
Watson : IBM’s Watson Natural Language Understanding offers features and capabilities similar to those of Dialogflow, Lex and LUIS. It can be used for text and voice interactions and integrated into various platforms. Getting started is easier for existing IBM Cloud customers. Watson also processes voice data on IBM’s servers, and IBM retains it.