Voice picking, also known as picking by voice, voice-directed picking, speech-based picking, pick by voice, or pick to voice, is a paperless, hands-free, and eyes-free order picking system for warehouse processes. Pickers are equipped with a device, usually a headset or a dedicated terminal. These devices typically include audio front-end processing, because older voice technology performs poorly in loud environments. In addition to voice, the devices can support touch input, RFID, and barcode scanners. Most systems use automatic speech recognition (ASR) and natural language understanding (NLU) to guide workers as they locate, pick, and place items. Picking by voice increases productivity and reduces errors by freeing the users’ hands and eyes.
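
The guided locate, pick, and place loop described above can be sketched in a few lines. This is only an illustration, not any vendor's implementation: the `PickTask` fields, the check-digit confirmation step, and the spoken phrases are all assumptions made for this example, and the recognized speech is passed in as plain values rather than coming from a real ASR or NLU engine.

```python
from dataclasses import dataclass


@dataclass
class PickTask:
    """One pick instruction (hypothetical structure for illustration)."""
    location: str      # bin the picker must walk to
    check_digits: str  # digits posted on the bin, read back to confirm the location
    item: str
    quantity: int


def prompt_for(task: PickTask) -> str:
    """Text the headset would speak to direct the picker to the next bin."""
    return f"Go to {task.location}. Pick {task.quantity} {task.item}."


def confirm(task: PickTask, spoken_digits: str, spoken_quantity: int) -> str:
    """Check the picker's recognized reply and return the next spoken response."""
    if spoken_digits != task.check_digits:
        return f"Wrong location. Go to {task.location}."
    if spoken_quantity != task.quantity:
        return f"Quantity mismatch. Pick {task.quantity}."
    return "Confirmed. Next task."


if __name__ == "__main__":
    task = PickTask(location="aisle 12, bin 4", check_digits="37",
                    item="cartons", quantity=3)
    print(prompt_for(task))
    print(confirm(task, spoken_digits="37", spoken_quantity=3))
```

In a real deployment the two function inputs would come from the speech engine, and the task list from the warehouse system; both are left out here to keep the sketch self-contained.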

Large distribution centers and warehouses have been using voice picking technology since the 1990s. Voice-directed picking increases productivity by up to 35%, decreases errors by up to 90%, and shortens training time from days to minutes. [1] [2] Voice picking technology contributes to the bottom line of enterprises by increasing revenue and decreasing costs.

[Infographic: voice-directed warehousing solution market in 2031; decrease in picking errors with voice picking; increase in productivity with voice picking]

What are the benefits of voice picking?

  • Reduced training time: Workers can start after little training, in most cases less than a day, as simple verbal instructions tell them what to do on the job. Therefore, warehouses with seasonal workers and high employee turnover save significant time.

  • Minimized use of tools and distractions: Pickers no longer need to pick up, read, and put down written instructions to perform tasks. They stay focused on visually locating the correct items, picking the right quantity, and placing them. Tasks such as data entry and following written instructions can lead to picking the wrong item or quantity and to wasted time between bins. Simplifying the process and limiting distractions increase speed and accuracy.

  • Improved workplace safety: Leaving pickers’ hands free, especially when dealing with heavy or sharp objects or wearing protective gloves, improves workplace safety significantly.

The voice-directed warehousing solution market is expected to grow from $1.4 billion in 2020 to $4.8 billion in 2031. In this decade, many enterprises will adopt voice-directed systems. [3] However, choosing the right vendor can be an initial barrier.

Three things to know before selecting a vendor for voice picking:

  • Connectivity: Cloud-based voice recognition solutions require strong Wi-Fi coverage across the warehouse to avoid disruptions. Think of how long Siri or Alexa can take to respond, even when you only ask for the time of day; you do not want high latency or poor responsiveness to cause delays in your operations. Any connectivity issue in the warehouse, or with the internet or cloud service provider, will hinder pickers’ productivity. Picovoice offers a consistent and guaranteed real-time experience by running fully on-device without any network dependency. Processing voice commands offline enables pickers to work without disruption.

  • Ease of Use: Warehouse workers vary in age, educational background, dialect, and accent. It is essential to find a solution that works with high accuracy across different accents and in noisy warehouse environments, and that requires little to no technology literacy. The interface should be not only simple but also flexible in case changes are needed. Vendors that require data collection to train AI models cannot offer such flexibility. Picovoice’s Porcupine Wake Word Engine eliminates the need for a push-to-talk button, and Rhino Speech-to-Intent captures intents directly from spoken commands. Picovoice technology achieves state-of-the-art accuracy, as proven by open-source benchmarks, despite noise and reverberation. Enterprises can train highly accurate voice AI models on Picovoice Console instantly and start leading the voice revolution immediately.

  • Cost: Every IT decision-maker knows that the cost of a product is different from its total cost of ownership. Two things can drive up the total cost of ownership significantly:

    1. Development Cost: It may take up to 6 months to launch a prototype if a vendor needs to collect data to train models specific to a use case. The process starts again if changes are needed after prototyping. Some vendors also assign a project manager to customers and charge consulting fees instead of providing out-of-the-box solutions and the flexibility to customize. Paying upfront fees before a proof of concept (PoC) exists adds further risk.

    Picovoice reduces time-to-market from months to days: a picking-by-voice prototype can be built in days. Picovoice’s hardware-agnostic technology runs anywhere, including Android and Windows. Use your existing voice picking headset or try wearables such as watches and vests. Modern SDKs supported by Picovoice reduce development time and enable integration with existing systems such as WMS and ERP software.

    2. Cloud Bill: Cloud providers charge based on usage: the number of API calls and the minutes of voice data processed. This is not a problem for prototyping. However, as workers start picking millions of items, bills go up. For example, working with Google will cost an average warehouse $750,000 annually (assuming a warehouse with 1,000 employees, each working a 7-hour shift per day).
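
To make the usage-based billing arithmetic concrete, the sketch below works backwards from the figures quoted above (1,000 employees, 7-hour shifts, $750,000 per year) to the per-minute rate they imply. The article does not state the number of working days or how much of a shift is actually streamed to the cloud, so `days_per_year=365` and continuous streaming are assumptions made here, and the function names are hypothetical. The resulting rate is only what those assumptions imply, not a published price.

```python
def annual_audio_minutes(workers: int, shift_hours: float, days_per_year: int) -> float:
    """Minutes of audio processed per year if every worker streams for a full shift."""
    return workers * shift_hours * 60 * days_per_year


def implied_rate_per_minute(annual_bill: float, minutes: float) -> float:
    """Per-minute price that would produce the given annual bill."""
    return annual_bill / minutes


def annual_cloud_bill(minutes: float, rate_per_minute: float) -> float:
    """Usage-based bill: processed minutes times the per-minute rate."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Figures from the article: 1,000 workers, 7-hour shifts, $750,000 per year.
    # Assumption (not in the article): 365 working days, audio streamed all shift.
    minutes = annual_audio_minutes(workers=1000, shift_hours=7, days_per_year=365)
    rate = implied_rate_per_minute(annual_bill=750_000, minutes=minutes)
    print(f"Audio minutes per year: {minutes:,.0f}")
    print(f"Implied rate per minute: ${rate:.4f}")
    # Doubling the workforce doubles the bill at the same rate.
    print(f"Bill for 2,000 workers: ${annual_cloud_bill(2 * minutes, rate):,.0f}")
```

Because the bill scales linearly with processed minutes, any growth in headcount, shift length, or operating days raises it proportionally, which is the point the example in the text makes.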

What’s Next?

Check out the live demo below to see how Picovoice technology is used for voice picking applications. Picovoice enables not only voice-guided picking for logistics but also grocery store pick-ups and other voice-directed applications, such as industrial voice assistants and voice-guided inspection. When you’re ready, start building immediately!
