Regulations, such as GDPR, consider audio data of humans speaking and their text transcripts as personal and sensitive data. The human voice carries
Personally Identifiable Information (PII) - that’s how Speaker Recognition technology works! Moreover, the content of the audio may also contain
PII depending on the context. Therefore, speech-to-text providers typically have dedicated compliance sections that detail how their engines handle user-provided data. They may describe how responsible their models are or list certifications from various organizations.
Compliance is crucial for speech-to-text engines. Here are some key points to consider:
1. Data Security
Speech data should be protected from unauthorized access, breaches, or misuse. Speech-to-text vendors should ensure data security by implementing encryption during transmission and storage, robust access controls, regular security audits, and adherence to industry best practices.
2. User Consent
Speech-to-text engines should obtain users' consent before processing their data by giving them transparent information about how their data is collected, used, stored, and shared.
3. Legal and Regulatory Compliance
Speech-to-text engines should comply with relevant legal and regulatory frameworks, such as data protection laws, e.g., GDPR, and industry-specific regulations, e.g., HIPAA. Hence the list of certifications mentioned in the first paragraph.
4. Anonymization and Aggregation
Speech-to-text engines should employ techniques like anonymization or aggregation to dissociate individual user identities from speech data if they need user data for training.
5. User Control and Transparency
Speech-to-text engines should offer users control over their data, allowing them to opt in to or out of data collection, specify data retention periods, and exercise choices regarding data usage. Ideally, users should be opted out by default, and enterprises should provide clear information about the implications of opting in. For instance, it should be clear that human reviewers may listen to users' conversations to prepare the data for training.
6. Ethical Considerations
Speech-to-text engines should adhere to ethical guidelines, ensuring fairness, inclusivity, and non-discrimination in speech processing. In addition, models should have diverse training datasets to minimize biases and avoid perpetuating societal inequalities.
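Points 4 and 5 above can be made more concrete with a small sketch. The Python example below is illustrative only: `UserPrivacySettings`, `pseudonymize`, and `prepare_for_training` are hypothetical names, not part of any vendor's API. It assumes a service that keeps users opted out by default, honors a per-user retention period, and replaces user identifiers with a keyed hash before a transcript is kept for training. Note that a keyed hash is pseudonymization rather than full anonymization: the transcript content itself may still identify the speaker.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class UserPrivacySettings:
    """Hypothetical per-user settings. Users are opted out unless they opt in."""
    allow_training_use: bool = False
    retention_days: int = 30


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Replace a user ID with a keyed hash (HMAC-SHA256).

    Without the secret key, the hash cannot be recomputed from the user ID.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_training(
    user_id: str,
    transcript: str,
    recorded_at: datetime,  # must be timezone-aware
    settings: UserPrivacySettings,
    secret_key: bytes,
    now: Optional[datetime] = None,
) -> Optional[dict]:
    """Return a pseudonymized record, or None if the data must not be used."""
    now = now or datetime.now(timezone.utc)
    # Respect the user's choice: no opt-in, no training use.
    if not settings.allow_training_use:
        return None
    # Respect the retention period chosen by the user.
    if now - recorded_at > timedelta(days=settings.retention_days):
        return None
    return {"speaker": pseudonymize(user_id, secret_key), "transcript": transcript}
```

With the default settings, `prepare_for_training` returns `None` for every user, so data only reaches a training set after an explicit opt-in.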
Compliance is a challenging task. If you use cloud speech-to-text APIs, you need to ask your vendor these questions. However, vendors' responses may not reflect reality. For example, the FTC's complaint against Amazon alleged that the company engaged in deceptive practices by claiming that Alexa was privacy-conscious, while Alexa's data collection and use practices violated Section 5 of the FTC Act and the COPPA Rule. Amazon agreed to pay $25M to settle.
How to ensure privacy?
Use on-device speech-to-text engines like Leopard and Cheetah. The easiest way to protect your data is not to share it! Picovoice engines bring voice AI to the data rather than sending the data to third parties for processing.
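As a rough illustration, on-device transcription with Leopard's Python SDK might look like the sketch below. The helper name `transcribe_on_device` is hypothetical, and the sketch assumes the `pvleopard` package is installed, that you have a Picovoice AccessKey, and that `process_file` returns a transcript together with word-level metadata, as in recent SDK versions. Check the SDK documentation for the version you use.

```python
def transcribe_on_device(access_key: str, audio_path: str) -> str:
    """Transcribe a local audio file with Leopard, processing the audio locally.

    `transcribe_on_device` is a hypothetical helper for illustration.
    """
    # Imported here so this module loads even where the SDK is not installed.
    import pvleopard  # pip install pvleopard

    leopard = pvleopard.create(access_key=access_key)
    try:
        # Assumed return shape: (transcript, word-level metadata).
        transcript, _words = leopard.process_file(audio_path)
        return transcript
    finally:
        # Release the native resources held by the engine.
        leopard.delete()
```

The audio file is read and processed by the engine running on the local machine, so the transcript is produced without uploading the recording to a speech-to-text service.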
Picovoice does not have a list of certifications because Picovoice does not have access to the audio data processed by Picovoice engines or to their outputs, e.g., transcripts. Picovoice applies privacy-by-design principles. It takes proactive rather than reactive measures, ensuring the compliance of the audio data and transcripts by not creating compliance risks and threats in the first place.
Build with on-device speech-to-text models and ensure