Regulations, such as GDPR, consider audio data of human speech and the corresponding text transcripts personal and sensitive data. The human voice carries Personally Identifiable Information (PII) - that's how Speaker Recognition technology works! Depending on the context, the content of the audio can also contain PII. Therefore, Speech-to-Text providers typically have dedicated sections on Data Privacy, Security, and Compliance on their websites that detail how their engines handle user-provided data. They may describe how their models handle data responsibly or list certifications from regulatory organizations (such as SOC, FedRAMP, HIPAA, and HITECH). This is because Data Privacy, Security, and Compliance are crucial for Speech-to-Text engines. Below are some key points to consider:
1. Data Security
Speech data should be protected from unauthorized access, breaches, or misuse. Speech-to-text vendors should ensure data security by implementing encryption during transmission and storage, robust access controls, regular security audits, and adherence to industry best practices.
2. Privacy
Speech-to-text engines should obtain user consent before processing user data and provide transparent information about how the data is collected, used, stored, and shared.
3. Compliance
Speech-to-text engines should comply with relevant legal and regulatory frameworks, such as data protection laws, e.g., GDPR, and industry-specific regulations, e.g., HIPAA. Hence the list of certifications mentioned in the first paragraph.
4. Anonymization and Aggregation
Speech-to-text engines should employ techniques like anonymization or aggregation to dissociate individual user identities from speech data if they use that data for training; a minimal redaction sketch follows this list.
5. User Control and Transparency
Speech-to-text engines should offer users control over their data, allowing them to opt in or out of data collection, specify data retention periods, and exercise choices regarding data usage. Ideally, the default setting should be opt-out, and enterprises should provide clear information about the implications of opting in. For instance, it should be made clear whether human reviewers may listen to users' conversations to prepare the data for training.
6. Ethical Considerations
Speech-to-text engines should adhere to ethical guidelines, ensuring fairness, inclusivity, and non-discrimination in speech processing. In addition, models should have diverse training datasets to minimize biases and avoid perpetuating societal inequalities.
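To make item 4 concrete, below is a minimal, illustrative sketch of transcript redaction in Python. The names and patterns (`REDACTION_PATTERNS`, `redact_transcript`) are hypothetical and assume a simple regex-based approach; production systems typically rely on dedicated PII-detection (NER) models and locale-aware rules instead.

```python
import re

# Hypothetical redaction patterns; real pipelines would use trained PII/NER
# models and locale-aware rules rather than a handful of regexes.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"(?:\+?\d{1,2}[\s.-]?)?(?:\(\d{3}\)|\d{3})[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_transcript(transcript: str) -> str:
    """Replace recognizable PII spans with category placeholders."""
    for label, pattern in REDACTION_PATTERNS.items():
        transcript = pattern.sub(f"[{label}]", transcript)
    return transcript


if __name__ == "__main__":
    raw = "Call me at (555) 123-4567 or email jane.doe@example.com."
    print(redact_transcript(raw))  # Call me at [PHONE] or email [EMAIL].
```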
Ensuring speech-to-text Data Privacy, Security, and Compliance is a challenging task. If you use cloud speech-to-text APIs, you need to ask your vendor these questions. However, vendors' responses may not reflect reality. For example, the FTC's complaint against Alexa alleged that Amazon engaged in deceptive practices by claiming that Alexa was privacy-conscious while its data collection and use practices violated Section 5 of the FTC Act and the COPPA Rule. Amazon agreed to pay $25M to settle.
How to ensure privacy?
Use on-device speech-to-text engines like Leopard and Cheetah. The easiest way to protect your data is not to share it! Picovoice engines bring voice AI to the data rather than sending the data to 3rd parties for processing.
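As a sketch of what this looks like in practice, the snippet below uses the pvleopard Python SDK to transcribe a file entirely on-device. The AccessKey and audio path are placeholders, and the return values and word metadata fields shown reflect one version of the SDK and may differ in others.

```python
import pvleopard

# AccessKey obtained from the Picovoice Console (placeholder below).
leopard = pvleopard.create(access_key="${YOUR_ACCESS_KEY}")

try:
    # Inference runs locally: the audio never leaves the device.
    transcript, words = leopard.process_file("conversation.wav")
    print(transcript)
    for word in words:
        print(f"{word.word}\t{word.start_sec:.2f}s - {word.end_sec:.2f}s")
finally:
    leopard.delete()
```

Cheetah follows the same on-device principle with a streaming API for real-time transcription.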
Picovoice does not have a list of certifications because Picovoice does not have access to the audio data processed by Picovoice engines or their outputs, e.g., transcripts. Picovoice applies privacy-by-design principles: it takes proactive measures rather than reactive ones, ensuring the Privacy, Security, and Compliance of the audio data and transcripts by not creating Privacy, Security, and Compliance risks and threats in the first place.
Build with on-device speech-to-text models and ensure Data Privacy, Security, and Compliance!