Improving German PII Redaction Accuracy for GDPR Compliance

I am currently auditing our Sentiment Analysis data for GDPR compliance. I have noticed that the ‘PII Redaction’ feature in Genesys Cloud is occasionally missing sensitive information (like home addresses) in the transcript, which then shows up in the sentiment analysis reports. How can I improve the accuracy of the native PII redaction engine to ensure that all sensitive German address formats are caught before they are stored in our analytics database?

Hello Yui14! I manage our APAC region and we deal with many different address formats too. To improve the redaction accuracy, start with the ‘Dialect’ setting in your speech-to-text engine. If you are using the standard ‘German’ dialect, it might not be trained on specific regional address patterns. You should also check the ‘Confidence Threshold’ for redaction. If it is set too high, the engine redacts only the matches it is most confident about and leaves lower-confidence matches in the transcript, which would explain the misses you are seeing. I have a detailed list of the best settings for EU compliance if you want to see them!
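
To make the threshold point concrete, here is a minimal Python sketch of how a confidence cutoff decides what gets masked. This is not the Genesys Cloud engine or its API: the `Detection` class, the `span` and `redact` helpers, and the scores are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    start: int    # offset of the first character of the match
    end: int      # offset one past the last character of the match
    label: str    # entity type, e.g. "ADDRESS"
    score: float  # detector confidence between 0.0 and 1.0 (hypothetical values)


def span(text: str, fragment: str, label: str, score: float) -> Detection:
    """Build a Detection for the first occurrence of `fragment` in `text`."""
    start = text.index(fragment)
    return Detection(start, start + len(fragment), label, score)


def redact(text: str, detections: list[Detection], threshold: float) -> str:
    """Mask every detection whose score is at least `threshold`."""
    out = text
    # Work from right to left so earlier offsets stay valid after each replacement.
    for d in sorted(detections, key=lambda d: d.start, reverse=True):
        if d.score >= threshold:
            out = out[:d.start] + f"[{d.label}]" + out[d.end:]
    return out


text = "Ich wohne in der Hauptstraße 12, 10115 Berlin."
detections = [
    span(text, "Hauptstraße 12", "ADDRESS", 0.62),  # lower-confidence match
    span(text, "10115 Berlin", "ADDRESS", 0.91),    # higher-confidence match
]

strict = redact(text, detections, threshold=0.9)   # the street survives
lenient = redact(text, detections, threshold=0.6)  # both parts are masked
print(strict)
print(lenient)
```

With the stricter cutoff the street and house number stay in the transcript, which is the kind of miss described above. Lowering the cutoff catches more, at the cost of more false positives, so it is worth testing against a sample of real transcripts before changing it.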

Hello everyone! I am a migration specialist and I love working on these compliance problems! Yui14, if the native engine is missing things, you might want to consider adding a ‘Pre-Processing’ step that runs before the transcripts are stored in your analytics database. You can use a Data Action to send your transcripts to an external NLP engine (like Amazon Comprehend) that may have better support for German PII; check the service's language coverage for PII detection before you commit to it. It is a bit more complex to set up, but it gives you an extra layer of protection for your GDPR audits!
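
If you want to see what a second pass might look like before wiring up an external service, here is a rough Python sketch that masks common German address shapes with regular expressions. It is only an illustration under my own assumptions: the patterns, the suffix list, and the `redact_german_addresses` name are hypothetical, it does not call Genesys Cloud or Amazon Comprehend, and it will miss multi-word street names (for example "Am Alten Markt 5") and numbers that the transcript spells out as words.

```python
import re

# A single word ending in a common German street suffix, followed by a house
# number and an optional letter, e.g. "Hauptstraße 12" or "Goethestr. 5a".
STREET = re.compile(
    r"\b[\w-]*(?:straße|strasse|str\.|weg|platz|allee|gasse|ring|damm)"
    r"\s+\d{1,4}(?:\s?[a-zA-Z])?\b",
    re.IGNORECASE,
)

# A five-digit postal code followed by a capitalized place name,
# e.g. "10115 Berlin" or "80331 München".
POSTCODE_CITY = re.compile(r"\b\d{5}\s+[A-ZÄÖÜ][\w-]+")


def redact_german_addresses(text: str, mask: str = "[ADDRESS]") -> str:
    """Replace street/number and postcode/city matches with `mask`."""
    text = STREET.sub(mask, text)
    return POSTCODE_CITY.sub(mask, text)


first = redact_german_addresses("Ich wohne in der Hauptstraße 12, 10115 Berlin.")
second = redact_german_addresses(
    "Bitte senden Sie es an Goethestr. 5a in 80331 München."
)
print(first)
print(second)
```

Patterns like these produce false positives (any word ending in "weg" followed by a number will match), so I would treat them as a safety net behind the native engine and review a sample of the masked output, not as a replacement for it.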