I am the Speech Analytics manager and I am completely fed up with our topic detection accuracy! We are seeing a massive spike in ‘Billing Inquiry’ topic hits, but when I listen to the calls, the customer is actually talking about a technical issue. It turns out the engine is triggering on the word ‘pay’ even when it is part of a different phrase or just background noise. I have tried adjusting the confidence threshold, but now it is missing actual billing calls! Is there a way to exclude specific phrases or set a ‘Negative Keyword’ list for topic detection, or am I stuck with these garbage metrics until the next engine update?
I understand the technical challenge. In my workforce planning models, inaccurate topic tagging ruins our long-term forecasts for specialized queues. You cannot set a simple ‘Negative Keyword’ list in the standard Topic Workbench.
Instead, refine the ‘Phrases’ within the topic definition, and use the ‘Must Not Include’ logic if your specific engine version supports it.
Alternatively, you can increase the number of required phrases or use proximity operators like NEAR, so that the word ‘pay’ has to appear close to a billing term before the topic registers a hit.
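To make the exclusion and proximity logic concrete, here is a rough Python sketch. It is not Topic Workbench syntax and it does not call any product API. The function names, the context words, the excluded phrase, and the window size are all assumptions chosen for illustration.

```python
import re

def tokenize(text):
    """Split lowercased text into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def near(tokens, a, b, window):
    """True if words a and b occur within `window` tokens of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

def is_billing_hit(transcript,
                   context_words=("bill", "invoice", "balance"),  # assumed billing terms
                   excluded_phrases=("pay attention",),           # assumed exclusion list
                   window=5):                                     # assumed NEAR distance
    """Register a hit only if 'pay' appears near a billing context word."""
    text = transcript.lower()
    # Mask excluded phrases first so they cannot contribute a 'pay' token.
    for phrase in excluded_phrases:
        text = text.replace(phrase, " ")
    tokens = tokenize(text)
    return any(near(tokens, "pay", word, window) for word in context_words)

print(is_billing_hit("I want to pay my bill today"))
print(is_billing_hit("Please pay attention, my router keeps dropping"))
```

The sketch masks excluded phrases instead of rejecting the whole call, so a call that contains both ‘pay attention’ and ‘pay my bill’ would still count as billing. Whether your engine's ‘Must Not Include’ behaves that way is something to confirm in your version.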
Hello! I create many training modules for Speech Analytics tuning. One thing I always tell my students is that topic detection is only as good as the transcript quality. If background noise is causing false positives, you should check your ‘Speech-to-Text’ settings first.
Sometimes, using a different dialect model reduces the cases where background noise gets transcribed as words. Also, you should try to use more specific phrases.
Instead of just ‘pay’, use ‘I want to pay’ or ‘my payment failed’. It is much better to have a few more specific phrases than one very broad word that hits everything! I hope this helps you get cleaner data!
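If you want to see the difference before changing the topic, a small Python check like the one below can help. It is only a sketch: the sample transcripts are invented for illustration, `matching_transcripts` is a hypothetical helper, and whole-word matching is an assumption about how the engine treats phrases.

```python
import re

def matching_transcripts(transcripts, phrases):
    """Return the transcripts that contain any phrase as whole words."""
    patterns = [re.compile(r"\b" + re.escape(p.lower()) + r"\b") for p in phrases]
    return [t for t in transcripts if any(p.search(t.lower()) for p in patterns)]

# Invented sample transcripts, not real call data.
SAMPLE_TRANSCRIPTS = [
    "i want to pay my bill",
    "my payment failed twice yesterday",
    "please pay attention the modem light is red",
    "the display on my phone is frozen",
]

broad = matching_transcripts(SAMPLE_TRANSCRIPTS, ["pay"])
specific = matching_transcripts(SAMPLE_TRANSCRIPTS, ["i want to pay", "my payment failed"])

print("broad:", broad)
print("specific:", specific)
```

On these samples the broad word ‘pay’ catches the ‘pay attention’ call and misses the failed payment, while the two specific phrases catch both billing calls and nothing else.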