Pythian develops prototype for toxic sentiment detection for online game developer

Toxic conversations between players are a widespread issue in the online gaming world. One of the largest developers of multiplayer online games turned to Google and Pythian for help with a potential solution: detecting toxicity in gameplay conversations by analyzing the audio stream.

 

Business Need

The company has existing toxicity detection models that work well on instant messaging chats, and it wanted to extend them to spoken online conversations. A prototype was in order.
 

Solution

YouTube hosts a large volume of gameplay footage, much of it containing toxic content. Pythian developed a prototype that transcribes YouTube videos using the Google Cloud Speech-to-Text API and then runs toxicity detection models on the transcripts to identify video fragments containing toxic comments and conversations between players.

To develop the solution, Pythian identified the Speech-to-Text recognition mode and options that worked best for gameplay conversations, which involve many participants speaking with different accents and a range of emotions. Pythian also optimized the recognition of gaming slang.

The solution dealt with the lack of labeled toxic gameplay content by applying transfer learning to publicly available toxicity data. This turned the problem into a well-known supervised learning task, which was addressed with TensorFlow. And by aggregating several lower-accuracy transcriptions produced by different recognition engines, Pythian was able to produce more accurate toxicity sentiment results.
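The aggregation idea can be illustrated with a minimal Python sketch. It is not Pythian's implementation: the case study does not say how transcriptions were combined, so averaging per-transcript scores is an assumption, and `toy_toxicity_score`, `PLACEHOLDER_LEXICON`, `aggregate_toxicity` and `is_toxic` are hypothetical names. The keyword scorer merely stands in for a trained model such as the TensorFlow classifier mentioned above.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical placeholder lexicon; a real system would use a trained
# toxicity classifier rather than a word list.
PLACEHOLDER_LEXICON = {"idiot", "trash", "loser"}


def toy_toxicity_score(text: str) -> float:
    """Stand-in scorer: fraction of words found in the placeholder lexicon."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,!?") in PLACEHOLDER_LEXICON)
    return hits / len(words)


def aggregate_toxicity(
    transcripts: Sequence[str],
    score: Callable[[str], float] = toy_toxicity_score,
) -> float:
    """Average the score over alternative transcripts of one audio fragment.

    Averaging is an assumed aggregation rule, chosen for simplicity.
    """
    if not transcripts:
        raise ValueError("need at least one transcript")
    return mean(score(t) for t in transcripts)


def is_toxic(
    transcripts: Sequence[str],
    threshold: float = 0.2,
    score: Callable[[str], float] = toy_toxicity_score,
) -> bool:
    """Flag a fragment when the aggregated score reaches the threshold."""
    return aggregate_toxicity(transcripts, score) >= threshold


# Three engines disagree on the last word of the same fragment.
variants = [
    "you are such an idiot",
    "you are such an idea",
    "you're such an idiot",
]
print(round(aggregate_toxicity(variants), 3))
```

Because each engine tends to make different recognition errors, a single mistranscribed word (here "idea") lowers the combined score without erasing the signal from the other transcripts. The threshold value is arbitrary and would need tuning against labeled data.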
 

Result

The company can now evaluate the transcribed results against in-house labeled data to assess the quality.
 

Technologies

Google Cloud Platform, Google Cloud Speech-to-Text API, Google Cloud Storage, Google Kubernetes Engine, Google Cloud Memorystore, Google Stackdriver, TensorFlow