{"id":659,"date":"2021-05-11T08:30:00","date_gmt":"2021-05-11T12:30:00","guid":{"rendered":"https:\/\/www.macloo.com\/ai\/?p=659"},"modified":"2021-05-10T11:28:41","modified_gmt":"2021-05-10T15:28:41","slug":"identifying-toxic-comments-with-ai","status":"publish","type":"post","link":"https:\/\/www.macloo.com\/ai\/2021\/05\/11\/identifying-toxic-comments-with-ai\/","title":{"rendered":"Identifying toxic comments with AI"},"content":{"rendered":"\n<p>The basic idea: Immediately detect and remove hateful or dangerous posts in social media and other online forums. With advances in natural language processing (NLP), identification of harmful speech becomes more accurate and more practical.<\/p>\n\n\n\n<p>In <a rel=\"noreferrer noopener\" href=\"https:\/\/www.scientificamerican.com\/article\/can-ai-identify-toxic-online-content\/\" target=\"_blank\">this essay<\/a> published in <em>Scientific American<\/em> (2021), researchers from the private company <a rel=\"noreferrer noopener\" href=\"https:\/\/www.unitary.ai\/\" target=\"_blank\">Unitary<\/a> (see their public Detoxify code <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/unitaryai\/detoxify\" target=\"_blank\">on GitHub<\/a>) discuss the challenges in rating the level of toxicity or harmfulness in text content. One aspect is what is considered harmful: profanity is easy to detect; misinformation is complicated. Another aspect: Terms describing gender, race, or ethnicity can be used hatefully or as (non-toxic) self-description. <\/p>\n\n\n\n<p>(I&#8217;ve written before about machine learning used in <a rel=\"noreferrer noopener\" href=\"https:\/\/www.macloo.com\/ai\/2020\/09\/07\/comment-moderation-as-a-machine-learning-case-study\/\" target=\"_blank\">comment moderation<\/a>, which is a large concern in media companies that permit users to post comments on articles and blog posts.) <\/p>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https:\/\/jigsaw.google.com\/\" target=\"_blank\">Jigsaw<\/a>, a Google division, &#8220;released two public data sets containing over one million toxic and non-toxic comments from Wikipedia and a service called Civil Comments.&#8221; Each comment was labeled with a rating such as \u201cToxic\u201d  or \u201cVery Toxic.\u201d The data sets were used as training data in three competitions, hosted by Google, in which AI researchers could enter their trained models and see how they compared to others (and win money). 
The three &#8220;Jigsaw challenges&#8221; (one per year):<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><a rel=\"noreferrer noopener\" href=\"https:\/\/www.kaggle.com\/c\/jigsaw-toxic-comment-classification-challenge\" target=\"_blank\">Toxic Comment Classification Challenge<\/a> (2018)<\/li><li><a rel=\"noreferrer noopener\" href=\"https:\/\/www.kaggle.com\/c\/jigsaw-unintended-bias-in-toxicity-classification\" target=\"_blank\">Jigsaw Unintended Bias in Toxicity Classification<\/a> (2019)<\/li><li><a rel=\"noreferrer noopener\" href=\"https:\/\/www.kaggle.com\/c\/jigsaw-multilingual-toxic-comment-classification\" target=\"_blank\">Jigsaw Multilingual Toxic Comment Classification<\/a> (2020)<\/li><\/ul>\n\n\n\n<p>.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p>&#8220;We decided to take inspiration from the best Kaggle solutions and train our own algorithms with the specific intent of releasing them publicly.&#8221;<\/p><cite><em>\u2014 Unitary researchers<\/em><\/cite><\/blockquote>\n\n\n\n<p>The Unitary researchers describe <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/unitaryai\/detoxify\" target=\"_blank\">Detoxify<\/a>, &#8220;an open-source, user-friendly comment detection library,&#8221;  which is intended &#8220;to help researchers and practitioners identify potential toxic comments.&#8221; The library includes three separate models, one for each Jigsaw challenge. These models can be fine-tuned using additional data sets.<\/p>\n\n\n\n<p>One particular limitation pointed out by the researchers is that <strong><em>a high toxicity score does not always indicate actually toxic content<\/em><\/strong>: &#8220;As an example, the sentence &#8216;I am tired of writing this stupid essay&#8217; will give a toxicity score of 99.7 percent, while removing the word &#8216;stupid&#8217; will change the score to 0.05 percent.&#8221;<\/p>\n\n\n\n<p>There&#8217;s still a long way to go before harmful comments and social media posts can be instantly removed from platforms. <\/p>\n\n\n\n<p><a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/\"><img decoding=\"async\" alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https:\/\/i.creativecommons.org\/l\/by-nc-nd\/4.0\/88x31.png\"><\/a><br>\n<small><span xmlns:dct=\"http:\/\/purl.org\/dc\/terms\/\" property=\"dct:title\"><strong>AI in Media and Society<\/strong><\/span> by <span xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" property=\"cc:attributionName\">Mindy McAdams<\/span> is licensed under a <a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/\">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License<\/a>.<br>\nInclude the author&#8217;s name (Mindy McAdams) and a link to the original post in any reuse of this content.<\/small><\/p>\n\n\n\n<p>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The basic idea: Immediately detect and remove hateful or dangerous posts in social media and other online forums. With advances in natural language processing (NLP), identification of harmful speech becomes more accurate and more practical. 
There's still a long way to go before harmful comments and social media posts can be instantly removed from platforms.

---

**AI in Media and Society** by Mindy McAdams is licensed under a [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License](http://creativecommons.org/licenses/by-nc-nd/4.0/). Include the author's name (Mindy McAdams) and a link to the original post in any reuse of this content.