{"id":200,"date":"2020-08-25T10:13:04","date_gmt":"2020-08-25T14:13:04","guid":{"rendered":"https:\/\/www.macloo.com\/ai\/?p=200"},"modified":"2020-08-28T10:10:21","modified_gmt":"2020-08-28T14:10:21","slug":"face-detection-without-a-deep-neural-network","status":"publish","type":"post","link":"https:\/\/www.macloo.com\/ai\/2020\/08\/25\/face-detection-without-a-deep-neural-network\/","title":{"rendered":"Face detection without a deep neural network"},"content":{"rendered":"\n<p>I was surprised when I watched this video about how most face detection works. Granted, this is <em>not<\/em> face recognition (identifying the specific person). Face detection looks at an image or video and can almost instantly point out all the human faces. In a consumer camera, this is part of the code that  puts a rectangle around each person&#8217;s face while you&#8217;re framing your shot.<\/p>\n\n\n\n<p>What&#8217;s wonderful in the video is how the Viola\u2013Jones object detection framework is illustrated and explained so that even we non-math types can understand it.<\/p>\n\n\n\n<figure class=\"wp-block-embed-youtube aligncenter wp-block-embed is-type-video is-provider-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"jetpack-video-wrapper\"><iframe loading=\"lazy\" title=\"Detecting Faces (Viola Jones Algorithm) - Computerphile\" width=\"739\" height=\"416\" src=\"https:\/\/www.youtube.com\/embed\/uEJ71VlUmMQ?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe><\/div>\n<\/div><\/figure>\n\n\n\n<p>Like the game cases <a href=\"https:\/\/www.macloo.com\/ai\/2020\/08\/24\/what-is-called-ai-but-really-isnt\/\">I wrote about yesterday<\/a>, this is a case where tried-and-true algorithms are used, but deep neural networks are not.<\/p>\n\n\n\n<p>As is typical with AI, there is a model. How does the code identify a human face? 
It &#8220;knows&#8221; some things about the shape and proportions of human faces. But it knows these attributes (features) not as noses and eyes and mouths \u2014 as we humans do. Instead, it knows them as rectangular shapes that map very well to the pixels in a digital image.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"932\" src=\"https:\/\/www.macloo.com\/ai\/wp-content\/uploads\/2020\/08\/viola_jones_features.png\" alt=\"\" class=\"wp-image-204\" srcset=\"https:\/\/www.macloo.com\/ai\/wp-content\/uploads\/2020\/08\/viola_jones_features.png 1024w, https:\/\/www.macloo.com\/ai\/wp-content\/uploads\/2020\/08\/viola_jones_features-300x273.png 300w, https:\/\/www.macloo.com\/ai\/wp-content\/uploads\/2020\/08\/viola_jones_features-768x699.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption><em>Above: Graphic from Viola and Jones (2001) \u2014 <a rel=\"noreferrer noopener\" href=\"https:\/\/www.cs.cmu.edu\/~efros\/courses\/LBMV07\/Papers\/viola-cvpr-01.pdf\" target=\"_blank\">PDF<\/a><\/em><\/figcaption><\/figure>\n\n\n\n<p>Make sure you stay with the video until 3:30, when <a rel=\"noreferrer noopener\" href=\"https:\/\/www.nottingham.ac.uk\/research\/inspiring-people\/fellows\/michael-pound.aspx\" target=\"_blank\">Mike Pound<\/a> begins to draw on paper. (This drawing-by-hand is a large part of why I love the videos from <a rel=\"noreferrer noopener\" href=\"https:\/\/www.youtube.com\/channel\/UC9-y-6csu5WGm29I7JiwpnA\" target=\"_blank\">Computerphile<\/a>!)  At 8:30 he begins drawing a face to show how the algorithm analyzes that segment of an image.<\/p>\n\n\n\n<p>The one part that might not be clear (depending on how much time you spend thinking about <em>pixels in images<\/em>) is that the numbers in the grid he draws represent <em>values of lightness or darkness<\/em> in the image. In all cases, computers require knowledge to be represented as numbers. 
When dealing with images, the differences between those numbers are what matter. To compare sections of an image with other sections, the numeric values for one section are added up and compared with the sum of numeric values from another section.<\/p>\n\n\n\n<p>The animations in the final three minutes of the video provide an awesomely clear explanation of how the regions of the image are assessed and quickly discarded as &#8220;not a face&#8221; or retained for further examination.<\/p>\n\n\n\n<p>Computers are lightning-fast at these kinds of calculations. This method is so efficient that it runs rapidly even on simple hardware, which is why it has been in use since 2002.<\/p>\n\n\n\n<p><a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/\"><img decoding=\"async\" alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https:\/\/i.creativecommons.org\/l\/by-nc-nd\/4.0\/88x31.png\"><\/a><br>\n<small><span xmlns:dct=\"http:\/\/purl.org\/dc\/terms\/\" property=\"dct:title\"><strong>AI in Media and Society<\/strong><\/span> by <span xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" property=\"cc:attributionName\">Mindy McAdams<\/span> is licensed under a <a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/\">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License<\/a>.<br>\nInclude the author&#8217;s name (Mindy McAdams) and a link to the original post in any reuse of this content.<\/small><\/p>\n\n\n\n<p>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was surprised when I watched this video about how most face detection works. Granted, this is not face recognition (identifying the specific person). Face detection looks at an image or video and can almost instantly point out all the human faces. 
In a consumer camera, this is part of the code that puts a&hellip; <a class=\"more-link\" href=\"https:\/\/www.macloo.com\/ai\/2020\/08\/25\/face-detection-without-a-deep-neural-network\/\">Continue reading <span class=\"screen-reader-text\">Face detection without a deep neural network<\/span> <span class=\"meta-nav\" aria-hidden=\"true\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[43,3],"tags":[21],"class_list":["post-200","post","type-post","status-publish","format-standard","hentry","category-algorithms","category-image-recognition","tag-face_recognition"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/posts\/200","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/comments?post=200"}],"version-history":[{"count":10,"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/posts\/200\/revisions"}],"predecessor-version":[{"id":242,"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/posts\/200\/revisions\/242"}],"wp:attachment":[{"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/media?parent=200"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.macloo.com\/ai\/wp-json\/wp\/v2\/categories?post=200"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mac
loo.com\/ai\/wp-json\/wp\/v2\/tags?post=200"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}