2019: The Year of the Face
By Paul Melcher, republished from Kaptur
[Scroll down for "And a few more things...": CES + other imaging industry news highlights]
It’s been ten years since Instagram launched, and not long after came the selfie. It has taken visual recognition the same amount of time to learn how to read our faces. If anything, 2019 was the year faces took center stage in visual tech, for good and for bad…
All your faces are belong to us
The most frequent use of our faces is facial recognition for surveillance and security. It has been the most visible because it is the most controversial. Fueled by a competitive market made up of a combination of state-owned agencies and private enterprises, facial recognition has been at the forefront of commercial image recognition services. Scores of small, medium, and large companies compete to grab shares of what is expected to become a $7 to $15 billion market by 2024.
As with much of A.I., these companies disregard ethics in favor of profits. A client is always offered a solution, regardless of who they are and what usage they intend. The only criterion is their ability to pay.
There are three components to facial recognition: the algorithm, the teaching pool, and the index.
- Facial recognition algorithms are probably the most mature in the visual recognition space, mostly because they have been studied for a long time, do not need to consider context, and, with no offense to anyone, a face is a rather easy object to recognize. Today’s algorithms take into account anywhere from 80 to 500 data points on a face (like the distance between the eyes or the width of the nose) to create a unique digital fingerprint of a person.
- The teaching pool is the set of images used to train the algorithm to recognize a person. The more images of one person you have, the better it is at identifying them. Countries and companies with more forceful leadership have the most complete data pools.
- The index is what allows you to take a new image, compare it with the content of your data pool, and retrieve a match. Probably the simplest and most straightforward part of the process.
Imagga, one of the vendors of face recognition solutions
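To make that three-part split concrete, here is a minimal sketch in Python of the general approach, not any vendor’s product: embed_face is a hypothetical stand-in for a real embedding model, the teaching pool is a set of labeled example images, and the index is a simple nearest-neighbor lookup over averaged embeddings.

```python
import numpy as np

# Hypothetical stand-in for a real face-embedding model: a production system
# would run a neural network over a detected face and return a vector built
# from dozens to hundreds of facial measurements ("data points").
def embed_face(image: np.ndarray, dim: int = 128) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    vec = rng.normal(size=dim)
    return vec / np.linalg.norm(vec)   # unit-length "digital fingerprint"

# 1. Teaching pool: labeled example images of each known person (toy arrays here).
teaching_pool = {
    "alice": [np.ones((8, 8)), np.ones((8, 8)) * 2],
    "bob":   [np.zeros((8, 8))],
}

# 2. Index: one averaged embedding per person, ready for fast lookup.
index = {
    name: np.mean([embed_face(img) for img in images], axis=0)
    for name, images in teaching_pool.items()
}

# 3. Matching: embed a new image and return the closest identity,
#    rejecting weak matches below a similarity threshold.
def identify(image: np.ndarray, threshold: float = 0.5) -> str:
    query = embed_face(image)
    name, score = max(
        ((n, float(np.dot(query, e) / np.linalg.norm(e))) for n, e in index.items()),
        key=lambda pair: pair[1],
    )
    return name if score >= threshold else "unknown"

print(identify(np.ones((8, 8))))   # matches "alice" in this toy setup
```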
Clearly, getting the most extensive data pool is key to success, and this is where all the effort goes. In countries with little respect for individual liberties, like China, it takes the form of an almost complete categorization of the population for policing purposes. Others, like the USA, use a more subtle approach: starting with felons, expanding to travelers, and perhaps one day even accessing our selfies to create their own data pool. The result, however, will be the same.
Soon our faces will be part of databases whether we like it or not. They will be used partly for policing, partly for marketing. Either way, we will have little or nothing to say about how we are being tracked.
Of course, there are positives, like instant identification and personalization. Financial companies see facial recognition as a potent, unique identifier for accessing accounts and transactions, and phone manufacturers use it to let people protect their information. It could soon be used in cars and homes, replacing keys and triggering personal comfort settings.
The greatest challenge is, as with any new technology, the lack of appropriate legislation. Without a proper debate on how, where, and when facial recognition is used, it is bound to slide into abusive and destructive applications quickly.
It is not real, or is it?
Deepfakes exploded in 2019, not so much in volume as in becoming yet another source of concern brought by technology. And while anything can be replicated via synthetic data, deepfakes mostly replace faces: those of celebrities put on porn stars, or faces of politicians made to say or do disturbing things.
As with facial recognition, ethical barriers are no match for destructive intent. And while our attention is entirely focused on deepfake faces, the real damage will come from unsuspected content, probably with no faces or unrecognizable ones involved. As long as deepfakes play in the realm of famous people, movie stars, and politicians, they will be easily identifiable: those involved will certainly loudly report the deception.
Example of a deepfake video using an SNL sketch video and Hillary Clinton’s real face.
Buying a new face
2019 also saw the breakthrough of GAN faces. The launch of the website thispersondoesnotexist.com was a revelation to many. Before, it took a human, some Photoshop skills, and some time to generate a photo of someone who didn’t exist. It was a creative process. With generative adversarial networks, or GANs, not anymore. An unsupervised computer can get to the same result in milliseconds and with incredible accuracy.
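For the curious, the adversarial recipe behind these images can be summarized in a few lines. The toy sketch below (in PyTorch, learning a simple 2-D distribution rather than faces) only illustrates the generator-versus-discriminator idea; production face generators such as StyleGAN apply the same principle at a vastly larger scale.

```python
import torch
import torch.nn as nn

# Toy GAN: learn to imitate samples from a shifted 2-D Gaussian. Real face
# generators follow the same adversarial recipe, just with far bigger
# networks and millions of training photos.
real_data = lambda n: torch.randn(n, 2) * 0.5 + torch.tensor([2.0, -1.0])

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))                 # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())   # sample -> P(real)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Train the discriminator to separate real samples from generated ones.
    real = real_data(64)
    fake = G(torch.randn(64, 8)).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Train the generator to fool the discriminator into calling fakes real.
    fake = G(torch.randn(64, 8))
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

print(G(torch.randn(5, 8)))   # generated samples should drift toward the real distribution
```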
Recognizing faces is one of mankind’s most expert attributes. We are hardwired to recognize faces. It is what babies recognize first: the faces of mom and pop. It is how we navigate society, quickly identifying gender, age, origin, intent, social status, and many other non-verbal cues.
Generated.photos, 100,000 faces of people who have never existed.
The ability of an algorithm to generate, all by itself and with no human supervision, a face that humans recognize as real is thus an enormous leap. Already available as free stock photos for anyone to use (fake avatars, anyone?), computer-generated faces will soon completely replace real models everywhere they are used in photos. Next, it will be videos. And not far away, you will see deepfakes with perfectly reproduced faces along with generated voices of people who never, ever existed.
GAN-generated faces might become, quite ironically, one of the most vigorous defenses against abuses of facial recognition technology. Using fake faces, one could create double or triple parallel identities just to navigate identification barriers.
There is no doubt that what emerged in 2019 will continue through 2020 and beyond. The desire to control, combined with the pleasure of being recognized, will undoubtedly make our faces a central point of our visual technology. It will help us open doors and suggest food or clothes, while at the same time quickly separating the good guys from the bad. The question is, who decides who and what is bad?
Opening image by Sharon McCutcheon from Pexels
Paul Melcher is the founder of Kaptur and Managing Director of Melcher System, a consultancy for visual technology firms. He is an entrepreneur, advisor, and consultant with a rich background in visual tech, content licensing, business strategy, and technology, with more than 20 years of experience developing world-renowned photo-based companies, including two successful exits.
And a few more things...
Snapchat & AI Factory. Going once, going twice… Snapchat quietly acquired AI Factory, the company behind its new Cameos feature, reportedly for $166M. AI Factory was founded by Victor Shaburov, who previously founded Looksery and introduced it in 2014 at Visual 1st (then still called Mobile Photo Connect) before it was acquired by Snapchat in 2015, reportedly for $150M.
Circle Graphics & Metromedia Technologies. Circle Graphics announced it acquired out-of-home advertising rival Metromedia Technologies (MMT). Terms weren’t disclosed. Founded in 1987, MMT pioneered digital printing for outdoor advertising, by effectively replacing hand-painted billboards.
Moonpig. Moonpig, a division of UK-based Photobox and an online seller of greeting cards, hit the £100M ($130M) sales mark. Moonpig delivers over 16 million cards a year.
Vivo, Oppo, Xiaomi. Camera vendors with ill-received camera-to-phone transfer solutions: take note! Three Chinese phone vendors have formed an alliance to enable their users to transfer files between their mobile devices with speed and ease of use along the lines of Apple’s AirDrop, without the need to download third-party apps or consume network data. The data transfer capability taps Bluetooth and Wi-Fi peer-to-peer (P2P), or Wi-Fi Direct, and touts an average transfer speed of 20Mbps. It does not disrupt the devices’ internet Wi-Fi connections. Now I just want my camera to talk to my phone this way!
***Fresh from my trip to CES*** There’s too much to report on, including new cameras announced by Canon and Nikon and new phones by Samsung, but here are a few interesting new products that jumped out at me:
Quibi. The question I always get: What was the most exciting thing at CES? This year, I vote for a service rather than a physical or software product, and one which I believe will fundamentally disrupt how consumers watch professional video content. At CES, Meg Whitman and Jeffrey Katzenberg unveiled the details of their new startup, Quibi. What is Quibi? Simply put, it’s Netflix for short-form, mobile-only video. Quibi will offer on-the-go, bite-sized professional movies (or movies in bite-size chapters), documentaries, and daily shows of up to 10 minutes.
Why mobile-only? Quibi’s content is created from scratch for smartphone watching, giving creators the opportunity to creatively leverage any or all of the phone’s clock, gyroscope, light measurement, camera, or touch screen functions. Most impressively, all content is shot and edited separately for how the user holds their phone. When the user rotates their phone from portrait to landscape or vice versa, the video automatically continues with footage optimized for viewing it that way.
With Katzenberg’s Hollywood movie industry and Whitman’s Silicon Valley tech industry experience, combined with a laser-sharp focus on producing bite-sized premium content for smartphones, Quibi is slated to succeed in fundamentally disrupting how we’ll all view professionally created video content in the future. Oh, and did I mention they’ve already raised $1.4 billion and sold out their first 12 months of advertising inventory?
Insta360 & Leica. Going modular. 2018 Visual 1st Best of Show Award winner Insta360 announced One R, a modular camera system that uses Insta360’s video app and gives the user the option to plug in the Insta360 360-degree camera, their GoPro-style action camera or a new wide-angle module with a 1-inch sensor developed in partnership with Leica.
PowerVision. DJI, take note! PowerVision launched PowerEgg X, an AI-powered drone camera that can also be used as a handheld 3-axis gimbal camera or as a personal AI camera. According to the company, it took three years and over 300 engineers to develop PowerEgg X, and the product covers over 100 patents.
Bosch. Fast forward to the future with a great example of what we call heuristic imaging at Visual 1st, i.e. intelligent machines that make autonomous decisions based on visual data. At CES, Bosch introduced the Virtual Visor, a transparent LCD screen paired with a driver-facing camera used to track the sun shining onto the driver's face. The system employs AI to locate facial features (including eyes, mouth, and nose) in order to track shadows as they move across the driver's face. A patented algorithm then pinpoints where the driver's eyes are and selectively blocks (darkens) and unblocks sections of the Virtual Visor in real time to keep the sun out of the driver's eyes. The Virtual Visor's LCD panel darkens only the portion of the visor between the sun and the driver's eyes; the rest of the panel remains transparent, opening up a larger field of vision for the driver while blocking the sun.
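Bosch has not published the details of its patented algorithm, but the general idea of mapping detected eye positions to the visor cells that need to go dark can be sketched roughly as follows. The face-landmark detection and the camera-to-visor calibration are stubbed out with hypothetical helpers; this is an illustration of the concept, not Bosch's implementation.

```python
import numpy as np

GRID_ROWS, GRID_COLS = 6, 12       # the visor modeled as a grid of LCD cells

def detect_eye_positions(frame):
    """Hypothetical stand-in for a face-landmark model: a real system would
    return the driver's eye coordinates detected in the camera frame."""
    h, w = frame.shape[:2]
    return [(0.45 * w, 0.40 * h), (0.55 * w, 0.40 * h)]

def eye_to_visor_cell(eye_xy, frame_shape):
    """Map an eye position in the image to the visor cell shading it.
    A real system would also factor in the sun's direction and calibration."""
    h, w = frame_shape[:2]
    row = min(int(eye_xy[1] / h * GRID_ROWS), GRID_ROWS - 1)
    col = min(int(eye_xy[0] / w * GRID_COLS), GRID_COLS - 1)
    return row, col

def visor_mask(frame):
    """Return a boolean grid: True = darken this LCD cell, False = keep it clear."""
    mask = np.zeros((GRID_ROWS, GRID_COLS), dtype=bool)
    for eye in detect_eye_positions(frame):
        r, c = eye_to_visor_cell(eye, frame.shape)
        mask[r, c] = True          # block only the cells that shade the eyes
    return mask

frame = np.zeros((480, 640, 3), dtype=np.uint8)   # stand-in camera frame
print(visor_mask(frame).astype(int))              # 1s mark the darkened cells
```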
Bellus 3D. Bellus 3D, last year’s Visual 1st Best of Show winner, is expanding beyond handheld smartphone scanning. At CES, the company launched Bellus3D ARC, a multi-camera 3D face scanning solution that captures commercial-grade, full 3D face scans with the click of a single button in less than 3 seconds. Bellus3D ARC is a configurable arrangement of up to seven ARC smart depth-sensing Wi-Fi cameras with no moving parts, and it requires no movement by the subject.
Visual 1st 2020. Mark your calendar: Visual 1st 2020 will be Oct. 14-15, San Francisco.
Click here to automatically receive Visual 1st Perspectives in your mailbox.
Best,
Hans Hartman