Using your face to unlock your phone is a pretty genius security protocol. But as with any advanced technology, hackers and thieves are always up to the challenge, whether that’s unlocking your phone with your face while you sleep or using a photo from social media to do the same.
As with every other human biometric identification system before it (fingerprints, retina scans), there are still significant security flaws in some of the most advanced identity verification technology. Electrical and computer engineering researchers at Brigham Young University decided there is a better and more secure way to use your face for restricted access.
It’s called Concurrent Two-Factor Identity Verification (C2FIV), and it requires both one’s facial identity and a specific facial motion to gain access. To set it up, a user faces a camera and records a short one- to two-second video of either a unique facial motion or a lip movement from reading a secret phrase. The video is then fed into the device, which extracts the facial features and the features of the facial motion and stores them for later ID verification.
The biggest problem the researchers are trying to solve is making sure the identity verification process is intentional. If someone is unconscious, you can still use their finger to unlock a phone and get access to their device, or you can scan their retina. You see this a lot in the movies. Think of Ethan Hunt in Mission: Impossible, who even uses masks to replicate someone else’s face.
To get technical, C2FIV relies on an integrated neural network framework to learn facial features and actions concurrently. This framework models dynamic, sequential data like facial motions, where all the frames in a recording have to be considered (unlike a static photo with a figure that can be outlined).
Using this integrated neural network framework, the user’s facial features and movements are converted into an embedding that is stored on a server or in an embedded device. When the user later attempts to gain access, the computer compares the newly generated embedding to the stored one. The user’s ID is verified if the similarity between the new and stored embeddings meets a certain threshold.
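To make the matching step concrete, here is a minimal sketch of comparing a stored embedding with a newly generated one. The article does not say which similarity measure or threshold C2FIV uses, so the cosine similarity, the 0.8 threshold, and the names cosine_similarity and verify are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def verify(stored, candidate, threshold=0.8):
    """Accept the attempt if the embeddings are similar enough.

    The threshold value is hypothetical; a real system would tune it
    on held-out data to balance false accepts against false rejects.
    """
    if len(stored) != len(candidate):
        raise ValueError("embeddings must have the same length")
    return cosine_similarity(stored, candidate) >= threshold


if __name__ == "__main__":
    enrolled = [0.9, 0.1, 0.4]        # embedding saved at setup
    same_user = [0.88, 0.12, 0.41]    # close to the enrolled embedding
    impostor = [-0.2, 0.9, -0.3]      # points in a very different direction
    print(verify(enrolled, same_user))
    print(verify(enrolled, impostor))
```

Raising the threshold makes the check stricter, which rejects more impostors but also more legitimate attempts.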
In a preliminary study, the researchers recorded 8,000 video clips from 50 subjects making facial movements such as blinking, dropping their jaw, smiling, or raising their eyebrows, as well as many random facial motions, to train the neural network. They then created a dataset of positive and negative pairs of facial motions and assigned higher scores to the positive pairs (those that matched). Currently, with the small dataset, the trained neural network verifies identities with over 90% accuracy. They are confident the accuracy can be much higher with a larger dataset and improvements to the network.
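The pairing step could look something like the sketch below. The article does not define what counts as a match, so this assumes a positive pair is two clips of the same subject making the same motion, with every other combination treated as negative. The tuple layout and the build_pairs name are hypothetical.

```python
from itertools import combinations


def build_pairs(clips):
    """Build labeled clip pairs for training.

    clips: list of (subject_id, motion_label, clip_id) tuples.
    Returns a list of (clip_id_a, clip_id_b, score), where score is 1
    for an assumed positive pair (same subject and same motion) and 0
    for a negative pair.
    """
    pairs = []
    for a, b in combinations(clips, 2):
        same_subject = a[0] == b[0]
        same_motion = a[1] == b[1]
        score = 1 if (same_subject and same_motion) else 0
        pairs.append((a[2], b[2], score))
    return pairs


if __name__ == "__main__":
    clips = [
        ("s1", "smile", "c1"),
        ("s1", "smile", "c2"),
        ("s1", "blink", "c3"),
        ("s2", "smile", "c4"),
    ]
    for pair in build_pairs(clips):
        print(pair)
```

Under this assumption, the right face with the wrong motion and the right motion from the wrong face both count as negatives, which matches the idea that both factors must be present at once.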
The idea is not to compete with Apple or have the application be all about smartphone access. In the researchers’ view, C2FIV has broader applications, including accessing restricted areas at a workplace, online banking, ATM use, safe deposit box access, or even hotel room entry or keyless entry and access to your vehicle.