Face Recognition in Unconstrained Environment
Abstract
The goal of computer vision is to give machines the ability to understand image data and infer
useful information from it. These inferences depend heavily on the quality of the image data. In
many real-world applications, however, we encounter poor-quality images with low discriminative power,
which degrades the performance of computer vision algorithms. In particular, in the field of biometrics,
the performance of face recognition systems is significantly affected when the face images have poor
resolution and are captured under uncontrolled pose and illumination conditions, as in surveillance settings.
In this thesis, we propose algorithms to match low-resolution probe images captured under
non-frontal pose and poor illumination conditions with high-resolution gallery faces captured in
frontal pose and good illumination, which are often available from enrollment.
Many standard metric learning and dictionary learning approaches perform quite well in
matching faces across different domains, but they require the locations of several landmark points, such as
the corners of the eyes, nose, and mouth, during both training and testing. Locating these points is difficult, especially
for low-resolution images under non-frontal pose. In the first algorithm of this thesis, we propose
a multi-dimensional scaling based approach to learn a common transformation matrix for the entire
face which simultaneously transforms the facial features of the low-resolution and the high-resolution
training images such that the distance between them approximates the distance had both images been
captured under the same controlled imaging conditions. The locations of the fiducial points are needed
only during the training stage, to learn the transformation matrix. To reduce the computational
cost of the algorithm, we further propose a reference-based face recognition approach that
trades some recognition performance for efficiency.
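The distance-matching idea behind the first algorithm can be sketched as follows. This is a minimal
illustration under stated assumptions, not the thesis implementation: it assumes a plain linear
transform W, uses random vectors in place of real facial features, and the helper names
pairwise_dist and mds_stress are hypothetical. It only evaluates an MDS-style stress, i.e. how far
the transformed low-resolution to high-resolution distances are from the distances measured when
both images are high-resolution.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mds_stress(W, lr_feats, hr_feats, target_dist):
    """Sum of squared gaps between transformed LR-HR distances and target distances.

    W is a single transformation applied to both domains (an assumption of this sketch).
    """
    d = pairwise_dist(lr_feats @ W, hr_feats @ W)
    return float(((d - target_dist) ** 2).sum())


# Synthetic stand-ins for facial features (illustrative only).
rng = np.random.default_rng(0)
hr = rng.normal(size=(5, 8))                 # "high-resolution" features
lr = hr + 0.5 * rng.normal(size=(5, 8))      # degraded "low-resolution" features

# Target: distances had both images been captured under controlled conditions.
target = pairwise_dist(hr, hr)

W = np.eye(8)                                # untrained transform
stress_degraded = mds_stress(W, lr, hr, target)
stress_ideal = mds_stress(W, hr, hr, target)
```

Training would then search for the W that minimizes this stress over the training pairs; the
optimizer is omitted here because the abstract does not specify it.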
In the second approach of this thesis, we propose a novel deep convolutional neural network architecture
to address low-resolution face recognition by systematically introducing different kinds
of constraints at different layers of the architecture, so that the approach can recognize low-resolution
images as well as generalize well to images of unseen categories.
Though coupled dictionary learning has emerged as a powerful technique for matching data samples
across domains, most of the frameworks demand one-to-one paired training samples. In practical
surveillance face recognition problems, there can be just one high-resolution image and many low-resolution
images of each subject for training, so there is no exact one-to-one correspondence
between the images from the two domains. The third algorithm proposes an orthogonal dictionary learning and
alignment approach for handling this problem. In this part, we also address the heterogeneous face
recognition problem, where the gallery images are captured by an RGB camera and the probe images are
captured by a near-infrared (NIR) camera.
We further explore the more challenging problem of low-resolution heterogeneous face recognition,
where the probe faces are low-resolution NIR images, since NIR images are increasingly being
captured for recognizing faces in low-light/night-time conditions. We develop a re-ranking framework
to address the problem. To encourage further research in this field, we have also collected
our own database, HPR (Heterogeneous face recognition across Pose and Resolution), which contains facial
images captured by two surveillance-quality NIR cameras and one high-resolution visible camera,
with significant variations in head pose and resolution. Extensive experiments are conducted on
each of the proposed approaches to demonstrate their effectiveness and usefulness.