Deep Learning for Iris Recognition: A Survey
Abstract.
In this survey, we provide a comprehensive review of more than 200 papers, technical reports, and GitHub repositories published over the last 10 years on the recent developments of deep learning techniques for iris recognition, covering broad topics on algorithm designs, open-source tools, open challenges, and emerging research. First, we conduct a comprehensive analysis of deep learning techniques developed for two main sub-tasks in iris biometrics: segmentation and recognition. Second, we focus on deep learning techniques for the robustness of iris recognition systems against presentation attacks and via human-machine pairing. Third, we delve deep into deep learning techniques for forensic application, especially in post-mortem iris recognition. Fourth, we review open-source resources and tools in deep learning techniques for iris recognition. Finally, we highlight the technical challenges, emerging research trends, and outlook for the future of deep learning in iris recognition.
1. Introduction
The human iris is a sight organ that controls the amount of light reaching the retina by changing the size of the pupil. The texture of the iris is fully developed before birth, its minutiae do not depend on genotype, it stays relatively stable across a lifetime (except for disease- and normal aging-related biological changes), and it may even be used for forensic identification shortly after a subject's death (Muron and Pospisil, 2000; Daugman, 2016; Trokielewicz et al., 2019).
In terms of its information theory-related properties, the iris texture is highly random and stable (permanent) over time, providing an exceptionally high entropy per mm² that justifies its higher discriminating power when compared to other biometric modalities (e.g., face or fingerprint). The iris' collectability is another feature of interest and has been the subject of discussion over the last few years: while the iris can be acquired using commercial off-the-shelf (COTS) hardware, either handheld or stationary, data can even be collected at a distance, up to tens of meters away from the subjects (Nguyen et al., 2017a). Even though commercial visible-light (RGB) cameras are able to image the iris, near infrared (NIR) sensing dominates in most applications, due to the better visibility of the iris texture for darker eyes, rich in melanin pigment, which is characterized by lower light absorption in the NIR spectrum compared to shorter wavelengths. In addition, NIR wavelengths are barely perceivable by the human eye, which augments users' comfort and avoids the pupil contraction/dilation that would appear under visible light.
A seminal work by John Daugman brought to the community the Gabor filtering-based approach that became the dominant approach for iris recognition (Daugman, 1993, 2007, 2021). Even though subsequent solutions to iris image encoding and matching appeared, the IrisCode approach is still dominant due to its ability to effectively search massive databases with a minimal probability of false matches and extreme time performance. By considering binary words, pairs of signatures are matched using XOR parallel-bit logic at lightning speed, enabling millions of comparisons per second per processing core. Also, most of the methods that outperformed the original techniques in terms of effectiveness do not work under the one-shot learning paradigm, assume multiple observations of each class to obtain appropriate decision boundaries, and - most importantly - have encoding/matching steps with a time complexity that forbids their use in large environments (in particular, for all-against-all settings).
In short, Daugman's algorithm encodes the iris image into a binary sequence of 2,048 bits by filtering the iris image with a family of Gabor kernels. The varying pupil size is rectified by a Cartesian-to-polar coordinate system transformation, to end up with an image representation of canonical size, guaranteeing an identical structure of the iris code independently of the iris and pupil size. This makes it possible to use the Hamming Distance (HD) to measure the similarity between two iris codes (Daugman, 2021). Its low false match rate at acceptable false non-match rates is the key factor behind the success of global-scale iris recognition installations, such as the Aadhaar national person identification and border security program in India (with over 1.2 billion pairs of irises enrolled) (Unique Identification Authority of India, 2021), the Homeland Advanced Recognition Technology (HART) in the US (up to 500 million identities) (Planet Biometrics, 2017), or the NEXUS system, designed to speed up border crossings for low-risk and pre-approved travelers moving between Canada and the US.
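To make the matching step concrete, the snippet below is a minimal NumPy sketch of mask-aware fractional Hamming distance computation between two binary iris codes, including the circular shifts commonly used to compensate for eye rotation; the 16×128 code layout, shift range, and function names are illustrative assumptions rather than Daugman's actual implementation.

```python
import numpy as np

def hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance between two binary iris codes,
    counting only bits that are valid (unoccluded) in both masks."""
    valid = mask_a & mask_b                  # bits usable in both codes
    disagreeing = (code_a ^ code_b) & valid  # XOR marks differing bits
    n_valid = valid.sum()
    return disagreeing.sum() / n_valid if n_valid > 0 else 1.0

def best_shifted_distance(code_a, code_b, mask_a, mask_b, max_shift=8):
    """Compensate for in-plane eye rotation by circularly shifting one code
    (columns of the 2D code correspond to angular positions) and keeping
    the smallest distance."""
    return min(
        hamming_distance(code_a,
                         np.roll(code_b, s, axis=1),
                         mask_a,
                         np.roll(mask_b, s, axis=1))
        for s in range(-max_shift, max_shift + 1)
    )

# Toy usage with random 2,048-bit codes arranged as 16 x 128 binary maps.
rng = np.random.default_rng(0)
code_a = rng.integers(0, 2, (16, 128), dtype=np.uint8)
code_b = rng.integers(0, 2, (16, 128), dtype=np.uint8)
mask = np.ones((16, 128), dtype=np.uint8)
print(best_shifted_distance(code_a, code_b, mask, mask))
```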
Deep learning-based methods, in particular those using various Convolutional Neural Network architectures, have been driving remarkable improvements in many computer vision applications over the last decade. In terms of biometric technologies, it is not surprising that iris recognition has also seen an increasing adoption of purely data-driven approaches at all stages of the recognition pipeline: from preprocessing (such as off-axis gaze correction) and segmentation to encoding and matching. Interestingly, however, the impact of deep learning on the various stages of the iris recognition pipeline is uneven. One of the primary goals of this survey paper is to assess where deep learning has helped in achieving higher-performing and more secure systems, and which procedures did not benefit from more complex modeling.
The remainder of the paper is structured as follows. Sections 2 and 3 review the application of deep learning in two main stages of the recognition pipeline: segmentation and recognition (encoding and comparison). Sections 4 and 5 analyze the state of the art of deep learning-based approaches in two applications: Presentation Attack Detection (PAD) and forensics. Section 6 investigates how humans and machines can pair to improve deep learning-based iris recognition. Section 7 focuses on approaches for less controlled environments in iris and periocular analysis. Section 8 reviews public resources and tools available in the deep learning-based iris recognition domain. Section 9 focuses on the future of deep learning for iris recognition, with a discussion of emerging research directions in different aspects of iris analysis. The paper is concluded in Section 10.
2. Deep Learning-Based Iris Segmentation
The segmentation of the iris is seen as an extremely challenging problem. As illustrated in Fig. 1, segmenting the iris essentially involves three tasks: detecting and parameterizing the inner (pupillary) and outer (scleric) biological boundaries of the iris, and locally discriminating between the noise-free and noisy regions inside the iris ring, which are subsequently used in the feature encoding and matching processes.
This problem has motivated numerous research works for decades. From the pioneering integro-differential operator (Daugman, 1993) up to subsequent handcrafted techniques based on active contours and neural networks (e.g., (He et al., 2009), (Proença, 2010), (Shah and Ross, 2009) and (Vatsa et al., 2008)), a long road has been traveled on this problem. Despite an obvious evolution in the effectiveness of such techniques, they all face particular difficulties in the case of heavily degraded data. Images are frequently motion-blurred, poorly focused, partially occluded and off-angle. Additionally, in the case of visible-light data, severe reflections from the environment surrounding the subjects are visible and further increase the difficulty of the segmentation task.
Recently, as in many other computer vision tasks, DL-based frameworks have been advocated as providing consistent advances over the state of the art for the iris segmentation problem, with numerous models being proposed. A cohesive perspective of the most relevant recent DL-based methods is given in Table 1, with the techniques appearing in chronological (and then alphabetical) order. The type of data each model aims to handle is given in the "Data" column, along with the datasets where the corresponding experiments were carried out and a summary of the main characteristics of each proposal ("Features" column). Here, considering that the models were empirically validated in completely heterogeneous ways and using very different metrics, we decided not to include the summary performance of each model/solution.
Schlett et al. (Schlett et al., 2018) provided a multi-spectral analysis to improve iris segmentation accuracy in visible wavelengths by preprocessing data before the actual segmentation phase, extracting multiple spectral components in the form of RGB color channels. Even though this approach does not propose a DL-based framework, the different versions of the input could be easily used to feed DL-based models and augment the robustness to non-ideal data. Chen et al. (Chen et al., 2019a) used CNNs that include dense blocks, referred to as a dense-fully convolutional network (DFCN), where the encoder part consists of dense blocks and the decoder counterpart obtains the segmentation masks via transpose convolutions. Hofbauer et al. (Hofbauer et al., 2019) parameterize the iris boundaries based on segmentation maps yielded by a CNN, using a cascaded architecture with four RefineNet units, each directly connecting to one Residual net. Huynh et al. (Huynh et al., 2019) discriminate between three distinct eye regions with a DL model, and remove incorrect areas with heuristic filters. The proposed architecture is based on the encoder-decoder model, with depth-wise convolutions used to reduce the computational cost. Roughly at the same time, Li et al. (Li et al., 2021) described the Interleaved Residual U-Net model for semantic segmentation and iris mask synthesis. In this work, unsupervised techniques (K-means clustering) were used to create intermediary pictorial representations of the ocular region, from which saliency points deemed to belong to the iris boundaries were found. Kerrigan et al. (Kerrigan et al., 2019) assessed the performance of four different convolutional architectures designed for semantic segmentation. Two of these models were based on dilated convolutions, as proposed by Yu and Koltun (Y. and K., 2016). Wu and Zhao (Wu and Zhao, 2019) described the Dense U-Net model, which combines dense layers with the U-Net network. The idea is to take advantage of the reduced set of parameters of the dense U-Net, while keeping the semantic segmentation capabilities of U-Net. The proposed model integrates dense connectivity into the U-Net contraction and expansion paths. Compared with traditional CNNs, this model is claimed to reduce learning redundancy and enhance information flow, while keeping the number of parameters of the model under control. Wei et al. (Zhang et al., 2019) suggested performing dilated convolutions, which are claimed to obtain more consistent global features. In this setting, convolutional kernels are not continuous, with zero values being artificially inserted between each non-zero position, increasing the receptive field without augmenting the number of parameters of the model.
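As a small illustration of the dilated-convolution idea mentioned above, the following PyTorch sketch (with assumed channel sizes) shows that increasing the dilation rate enlarges the receptive field while leaving the parameter count unchanged:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 64, 64)  # dummy eye-image batch

# Same kernel size, different dilation rates: identical parameter count,
# but the dilated kernel covers a 5x5 (instead of 3x3) neighborhood.
conv_standard = nn.Conv2d(3, 16, kernel_size=3, padding=1, dilation=1)
conv_dilated = nn.Conv2d(3, 16, kernel_size=3, padding=2, dilation=2)

print(sum(p.numel() for p in conv_standard.parameters()))  # 448 parameters
print(sum(p.numel() for p in conv_dilated.parameters()))   # also 448 parameters
print(conv_standard(x).shape, conv_dilated(x).shape)       # same spatial size
```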
Method | Year | NIR | VW | Datasets | Features |
Schlett et al. (Schlett et al., 2018) | 2018 | ✗ | ✓ | MobBIO | Preprocessing (combines different possibilities of the input RGB channels) |
Trokielewicz and Czajka (Trokielewicz and Czajka, 2018) | 2018 | ✓ | ✓ | Warsaw-Post-Mortem v1.0 | Fine-tuned CNN (SegNet) |
Chen et al. (Chen et al., 2019a) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | Dense CNN |
Hofbauer et al. (Hofbauer et al., 2019) | 2019 | ✓ | ✗ | IITD, CASIA-Irisv4-Interval, ND-Iris-0405 | Cascaded architecture of four RefineNet, each connecting to one Residual net |
Huynh et al. (Huynh et al., 2019) | 2019 | ✓ | ✗ | OpenEDS | MobileNetV2 + heuristic filtering postproc. |
Li et al. (Anisetti et al., 2019) | 2019 | ✓ | ✗ | CASIA-Iris-Thousand | Faster-R-CNN (ROI detection) |
Kerrigan et al. (Kerrigan et al., 2019) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, BioSec, ND-Iris-0405, UBIRIS.v2, Warsaw-Post-Mortem v2.0, ND-TWINS-2009-2010 | ResNet + SegNet (with dilated convolutions) |
Wu and Zhao (Wu and Zhao, 2019) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, UBIRIS.v2 | Dense-U-Net (dense layers + U-Net) |
Wei et al. (Zhang et al., 2019) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, ND-IRIS-0405, UBIRIS.v2 | U-Net with dilated convolutions |
Fang and Czajka (Fang and Czajka, 2020) | 2020 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v1.0 & v2.0 | Fine-tuned CC-Net (Mishra et al., 2019) |
Ganeva and Myasnikov (Ganeeva and Myasnikov, 2020) | 2020 | ✓ | ✗ | MMU | U-Net, LinkNet, and FC-DenseNet (performance comparison) |
Jalilian et al. (Jalilian et al., 2020) | 2020 | ✓ | ✗ | — | RefineNet + morphological postprocessing |
Sardar et al. (Sardar et al., 2020) | 2020 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | Squeeze-Expand module + active learning (interactive segmentation) |
Trokielewicz et al. (Trokielewicz et al., 2020) | 2020 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v1.0 & v2.0 | Fine-tuned SegNet (Badrinarayanan et al., 2017) |
Wang et al. (Wang et al., 2020b) | 2020 | ✓ | ✓ | CASIA-Iris-M1-S1/S2/S3, MICHE-I | Hourglass network |
Wang et al. (Wang et al., 2020a) | 2020 | ✓ | ✓ | CASIA-v4-Distance, UBIRIS.v2, MICHE-I | U-Net + multi-task attention net + postproc. (probabilistic masks priors & thresholding) |
Li et al. (Li et al., 2021) | 2021 | ✓ | ✗ | CASIA-Iris-Thousand | IRU-Net network |
Wang et al. (Wang et al., 2021) | 2021 | ✗ | ✓ | Online Video Streams and Internet Videos | U-Net and SqueezeNet for iris segmentation and eye-closure detection |
Kuehlkamp et al. (Kuehlkamp et al., 2022) | 2022 | ✓ | ✓ | ND-Iris-0405, CASIA, BATH, BioSec, UBIRIS, Warsaw-Post-Mortem v2.0 | Fine-tuning of the Mask R-CNN architecture |
More recently, Ganeva and Myasnikov (Ganeeva and Myasnikov, 2020) compared the effectiveness of three convolutional neural network architectures (U-Net, LinkNet, and FC-DenseNet), determining the optimal parameterization for each one. Jalilian et al. (Jalilian et al., 2020) introduced a scheme to compensate for texture deformations caused by off-angle distortions, re-projecting the off-angle images back to a frontal view. The architecture used is a variant of RefineNet (Lin et al., 2016), which provides high-resolution predictions while preserving the boundary information (required for parameterization purposes).
The idea of interactive learning for iris segmentation was suggested by Sardar et al. (Sardar et al., 2020), describing an interactive variant of U-Net that includes Squeeze Expand modules. Trokielewicz et al. (Trokielewicz et al., 2020) used DL-based iris segmentation models to extract highly irregular iris texture areas in post-mortem iris images. They used a pre-trained SegNet model, fine-tuned with a database of cadaver iris images. Wang et al. (Wang et al., 2020b) (further extended in (Wang et al., 2019b)) described a lightweight deep convolutional neural network specifically designed for iris segmentation of degraded images acquired by handheld devices. The proposed approach jointly obtains the segmentation mask and parameterized pupillary/limbic boundaries of the iris.
Observing that edge-based information is extremely difficult to obtain reliably from degraded data, Li et al. (Anisetti et al., 2019) presented a hybrid method that combines edge-based information with deep learning frameworks. A compacted Faster R-CNN-like architecture was used to roughly detect the eye and define the initial region of interest, from which the pupil is further located using a Gaussian mixture model. Wang et al. (Wang et al., 2021) trained a deep convolutional neural network (DCNN) that automatically extracts the iris and pupil pixels of each eye from input images. This work combines the power of U-Net and SqueezeNet to obtain a compact CNN suitable for real-time mobile applications. Finally, Wang et al. (Wang et al., 2020a) parameterize both the iris mask and the inner/outer iris boundaries jointly, by actively modeling such information into a unified multi-task network.
A final word is given to segmentation-less techniques. Assuming that the accurate segmentation of the iris boundaries is one of the hardest phases of the whole recognition chain and the main source of recognition errors, some recent works have proposed performing biometric recognition on non-segmented or roughly segmented data (Proença and Neves, 2017; Proença and Neves, 2019). Here, the idea is to use the remarkable discriminating power of DL frameworks to perceive the agreeing patterns between pairs of images, even on such segmentation-less representations.
3. Deep Learning-Based Iris Recognition
3.1. Deep Learning Models as a Feature Extractor
As illustrated in Fig. 2, the idea here is to analyze a dimensionless representation of the iris data and produce a feature vector that lies in a hyperspace (embedding) where recognition is carried out.
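A minimal sketch of this embedding-based pipeline is given below; the ImageNet-pretrained ResNet50 backbone, input size, and decision threshold are illustrative assumptions and do not correspond to any specific method reviewed in this section.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Pre-trained backbone used as a generic feature extractor (final classifier removed).
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def embed(iris_batch):
    """Map a batch of (normalized) iris images to L2-normalized embeddings."""
    return F.normalize(backbone(iris_batch), dim=1)

# Two dummy iris images resized to the backbone's expected input.
probe, gallery = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
similarity = F.cosine_similarity(embed(probe), embed(gallery)).item()
decision = "match" if similarity > 0.5 else "non-match"  # threshold is arbitrary here
print(similarity, decision)
```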
In this context, Boyd et al. (Boyd et al., 2019) explored five different sets of weights for the popular ResNet50 architecture to test whether iris-specific feature extractors perform better than models trained for general tasks. Minaee et al. (Minaee et al., 2016) studied the application of deep features extracted from VGG-Net for iris recognition, with the authors observing that the resulting features can be well transferred to biometric recognition. Luo et al. (Luo et al., 2021) described a DL model with spatial attention and channel attention mechanisms, which are directly inserted into the feature extraction module. Also, a co-attention mechanism adaptively fuses features to obtain representative iris-periocular features. Hafner et al. (Hafner et al., 2021) adapted the classical Daugman pipeline, using convolutional neural networks as feature extractors. The DenseNet-201 architecture outperformed its competitors, achieving state-of-the-art results in both open-world and closed-world settings. Menotti et al. (Menotti et al., 2015) assessed how DL-based feature representations can be used in spoofing detection, observing that spoofing detection systems based on CNNs can be robust against known attacks and can be adapted, with little effort, to image-based attacks that are yet to come.
Yang et al. (Yang et al., 2021) generated multi-level spatially corresponding feature representations with an encoder-decoder structure. Also, a spatial attention feature fusion module was used to ensemble the resulting features more effectively. Chen et al. (Chen et al., 2020) addressed the large-scale recognition problem and described an optimized center loss function (tight center) to attenuate the insufficient discriminating power of the cross-entropy function. Nguyen et al. (Nguyen et al., 2017b) explored the performance of state-of-the-art pre-trained CNNs on iris recognition, concluding that off-the-shelf generic CNN features are also extremely good at representing iris images, effectively extracting discriminative visual features and achieving promising results. Zhao et al. (Zhao et al., 2019) proposed a method based on the capsule network architecture, in which a modified routing algorithm based on the dynamic routing between two capsule layers was described, with three pre-trained models (VGG16, InceptionV3, and ResNet50) extracting the primary iris features. Next, a convolutional capsule replaces the fully connected capsule to reduce the number of parameters. Wang and Kumar (Wang and Kumar, 2019) introduced the concept of residual features for iris recognition. They described a residual network learning procedure with offline triplet selection and dilated convolutional kernels.
Other works have addressed the extraction of appropriate feature representations in multi-biometrics settings: Damer et al. (Damer et al., 2019) propose to jointly extract multi-biometric representations within a single DNN. Unlike previous solutions that create independent representations from each biometric modality, they create these representations from multi-modality (face and iris), multi-instance (left and right irises), and multi-presentation (two face samples) data, which can be seen as a data-level fusion policy. Finally, concerned about the difficulty of performing reliable recognition on handheld devices, Odinokikh et al. (Odinokikh et al., 2019) combined the advantages of handcrafted feature extractors and advanced deep learning techniques. The model utilizes shallow and deep feature representations in combination with characteristics describing the environment, to reduce the intra-subject variations expected in this kind of environment.
3.2. Deep Learning-based Iris Matching Strategies
The existing matching strategies can be grouped into three categories: (1) conventional classifiers, such as SVM, RF, and Sparse Representation; (2) softmax-based losses; and (3) pairwise-based losses. A cohesive perspective of the most relevant recent DL-based methods is given in Table 2, with the techniques appearing in chronological (and then alphabetical) order.
3.2.1. Conventional classifiers
Various researchers have been using deep learning networks designed and pre-trained on the ImageNet dataset to extract iris feature representations, followed by a conventional classifier such as SVM, RF, Sparse Representation, etc. (Nguyen et al., 2017b; Boyd et al., 2019; Boyd et al., 2020b). The key benefit of these approaches is the simplicity of “plug and play”, where proven and pre-trained deep learning networks inherited from large-scale computer vision challenges are widely available and ready to be used (Nguyen et al., 2017b). Another benefit is that there is no need for large-scale iris image datasets to train these networks, because they have already been trained on large-scale datasets such as ImageNet. Considering that these networks usually contain hundreds of layers and millions of parameters, and require millions of images to train, using pre-trained networks is extremely beneficial.
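The following is a minimal sketch of this "plug and play" strategy, assuming that fixed-length deep features have already been extracted by a pre-trained backbone (as illustrated in the previous section); the feature dimensionality, number of identities, and SVM hyperparameters are illustrative stand-ins.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for deep features of iris images: 600 samples, 2048-D vectors, 20 identities.
rng = np.random.default_rng(0)
features = rng.normal(size=(600, 2048))
labels = rng.integers(0, 20, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, stratify=labels, random_state=0)

# Linear SVM on top of frozen deep features; no fine-tuning of the backbone is needed.
classifier = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
classifier.fit(X_train, y_train)
print("identification accuracy:", classifier.score(X_test, y_test))
```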
3.2.2. Iris Classification Networks
Iris classification networks couple deep learning architectures with a family of softmax-based losses to classify an iris image into a list of known identities. Coupling a softmax loss with a backbone network enables training the backbone network in an end-to-end manner via popular optimization strategies such as back-propagation and steepest gradient descent. Compared to the conventional classifier approaches, the DL-based backbones in this category are learnable directly from the iris data, allowing them to better represent the iris. The key benefit is that the task is similar to generic image classification, hence all designs and algorithms developed for generic image classification can be trivially applied to iris image data. Typical examples of these iris classification networks are (Gangwar and Joshi, 2016; Boyd et al., 2019). However, these softmax-based networks require the iris in the test image to belong to one of the identity classes in the training set, which means the networks must be re-trained whenever a new class (i.e., a new identity) is added. Gangwar et al. proposed two backbone networks (i.e., DeepIrisNet-A and DeepIrisNet-B) followed by a softmax loss for the iris recognition task (Gangwar and Joshi, 2016). Later, they proposed another backbone network, still followed by a softmax loss, to classify one normalized iris image into a pre-defined list of identities (Gangwar et al., 2019).
Backbone Network Architectures: A wide range of backbone network architectures have been borrowed from generic image classification for the iris recognition task, due to the similarity between the two tasks.
3.2.3. Iris Similarity Networks
Iris similarity networks couple deep learning architectures with a family of pairwise-based losses to learn a metric representing how similar or dissimilar two iris images are without knowing their identities. The pairwise loss aims to pull images of the same iris closer and push images of different irises away in the similarity distance space. Different from the iris classification networks, which only operate in an identification mode on a pre-defined identity list, iris similarity networks operate in both verification and identification modes with an open set of identities (Zhao and Kumar, 2017b). Typical examples of these iris similarity networks are (Liu et al., 2016b; Zhao and Kumar, 2017b; Wang and Kumar, 2019; Nguyen et al., 2020; Jalilian et al., 2022). There are three key benefits of these networks: (i) verification and identification: iris similarity networks operate in both verification and identification modes; (ii) open set of identities: iris similarity networks operate on an open set of identities; and (iii) explicit reflection: iris similarity networks directly and explicitly reflect what we want to achieve, i.e., small distances between irises of the same subject and larger distances between irises of different subjects.
Pairwise loss: Nianfeng et al. (Liu et al., 2016b) proposed a pairwise network, which accepts two input images and directly outputs a similarity score. They designed a pairwise layer that accepts two input images and encodes their features via a backbone network. The backbone network is trained iteratively to minimize the dissimilarity distance between genuine pairs (pairs of the same identity) and maximize the dissimilarity distance between impostor pairs (pairs of different identities).
Triplet loss: Since the pairwise network is trained with separate genuine and impostor pairs, it may not converge well, as has been shown in face recognition (Schroff et al., 2015). Rather than using one pair of two images to update the training in each iteration, as in the pairwise loss, the triplet loss employs a triplet of three images: an anchor image, a positive image with the same identity, and a negative image with a different identity (Schroff et al., 2015). The backbone network is trained to simultaneously minimize the similarity distance between the positive and the anchor images and maximize the distance between the negative and the anchor images. Tailored to iris images, Zhao et al. (Zhao and Kumar, 2017b; Wang and Kumar, 2019; Zhao and Kumar, 2019) proposed the Extended Triplet Loss (ETL) to incorporate a bit-shifting operation to deal with rotation in the normalized iris images. Nguyen et al. also employed the ETL for their iris recognition network (Nguyen et al., 2020, 2022). Kuehlkamp et al. (Kuehlkamp et al., 2022) proposed to improve the generic triplet loss function for iris recognition by forcing the distance to be positive (through the use of a sigmoid output layer) and adding a logarithmic penalty to the error. This modification allows the network to learn even when the difference between samples is negative and to converge faster. Yan et al. (Yan et al., 2021) extended the generic triplet loss to a batch triplet loss, in which the triplet loss is calculated over a batch of subjects and images for each subject. The batch triplet loss is usually expected to yield a smoother loss function. Yang et al. (Yang et al., 2021) improved the triplet selection method for training with Batch Hard (Yuan et al., 2020).
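The sketch below illustrates the general idea of a triplet loss made tolerant to in-plane rotation by circularly shifting one feature map along the angular axis and keeping the minimum distance, in the spirit of the Extended Triplet Loss; it is an illustrative simplification (the margin, shift range, and distance function are assumptions), not the exact formulation of (Zhao and Kumar, 2017b).

```python
import torch

def shift_min_distance(feat_a, feat_b, max_shift=4):
    """Minimum mean squared distance over horizontal (angular) shifts of feat_b.
    feat_*: (B, C, H, W) feature maps computed from normalized iris images."""
    dists = [((feat_a - torch.roll(feat_b, s, dims=-1)) ** 2).mean(dim=(1, 2, 3))
             for s in range(-max_shift, max_shift + 1)]
    return torch.stack(dists, dim=0).min(dim=0).values

def shift_tolerant_triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull the genuine pair together and push the impostor pair apart."""
    d_pos = shift_min_distance(anchor, positive)
    d_neg = shift_min_distance(anchor, negative)
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()

# Toy usage with random feature maps.
a, p, n = (torch.randn(8, 32, 16, 64) for _ in range(3))
print(shift_tolerant_triplet_loss(a, p, n))
```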
Backbone Network Architectures: Different from the classification iris networks, similarity iris networks are usually designed with their own network architectures and are usually much “shallower” than their classification counterparts.
• FCN: All similarity iris networks employ Fully Convolutional Networks (FCNs) instead of CNNs. Compared to CNNs, FCNs (Long et al., 2015) do not have fully connected layers, allowing the output map to preserve the original spatial information. This is important for iris recognition, since the output map can preserve spatial correspondence with the original input image (Zhao and Kumar, 2017b; Nguyen et al., 2020), thus enabling pixel-to-pixel matching. Zhao et al. (Zhao and Kumar, 2017b) proposed an FCN architecture with 3 convolutional layers, followed by activation and pooling layers. The outputs of the convolutional layers are up-sampled to the original input image size. The up-sampled features are stacked and convolved by another convolutional layer to generate a 2-dimensional feature map with the same size as the input image. Later, they extended the backbone network with dilated convolutions (Wang and Kumar, 2019). Yan et al. (Yan et al., 2021) employed a ResNet architecture and fine-tuned it with the triplet loss. Kuehlkamp et al. only used a part of the ResNet architecture.
• NAS: Nguyen et al. (Nguyen et al., 2020) proposed to learn the network architecture directly from data rather than hand-designing it or using generic image classification architectures. They proposed a differentiable Neural Architecture Search (NAS) approach that models the architecture design process as a bi-level constrained optimization problem. This approach is not only able to search for the optimal network which achieves the best possible performance, but it can also impose constraints on resources such as model size or number of computational operations.
• Complex-valued: Observing that there is an intrinsic difference between iris texture and generic object-based images, in that the iris texture is stochastic, without consistent shapes, edges, or semantic structure, Nguyen et al. (Nguyen et al., 2022) argued that the network architecture has to be better tailored to incorporate domain-specific knowledge in order to reach the full potential in the iris recognition setting. Another observation they made is that a majority of well-known handcrafted features, such as the IrisCode (Daugman, 2007), first transform the iris texture image into a complex-valued representation and then further encode that complex-valued representation to obtain the final representation. They proposed to use fully complex-valued networks rather than the popular real-valued networks. Complex-valued backbone networks better retain the phase, are more invariant to multi-scale, multi-resolution and multi-orientation inputs, and have a solid correspondence with the classic Gabor wavelets (Tygert et al., 2016); hence they are much better suited to iris recognition than their real-valued counterparts (a minimal illustrative sketch of a complex-valued convolution is given after this list).
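To make the complex-valued idea concrete, the sketch below shows one common way of realizing a complex convolution with two real-valued convolutions, following the (W_r + iW_i)(x_r + ix_i) expansion used in deep complex network formulations; it is an illustrative construction, not the architecture of (Nguyen et al., 2022).

```python
import torch
import torch.nn as nn

class ComplexConv2d(nn.Module):
    """Complex convolution implemented with two real-valued convolutions,
    following the standard (W_r + iW_i)(x_r + ix_i) expansion."""
    def __init__(self, in_ch, out_ch, kernel_size, **kwargs):
        super().__init__()
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size, **kwargs)
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size, **kwargs)

    def forward(self, x_r, x_i):
        real = self.conv_r(x_r) - self.conv_i(x_i)
        imag = self.conv_r(x_i) + self.conv_i(x_r)
        return real, imag

# Toy usage: treat a Gabor-like complex response of a normalized iris image as input.
layer = ComplexConv2d(1, 8, kernel_size=3, padding=1)
x_real, x_imag = torch.randn(1, 1, 64, 256), torch.randn(1, 1, 64, 256)
out_real, out_imag = layer(x_real, x_imag)
print(out_real.shape, out_imag.shape)                   # both torch.Size([1, 8, 64, 256])
magnitude = torch.sqrt(out_real ** 2 + out_imag ** 2)   # phase-aware features can follow
```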
Category | Method | Year | NIR | VW | Datasets | Features |
Conventional classifiers | Menotti et al. (Menotti et al., 2015) | 2015 | ✓ | ✓ | Biosec, LivDet-2013-Warsaw, MobBIOfake | Shallow CNNs + SVM for Spoofing Detection |
Minaee et al. (Minaee et al., 2016) | 2016 | ✓ | ✗ | CASIA-Iris-Thousand, IITD | VGG + SVM | |
Nguyen et al. (Nguyen et al., 2017b) | 2017 | ✓ | ✗ | ND-CrossSensor-2013, CASIA-Iris-Thousand | AlexNet, VGG, Google Inception, ResNet, DenseNet + SVM | |
Boyd et al. (Boyd et al., 2019) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | ResNet50 + SVM | |
Boyd et al. (Boyd et al., 2020b) | 2020 | ✓ | ✓ | DCMEO1, Warsaw | AlexNet, ResNet, VGG, DenseNet + Cosine, Euclidean, MSE | |
Hafner et al. (Hafner et al., 2021) | 2021 | ✓ | ✗ | CASIA-Iris-Thousand | ResNet101 + DenseNet-201 + Cosine Similarity | |
Classification Networks | Gangwar et al. (Gangwar and Joshi, 2016) | 2016 | ✓ | ✗ | ND-IRIS-0405, ND-CrossSensor-2013 | DeepIrisNet |
Gangwar et al. (Gangwar et al., 2019) | 2019 | ✓ | ✓ | ND-IRIS-0405, UBIRIS.v2, MICHE-I, CASIA-Irisv4-Interval | DeepIrisNetV2 | |
Odinokikh et al. (Odinokikh et al., 2019) | 2019 | ✓ | ✗ | CASIA-Iris-M1-S2, CASIA-Iris-M1-S3, Iris-Mobile | Feature Fusion + Softmax | |
Zhao et al. (Zhao et al., 2019) | 2019 | ✓ | ✗ | JluIrisV3.1, JluIrisV4, CASIA-Irisv4-Lamp | Capsule network + Softmax |
Chen et al. (Chen et al., 2020) | 2020 | ✓ | ✗ | ND-IRIS-0405, CASIA-Iris-Thousand, IITD cross sensor | T-Center loss | |
Luo et al. (Luo et al., 2021) | 2021 | ✓ | ✗ | ND-IRIS-0405, CASIA-Iris-Thousand | Attention + Softmax Loss + Center Loss |
Similarity Networks | Nianfeng et al. (Liu et al., 2016b) | 2016 | ✓ | ✗ | Q-FIRE, CASIA-Cross-Sensor | DeepIris |
Zhao et al. (Zhao and Kumar, 2017b) | 2017 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | UniNet (FeatNet+MaskNet) + Extended Triplet Loss | |
Damer et al. (Damer et al., 2019) | 2019 | ✓ | ✗ | Biosecure, CASIA-Iris-Thousand/Lamp/Interval | Inception + Triplet Loss | |
Wang et al. (Wang and Kumar, 2019) | 2019 | ✓ | ✓ | CASIA-Irisv4-Interval, IITD, UBIRIS.v2 | FeatNet + Dilated Convolution + Extended Triplet Loss | |
Zhao et al. (Zhao and Kumar, 2019) | 2019 | ✓ | ✗ | ND-Iris-0405, Casia-Irisv4-Distance, IITD | FeatNet + Mask RCNN + Extended Triplet Loss | |
Nguyen et al. (Nguyen et al., 2020) | 2020 | ✓ | ✓ | CASIA-v4-Distance, UBIRIS.v2, ND-CrossSensor-2013 | Constrained Design Backbone + Extended Triplet Loss | |
Yan et al. (Yan et al., 2021) | 2021 | ✓ | ✗ | CASIA-Iris-Thousand | Spatial Feature Reconstruction + Triplet Loss | |
Yang et al. (Yang et al., 2021) | 2021 | ✓ | ✗ | CASIA-Irisv4-Thousand, CASIA-Irisv4-Distance, IITD | Dual Spatial Attention Network + Batch Hard | |
Nguyen et al. (Nguyen et al., 2022) | 2022 | ✓ | ✓ | ND-CrossSensor-2013, CASIA-Iris-Thousand, UBIRIS.v2 | Complex-valued Backbone + Extended Triplet Loss | |
Kuehlkamp et al. (Kuehlkamp et al., 2022) | 2022 | ✓ | ✓ | DCMEO1, DCMEO2, Warsaw-Post-Mortem v2.0 | ResNet + Triplet Loss |
3.3. End-to-end Joint Iris Segmentation+Recognition Networks
Almost all existing approaches perform segmentation and normalization to transform an input image into a normalized rectangular 2D representation before recognition, as this simplifies the representation learning. Since segmentation and recognition may each require a separate network, this causes redundancy in both computation and training, further slowing down a DL-based iris recognition approach. Several researchers have looked at building end-to-end networks. One category is segmentation-less recognition. Another category is to jointly learn segmentation and recognition using a unified network via multi-task learning.
Segmentation-less: These approaches feed cropped iris images directly into a deep learning network to extract features. For example, Kuehlkamp et al. (Kuehlkamp et al., 2022) used Mask R-CNN for semantic segmentation and fed the cropped iris region directly into a ResNet50 to extract features. Similarly, Chen et al. (Chen et al., 2019b) also fed cropped iris images directly into a DenseNet. Rather than feeding the cropped iris images directly, Proença et al. first transformed the cropped region (detected by an SSD detector) into a polar representation, and then fed the polar representation into a VGG19 to extract features (Proença and Neves, 2019).
Multi-task: Segmentation and recognition can be jointly learned with one unified network, which paves the way for multi-task learning. However, segmentation and recognition may require different numbers of layers; hence, research is required on how to use different intermediate layers for each task. To the best of our knowledge, no existing approach has explored this direction.
4. Deep Learning-Based Iris Presentation Attack Detection
In parallel to the popularity of biometrics, the security of these systems against attacks has become of paramount importance. The most common attack is a Presentation Attack (PA), which refers to presenting a fake sample to the sensor. The goal can be either to impersonate somebody else's identity (also known as an Impostor Attack Presentation), or to conceal one's own identity (also known as a Concealer Attack Presentation). Via impostor attacks, a person could also enroll fraudulently, allowing a continuous manipulation of the system. The previous acronyms and terms in italics correspond to the vocabulary recommended in the series of ISO/IEC 30107 standards of the ISO/IEC Subcommittee 37 (SC37) on Biometrics (Information technology — Biometric presentation attack detection — Part 1: Framework, 2016), which we will follow in the rest of this section. Presentation Attack Instruments (PAIs) used to carry out impostor attacks are typically generated from bona fide images of an iris of an individual who has legitimate access to the system. The iris is printed on a piece of paper (printout attack) or displayed on a screen (replay attack) and then presented to the sensor. The irises of deceased individuals can also be used as PAIs, since the texture remains intact for some hours (Trokielewicz et al., 2018). Theoretically, it would be possible to print a genuine iris texture onto a contact lens as well, although this has not been successfully demonstrated yet (Boyd et al., 2020a). Concealer attacks, on the other hand, are commonly carried out via textured contact lenses that obscure or alter properties of the eye (such as color) to prevent the system from identifying the user. Synthetic iris images (Yadav et al., 2019a) not belonging to any specific identity could be used for similar purposes. Concealers can also present their legitimate iris, but in a way not expected by the system, e.g., closing the eyelids as much as possible, looking to the sides (off-axis gaze), rotating the head, etc.
Two challenges of PAs are that they happen outside the physical limits of the system and that they do not require specific knowledge of its inner workings, or any technical knowledge at all. Thus, if not properly tackled, they can derail public perception of even the most reliable biometric modality. This is even more critical if authentication is done without any supervision. Presentation Attack Detection (PAD) methods to counteract such attacks can operate (Galbally and Gomez-Barrero, 2016): (i) at the hardware (or sensor) level, using additional illuminators or sensors that detect intrinsic properties of a living eye or responses to external stimuli (like pupil contraction or reflection), or (ii) at the software level, using only the footprint of the PA (if any) left in the same images captured with the standard sensor that will be employed for authentication. Software-based techniques are in principle less expensive and intrusive, since they do not demand extra hardware, and they will be the focus of this section.
Two comprehensive surveys on PAD are (Czajka and Bowyer, 2018), from 2018, and (Boyd et al., 2020a), from 2020. While DL techniques played only a marginal role in the 2018 survey, they rose in popularity thereafter. We build this section upon the latest survey and summarize the most important developments in DL-based PAD since it was published (Table 3). A descriptive summary of the datasets employed is given later in Section 8. The aim of PAD is to classify an image either as a bona fide or an attack presentation, so it is usually modeled as a two-class classification task. Typical strategies mimic the trend of the previous section when applying DL to iris recognition: either a CNN backbone is used to extract features that feed a conventional classifier, or the network is trained end-to-end to do the classification itself. Some hybrid methods also combine traditional hand-crafted features with deep-learned ones. In the same manner, the network may be initialized, e.g., on the ImageNet dataset to take advantage of such a large generic corpus, since available iris PAD data is scarcer. Another strategy also widely employed in the PAD literature is to use adversarial networks, where a GAN (Goodfellow et al., 2014) is trained to generate synthetic iris images that the discriminator must use to detect attack samples.
4.1. CNNs for Feature Extraction
Since each layer of a CNN represents a different level of abstraction, Fang et al. (Fang et al., 2020a) fused the features from the last four convolutional layers of two models (VGG16, MobileNetv3-small). The features are projected to a lower dimensional space by PCA and either concatenated for classification with SVM (feature fusion) or the classification scores of each level combined (score fusion). Using two databases of printouts and textured contact lenses, the method showed superiority over the use of the different layers individually, or the feature vector from the next-to-last layer of the networks.
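A minimal sketch of this multi-layer feature-fusion idea is shown below, with random arrays standing in for the flattened activations of several convolutional layers (in practice these would be collected from the VGG16 or MobileNetv3 backbone, e.g., via forward hooks); the layer dimensionalities, PCA size, and SVM settings are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Stand-ins for flattened activations of four convolutional layers.
rng = np.random.default_rng(0)
n_samples = 400
layer_feats = [rng.normal(size=(n_samples, d)) for d in (4096, 2048, 1024, 512)]
labels = rng.integers(0, 2, size=n_samples)        # 0 = bona fide, 1 = attack

# Project each layer's features to a common low-dimensional space, then concatenate.
reduced = [PCA(n_components=64).fit_transform(f) for f in layer_feats]
fused = np.concatenate(reduced, axis=1)

clf = SVC(kernel="rbf", probability=True).fit(fused, labels)
print("PAD training accuracy:", clf.score(fused, labels))
```

In the actual work, separate training and test splits are used, and score fusion of per-layer SVMs is evaluated as an alternative to this feature-level fusion.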
4.2. End-to-end Classification Networks
Arora and Bhatia (Arora and Bhatia, 2020) trained a CNN with 10 convolutional layers to detect contact lenses and printouts. Rather than using the entire image, the network is trained on patches from all parts of the iris image. The system showed superior performance compared to state-of-the-art methods, which at that time, according to the paper, were mostly based on hand-crafted features.
Focusing on embedded low-power devices, Peng et al. (Peng et al., 2020) adopted a Lite Anti-attack Iris Location Network (LAILNet) based on three dense blocks featuring depthwise separable convolutions to reduce the number of parameters. The algorithm demonstrated very good performance on three databases with printouts, synthetic irises, contact lenses and artificial plastic eyes.
Also focusing on mobile devices, Fang et al. (Fang et al., 2021b; Fang et al., 2020b) used MobileNetv3-small. The contribution lies in the division of the normalized iris image into overlapping micro-stripes, which are fed individually, with the decision reached by majority voting. The claimed advantages are that the classifier is forced to focus on the iris/sclera boundaries (given by their exact micro-stripes), the input dimensionality is lower while the number of samples is higher (reducing overfitting), and the impact of imprecise segmentation is alleviated. Using three databases with contact lenses and printouts, the paper featured extensive experimentation with cross-database, cross-sensor, and cross-attack settings.
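The following sketch illustrates the micro-stripe idea in PyTorch: overlapping horizontal stripes of a normalized iris image are classified independently and the final decision is reached by majority voting; the tiny stand-in classifier, stripe height, and stride are illustrative assumptions (the actual work uses MobileNetv3-small and its own stripe geometry).

```python
import torch
import torch.nn as nn

# Tiny stand-in per-stripe classifier (the actual work uses MobileNetv3-small).
stripe_classifier = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(8, 2))

def pad_decision_by_stripes(norm_iris, stripe_h=16, stride=8):
    """Classify overlapping horizontal micro-stripes of a normalized iris image
    (1, 1, H, W) and reach a bona-fide/attack decision by majority voting."""
    _, _, height, _ = norm_iris.shape
    votes = []
    for top in range(0, height - stripe_h + 1, stride):
        stripe = norm_iris[:, :, top:top + stripe_h, :]
        votes.append(stripe_classifier(stripe).argmax(dim=1))  # 0 = bona fide, 1 = attack
    votes = torch.cat(votes)
    return "attack" if votes.float().mean() > 0.5 else "bona fide"

print(pad_decision_by_stripes(torch.randn(1, 1, 64, 512)))
```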
Sharma and Ross (Sharma and Ross, 2020) proposed D-NetPAD, based on DenseNet121, chosen due to benefits such as the maximum flow of information given by dense connections to all subsequent layers, or fewer parameters compared to counterparts like ResNet or VGG. The PAIs included printouts, artificial eyes, cosmetic contact lenses, Kindle replay, and transparent dome on print, with experiments substantiating the effectiveness of the method in cross-PAI, cross-sensor and cross-database scenarios.
Chen and Ross (Chen and Ross, 2021) proposed an explainable attention-guided detector (AG-PAD). To do so, the feature maps of a DenseNet121 were fed into two modules that independently capture inter-channel and inter-spatial feature dependencies. The outputs were then fused via element-wise sum to capture complementary attention features from both channel and spatial dimensions. With three datasets containing colored contact lenses, artificial eyes (Van Dyke/Doll fake eyes), printouts, and textured contact lenses, the attention modules are shown to improve accuracy over the baseline network. Using heatmap visualization, it is also shown that the attention modules force the network to attend to the annular iris textural region which, intuitively, plays a vital role for PAD.
Spatial attention was also explored by Fang et al. (Fang et al., 2021c). To find local regions that contribute the most to make accurate decisions and capture pixel/patch-level cues, they proposed an attention-based pixel-wise binary supervision (A-PBS) method. To capture different levels of abstraction, they perform multi-scale fusion by adding spatial attention modules to feature maps from three levels of a DenseNet backbone. Using six datasets with textured lenses and printouts, they outperformed previous state-of-the-art including scenarios with unknown attacks, sensors, and databases.
Given the difficulty of collecting iris PAD data, most databases contain, at most, a few hundred subjects. To address this, Fang et al. (Fang et al., 2021d) studied data augmentation techniques that modify position, scale or illumination. Using three architectures (ResNet50, VGG16, MobileNetv3-small) and three databases with printouts and textured contact lenses, they found that data augmentation improves PAD performance significantly, but each technique has a positive role on a particular dataset or CNN. They also explored the selection of augmentation techniques, finding, again, no consensus regarding the best combination, which was attributed to differences in capture environment, subject population, scale of the different datasets or imbalance between bona fide and attack samples.
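As an illustration, a typical torchvision augmentation policy covering position, scale and illumination changes could look like the sketch below; the specific transforms and parameter values are assumptions, not the policies evaluated in the paper.

```python
from torchvision import transforms

# Example augmentation policy: small geometric perturbations plus illumination jitter.
train_augmentations = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1)),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.ToTensor(),
])
```

Such a pipeline would typically be passed as the transform of the training Dataset, so that every epoch sees slightly different versions of the same iris images.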
Gupta et al. (Gupta et al., 2021) proposed MVANet, with 5 convolutional layers and 3 branches of fully connected layers. They addressed the challenge of unseen databases, sensors, and imaging environments in textured contact lens detection. The size of each layer of MVANet is different, thus capturing different features. They used three databases, each one captured in different settings (indoor/outdoor, different times of the day, varying weather, fixed/mobile sensors, etc.), with MVANet trained on one database at a time and tested on the other two. As baselines, they fine-tuned three popular CNNs (VGG16, ResNet18, DenseNet) initialized on ImageNet. The proposed network is shown to perform consistently better and more uniformly on the test databases than the baseline approaches.
Sharma and Ross (Sharma and Ross, 2021) studied the viability of Optical Coherence Tomography (OCT). OCT provides a cross-sectional view of the eye, whereas traditional NIR or VW imaging provides 2D textural data. The PAIs considered are artificial eyes (Van Dyke eyes) and cosmetic lenses, evaluated with three different CNNs (VGG19, ResNet50, DenseNet121). In both intra-attack (known PAs) and cross-attack (unknown PAs) scenarios, OCT is determined to be a viable solution, although hardware cost is still a limiting factor. Indeed, OCT outperforms NIR and VW in the intra-attack scenario, while NIR generalizes better to unseen PAs. Cosmetic lenses also appear to be more difficult to detect than artificial eyes with any modality. Via heatmaps, it is also seen that the fixation regions are different for each imaging modality and for each PAI, which could be a source of complementarity.
Zhang et al. (Zhang et al., 2021) proposed a Weighted Region Network (WRN) to detect cosmetic lenses, which includes a local attention Weight Network (for evaluating the discriminating information of different regions) and a global classification Region Network (for characterizing global features). Such a strategy considers both the entire image and the attention effect by assigning different weights to regions. The mentioned networks are applied to a VGG16 backbone. The reported results showed improved performance compared to the state of the art on three different databases.
The works by Agarwal et al. (Agarwal et al., 2022b, a) evaluated the detection of contact lenses. In (Agarwal et al., 2022b), they trained a siamese CNN with 5 convolutional layers on two different inputs (the original image and its CLAHE version), which are then combined by weighted score fusion of the softmax layer. Adding a processed version of the raw image attempts to enhance the feature extraction capabilities of the CNN. A similar strategy is followed in (Agarwal et al., 2022a), but here they used a siamese contraction-expansion CNN, and the processed image is an edge-enhanced image obtained via the Histogram of Oriented Gradients (HOG). Another difference was the use of feature-level fusion of the next-to-last CNN feature vectors, testing different strategies (vector addition, multiplication, concatenation and distance). The papers employed several databases, with an extensive protocol including unseen subjects, environments (indoor vs. outdoor) and databases (sensors) that showcases the strength of the solutions against cross-domain changes. The methods also showed superiority over popular CNN models (VGG16, ResNet18, DenseNet) and the popular LBP and HOG hand-crafted features.
Gautam et al. (Gautam et al., 2022) proposed a Deep Supervised Class Encoding (DSCE) approach consisting of an autoencoder that exploits class information and simultaneously minimizes the reconstruction and classification errors during training. Three datasets were used, containing textured lenses, printouts and synthetic images, showing superiority over a variety of hand-crafted and deep-learned features.
Tapia et al. (Tapia et al., 2022) used a two-stage serial architecture based on a modified MobileNetV2. A first network was trained to distinguish only two classes (bona fide vs. attack). If it votes bona fide, the image is sent to a second network trained to classify it among three or four classes (bona fide or a different type of PAI: contact lenses, printout, or cadaver). Four databases were combined to obtain a super-set with the different PAIs, and class weights were also incorporated into the loss to compensate for class imbalance. The paper applied contrast enhancement (CLAHE) and aggressive data augmentation (rotation, blurring, contrast change, Gaussian noise, edge enhancement, image region dropout, etc.). They tested two image sizes, 224×224 and 448×448, observing that the extra detail of a higher-resolution image results in more effective features. The paper also carried out leave-one-out PAI tests for open-set evaluation, showing robustness in detecting unknown attacks.
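The class-weighting idea can be illustrated with the short sketch below, where inverse-frequency weights computed from made-up per-class counts are passed to a standard cross-entropy loss; the counts and the four-class layout are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# Illustrative per-class counts in a combined training super-set:
# bona fide, contact lens, printout, cadaver.
class_counts = torch.tensor([9000., 2500., 1800., 700.])

# Inverse-frequency weights so that under-represented PAIs contribute more to the loss.
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 4)            # outputs of the second-stage classifier
targets = torch.randint(0, 4, (16,))
print(criterion(logits, targets))
```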
4.3. Hybrid Methods
Choudhary et al. (Choudhary et al., 2022b, a) applied a Friedman test-based selection method to identify the best features out of a set of hand-crafted and deep-learned ones. Each feature method feeds an SVM classifier, and the scores of the individual SVMs are fused via a weighted sum. A preliminary version of (Choudhary et al., 2022a) without feature selection appeared in (Choudhary et al., 2021). The databases of (Choudhary et al., 2022b) include a medley of different PAs (printouts, synthetic irises, artificial eyeballs, etc.), although the feature selection and classification methods are trained and evaluated separately on each database. The authors observed a saturation after a certain number of features are combined, and a superiority of the score-level fusion over other methods such as majority voting, feature-level fusion, and rank-level fusion. The work (Choudhary et al., 2022a), on the other hand, concentrated on the textured contact lens attack, with an extensive set of evaluations including single-sensor, cross-sensor and combined-sensor experiments. Apart from the generic live vs. attack scenario, it also reports binary and ternary classification across the different types of real (normal iris, soft lens) and fake (textured lens) classes. Naturally, the cross-sensor error is larger compared to single-sensor, and the combined-sensor error is also observed to be slightly larger. The latter is attributed to the larger intra-class variation created when images from different sensors are combined. In any case, an improvement in performance over previous works on the three datasets employed is observed after the proposed feature selection and score-level fusion method.
4.4. Adversarial Networks
Generative methods have been used by some approaches, either to use the trained discriminator for iris PAD, or to generate synthetic samples and augment under-represented classes. In this direction, Yadav and Ross (Yadav and Ross, 2021) proposed CIT-GAN (Cyclic Image Translation Generative Adversarial Network) for multi-domain style transfer to generate synthetic samples of several PAIs (cosmetic contact lenses, printed eyes, artificial eyes and Kindle-display attacks). To do so, image translation is driven by a Styling Network that learns the style characteristics of each given domain. It also employs a Convolutional Autoencoder in the generator for image-to-image style translation, which takes a domain label as input along with an image. This is different from previous works of the same authors (Yadav et al., 2020, 2019a), which employed the traditional generator/discriminator approach driven by a noise vector. Different PAD methods using hand-crafted (BSIF, DESIST) and deep features (VGG16, D-NetPAD, AlexNet) were evaluated, demonstrating that they can be improved by adding synthetically generated data. The quality of the synthetic images is also superior to that of a competing generative method (StarGAN v2), measured via FID score distributions.
4.5. Open Research Questions in Iris PAD
One of the open research issues is to design robust iris PAD methods with cross-sensor and cross-database capabilities, so they generalize to unseen imaging conditions. Attackers are constantly developing new attack methodologies to circumvent PAD systems, so an even more important issue is unseen PAIs (i.e., cross-PAI capabilities) (Sharma and Selwal, 2021). Great results have been achieved in detecting known attack types (known as closed-set recognition), although cross-database evaluation (training on one database and testing on others) still appears to be a difficult challenge due to changes in sensors, acquisition environments, or subjects. Moreover, generalizing to attacks that are unknown at the time of training (open-set recognition) is an even greater challenge for state-of-the-art methods (Fang et al., 2021b). Part of the problem lies in the limited size of existing databases, which is an issue for data-hungry DL approaches. Some solutions, as studied by some of the methods above, are data augmentation by geometric or illumination modifications (Fang et al., 2021d), or creating additional synthetic data via generative methods (Yadav and Ross, 2021). Human-aided DL training is another promising avenue. Indeed, humans and machines cooperating in vision tasks is not new, and this strategy is finding its way into DL as well (Boyd et al., 2021, 2022). For example, Boyd et al. (Boyd et al., 2022) analyzed the utility of human judgement about salient regions of images to improve the generalization of DL models. Asking humans about the regions they deem important for their decision about an image, the work proposed to transform the training data to incorporate such opinions, demonstrating an improvement in accuracy and generalization in leave-one-attack-type-out scenarios. In a similar work, Boyd et al. (Boyd et al., 2021) incorporated annotated saliency maps into the loss function to penalize large differences with human judgement.
Recently, concerns have emerged about the observed bias of DL methods that leads to discriminatory performance differences based on the user's demographics, with face biometrics being the most talked-about case and many companies and authorities banning its use (Jain et al., 2021). Obviously, this issue appears in iris PAD as well, as addressed by Fang et al. (Fang et al., 2021e). Using three baselines based on hand-crafted and DL approaches and a database of contact lenses, the authors showed a significant difference in the performance between male and female samples. In dealing with this phenomenon, the examination of biases towards eye color or race is another direction worth considering.
Some elements considered as PAIs in this section, such as cosmetic lenses, may be worn normally by users without the purpose of fooling the biometric system, as is the case with facial retouching via make-up, digital beautification or augmented reality (Hedman et al., 2021). This poses the question of whether it is possible to use such images for authentication while diminishing the effect on recognition performance. Suggested alternatives have been to detect and match portions of live iris tissue still visible (Parzianello and Czajka, 2022) or to incorporate ocular information from the surrounding area (Alonso-Fernandez and Bigun, 2016). Unfortunately, in iris biometrics, recognition with textured contact lenses remains a hard problem to solve.
Another under-researched task is iris PAD in the visible spectrum. The majority of studies and datasets (Section 8) employ near-infrared illumination and specific iris close-up sensors. However, in some environments such as mobile or distant capture, such sensing is not guaranteed (Nigam et al., 2015).
Category | Method | Year | NIR | VW | Datasets | Features |
Feature Extraction | Fang et al. (Fang et al., 2020a) | 2020 | ✓ | ✗ | LivDet-2017 (IIITD-WVU, ND- CLD) | VGG16, MobileNetv3-small (multi-layer features) + PCA + SVM |
End-to-end Training | Arora and Bhatia (Arora and Bhatia, 2020) | 2020 | ✓ | ✗ | LivDet-2017 (IIITD-WVU) | CNN with patch input |
Peng et al. (Peng et al., 2020) | 2020 | ✓ | ✗ | IPITRT, CASIA-Iris-v4, CASIA-Iris-Fake | LAILNet lightweight CNN | |
Sharma and Ross (Sharma and Ross, 2020) | 2020 | ✓ | ✗ | Proprietary, LivDet-2017 (IIITD-WVU, ND-CLD, Warsaw, Clarkson) | DenseNet121 pre-trained on ImageNet | |
Chen and Ross (Chen and Ross, 2021) | 2021 | ✓ | ✗ | JHU-APL, LivDet-2017 (Warsaw, ND-CLD) | DenseNet121 pre-trained on ImageNet + AG-PAD channel and spatial attention | |
Fang et al. (Fang et al., 2021b) | 2021 | ✓ | ✗ | LivDet-2017 (IIITD-WVU, ND-CLD), ND-CLD-15, | MobileNetv3-small with micro-stripes | |
Fang et al. (Fang et al., 2021c) | 2021 | ✓ | ✗ | LivDet-2017 (IIITD-WVU, ND-CLD, Clarkson), ND-CLD-13, ND-CLD-15, IIITD-CLI | DenseNet + A-PBS spatial attention | |
Fang et al. (Fang et al., 2021d) | 2021 | ✓ | ✗ | LivDet-2017 (IIITD-WVU, ND-CLD, Clarkson) | ResNet50, VGG16, MobileNetv3-small | |
Gupta et al. (Gupta et al., 2021) | 2021 | ✓ | ✗ | MUIPA, UnMIPA, IIITD-CLI | CNN with multi-branch classification | |
Sharma and Ross (Sharma and Ross, 2021) | 2021 | ✓ | ✓ | OCT, NIR and VW images | VGG19, ResNet50, DenseNet121 | |
Zhang et al. (Zhang et al., 2021) | 2021 | ✓ | ✗ | ND-CLD-13, CASIA-Iris-Fake, IF-VE | VGG16 + WRN local attention and global classification | |
Agarwal et al. (Agarwal et al., 2022a) | 2022 | ✓ | ✗ | MUIPA, UnMIPA, IIITD-CLI, LivDet-2017 (IIITD-WVU), ND-PSID | Siamese contraction-expansion CNN, feature fusion | |
Agarwal et al. (Agarwal et al., 2022b) | 2022 | ✓ | ✗ | MUIPA, UnMIPA, IIITD-CLI, LivDet-2017 (IIITD-WVU), ND-PSID, NDIris3D | Siamese CNN, score fusion | |
Gautam et al. (Gautam et al., 2022) | 2022 | ✓ | ✗ | SYN, IIITD-CLI, IIITD-IS | Autoencoder with reconstruction and classification loss | |
Tapia et al. (Tapia et al., 2022) | 2022 | ✓ | ✓ | LivDet-2020, Iris-CL1, Warsaw-Post-Mortem v3.0 | MobileNetv2, data augmentation, class-weights | |
Hybrid Methods | Choudhary et al. (Choudhary et al., 2022b) | 2022 | ✓ | ✗ | IIITD-CLI, ND-CLD-13, CASIA, LivDet-2017 (IIITD-WVU, ND-CLD, Clarkson) | MBISF (domain-specific filters), SIFT, Haralick, DenseNet, VGG8 + SVM classification |
Choudhary et al. (Choudhary et al., 2022a) | 2022 | ✓ | ✗ | IIITD-CLI, ND-CLD-13, LivDet-2017 (Clarkson) | MBSIF (generic filters), MBSIF (domain-specific filters), SIFT, LBPV, DAISY, DenseNet121 + SVM classification | |
Adversarial Networks | Yadav and Ross (Yadav and Ross, 2021) | 2021 | ✓ | ✗ | Casia-Iris-Fake, Berc-iris-fake, ND-CLD-15, LivDet-2017, MSU-IrisPA-01 | BSIF, DESIST, VGG16, D-NetPAD, AlexNet |
5. Deep Learning-Based Forensic Iris Recognition
Iris recognition has become the next biometric mode (in addition to face, fingerprints and palmprints) considered for large-scale forensic applications (FBI Criminal Justice Information Services (CJIS) Division, 2021), coinciding in time with discoveries made in recent years about the possibility of employing the iris to recognize deceased subjects. This includes both matching of iris patterns acquired a few hours after death with those with longer PMIs (Post-Mortem Intervals), ranging from days (Sauerwein et al., 2017; Bolme et al., 2016; Trokielewicz et al., 2016b, a) to several weeks after demise (Trokielewicz et al., 2019; Boyd et al., 2020b), as well as matching patterns acquired before death with those collected post-mortem (Sansola, 2015).
Due to decomposition changes in the eye tissues, post-mortem iris images differ significantly from live iris images and rarely meet ISO/IEC 29794-6 quality requirements, as shown in Fig. 3(a). The challenges are related to the appropriate detection of regions where the cornea dries and generates irregular and large specular highlights, as well as regions where iris muscle furrows show up as the eyeball dehydrates. This is where DL-based methods may win over hand-crafted approaches, as the latter usually make strong assumptions about the anatomy of the iris appearance, which cannot be predicted for eyes undergoing random decomposition processes. Trokielewicz et al. proposed the first iris recognition method known to us designed specifically for cadaver irises (Trokielewicz et al., 2020, 2020). It incorporates a SegNet-based segmenter and a Siamese network-based feature extractor, both trained in a domain-specific way solely on post-mortem iris samples. An interesting element of this approach is that segmentation incorporates two models: one trained with “fine” ground truth masks, marking all details associated with eye decomposition, and a “coarse” model, aiming at detecting the iris annulus and eyelids, as in classical iris recognition approaches. This made it possible to apply the standard “rubber sheet” iris image normalization based on the “coarse” masks and, at the same time, exclude decomposition-driven artifacts, marked by the “fine” mask, from encoding. Kuehlkamp et al. (Kuehlkamp et al., 2022), in addition to detecting post-mortem deformations, as shown in Fig. 3(c), also proposed a human-interpretable visualization of the classification process. The visualization is based on the Class Activation Mapping mechanism (Zhou et al., 2016) and highlights salient features used by the classifier in its judgment. This novelty in iris recognition algorithms may help human examiners locate iris regions that should be carefully inspected, or verify the algorithm's decision.
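The interplay of the two masks can be summarized as in the schematic sketch below: the coarse mask yields circular boundaries for standard rubber-sheet normalization, while the fine mask is unwrapped identically and used to exclude decomposition artifacts from encoding. This is an illustration of the idea rather than the authors' implementation; the concentric-circle model, resolutions and array shapes are assumptions.

```python
import numpy as np

def rubber_sheet(img, center, r_pupil, r_iris, radial_res=64, angular_res=512):
    """Unwrap the annulus between r_pupil and r_iris into a rectangular strip."""
    cx, cy = center
    thetas = np.linspace(0, 2 * np.pi, angular_res, endpoint=False)
    radii = np.linspace(0, 1, radial_res)
    out = np.zeros((radial_res, angular_res), dtype=img.dtype)
    for i, r in enumerate(radii):
        rad = r_pupil + r * (r_iris - r_pupil)
        xs = np.clip((cx + rad * np.cos(thetas)).astype(int), 0, img.shape[1] - 1)
        ys = np.clip((cy + rad * np.sin(thetas)).astype(int), 0, img.shape[0] - 1)
        out[i, :] = img[ys, xs]
    return out

# eye: grayscale image; fine_mask: 1 where tissue is usable, 0 on artifacts
# (placeholder data; in practice both come from the trained segmentation models).
eye = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
fine_mask = np.ones_like(eye)
center, r_pupil, r_iris = (320, 240), 40, 110          # from the "coarse" model

norm_iris = rubber_sheet(eye, center, r_pupil, r_iris)
norm_valid = rubber_sheet(fine_mask, center, r_pupil, r_iris) > 0
# A downstream encoder would compute codes only where norm_valid is True, e.g.
# a masked Hamming distance: np.count_nonzero((a ^ b) & valid) / np.count_nonzero(valid)
```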
6. Human-Machine Pairing to Improve Deep Learning-Based Iris Recognition
Iris recognition is usually associated with automatic, solely machine-based and rapid biometric identification. This has been changing in the recent decade due to the constantly increasing ubiquity of iris recognition, especially owing to large governmental applications such as India's Aadhaar (Unique Identification Authority of India, 2021) or the FBI's Next Generation Identification System (NGI) gradually replacing the Integrated Automated Fingerprint Identification System (IAFIS) (FBI Criminal Justice Information Services (CJIS) Division, 2021). This, combined with the unique identification power of the iris, has whetted the appetite to apply this technique to identification problems normally reserved for fingerprints and face: forensics, searches for lost subjects, or post-mortem identification. To have legal power, however, the conclusion about whether samples originate from the same eye must be confirmed by a trained human expert. This is where DL-based iris image processing may play a useful role.
Trokielewicz et al. compared post-mortem iris recognition between humans and machines. They investigated which iris image regions humans and machines mainly attend to when comparing a pair of images. The machine-based attention maps are generated by Grad-CAM to highlight the regions that contribute most to the deep learning model's prediction. The human-based attention maps are learned by tracking the gaze as the human looks around a screen displaying iris image pairs, and recording the regions where the human spends the most time. Interestingly, while both humans and machines tend to focus on a limited number of iris areas, the location, appearance and density of these areas differ between humans and machines. As the salient regions proposed by the deep learning model and those identified from human eye gaze do not overlap in general, computer-aided visual cues may constitute a valuable addition to the forensic examiner's expertise, as they can highlight important discriminatory regions that the human expert might otherwise miss. This human-machine pairing is important because human subjects can reach an incorrect decision even after spending considerable time observing many iris regions (NIST, 2021). In addition, a body of research shows that humans and machines do not perform equally well under different conditions (Stark et al., 2010; Chen et al., 2016; Moreira et al., 2019). For example, Moreira et al. showed that machines can outperform humans on easy, healthy iris image pairs, whereas humans outperform machines on disease-affected iris image pairs (Moreira et al., 2019). Human-machine pairing is thus expected to improve deep learning-based iris recognition.
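A minimal way to quantify how much the machine's attention overlaps with a human gaze map is sketched below: Grad-CAM is computed with forward/backward hooks and compared against a gaze-derived heatmap via intersection-over-union. The ResNet-18 backbone, target layer and 0.5 threshold are our own assumptions, not the setup of the cited study.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()
feats, grads = {}, {}
layer = model.layer4
layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

def grad_cam(image, target_class):
    # image: (1, 3, H, W); returns an attention map normalized to [0, 1].
    logits = model(image)
    model.zero_grad()
    logits[0, target_class].backward()
    weights = grads["a"].mean(dim=(2, 3), keepdim=True)          # channel importance
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)[0, 0]
    return cam / (cam.max() + 1e-8)

def attention_iou(machine_map, human_map, thr=0.5):
    m, h = machine_map >= thr, human_map >= thr
    union = (m | h).sum().item()
    return (m & h).sum().item() / union if union else 0.0

img = torch.rand(1, 3, 224, 224)              # placeholder iris image pair crop
human_gaze = torch.rand(224, 224)             # placeholder gaze-derived heatmap
print(attention_iou(grad_cam(img, target_class=0), human_gaze))
```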
7. Recognition in Less Controlled Environments: Iris/Periocular Analysis
Rooted in the seminal work by Park et al. (Park et al., 2011), efforts have been devoted to the development of human recognition methods that - apart from the iris - also consider information in the vicinity of the eye to infer the identity. This relatively recent topic is termed periocular recognition. The rationale is that the periocular region represents a trade-off between the face and the iris. Periocular biometrics has been claimed to be particularly useful in environments that produce poor quality data (e.g., visual surveillance). Recently, as in the case of the iris, several DL-based solutions have been proposed.
Hernandez-Diaz et al. (Hernandez-Diaz et al., 2018) tested the suitability of off-the-shelf CNN architectures for the periocular recognition task, observing that, although such networks are optimized to classify generic objects, their features can still be effectively transferred to the periocular domain.
In the visual surveillance context, Kim et al. (Kim et al., 2018) infer subject identities based either on loose or tight regions-of-interest, depending on the perceived image quality. The approach of Hwang and Lee (Hwang and Lee, 2020) prevents the loss of mid-level features and dynamically selects the most important features for classification. Luo et al. (Luo et al., 2021) incorporated channel and spatial self-attention mechanisms into the feature encoding module of a CNN, in order to obtain the most discriminative features of the iris and periocular regions.
Jung et al. (Jung et al., 2020)'s work is based on the concept of label smoothing regularization (LSR). With the main goal of reducing intra-class variability, they described a so-called Generalized LSR (GLSR) by learning a pre-task network prediction, which is claimed to improve the permanence of the obtained periocular features. With similar purposes, Zanlorensi et al. (Zanlorensi et al., 2020) described a preprocessing step based on generative networks able to compensate for the typical data variations in visual surveillance environments. Nie et al. (Nie et al., 2014) applied convolutional restricted Boltzmann machines to the periocular recognition problem. Starting from a set of genuine pairs used as a constraint, a Mahalanobis distance metric is learned.
Obtaining auxiliary information (e.g., soft biometrics) has been seen as an interesting direction for compensating for the lack of image quality. Zhao and Kumar (Zhao and Kumar, 2018) incorporate an attention model into a DL architecture to emphasize the most important regions in the periocular data. The same authors (Zhao and Kumar, 2017a) described a semantics-assisted CNN framework to infer comprehensive periocular features. The whole model is composed of different networks, trained on ID and semantic (e.g., gender, ethnicity) data, that are fused at the score and prediction levels. Similarly, Talreja et al. (Talreja et al., 2022) described a multi-branch CNN framework that simultaneously predicts soft biometrics and ID labels, which are finally fused into the final response.
With regard to cross-spectral settings, Hernandez-Diaz et al. (Hernandez-Diaz et al., 2020) used conditional GANs (CGANs) to convert periocular images between domains, which are further fed to intra-domain off-the-shelf frameworks. Sharma et al. (Sharma et al., 2014) described a shallow neural architecture where each model learns the data features of one spectrum. Then, at a subsequent phase, all models are jointly fine-tuned to learn the cross-spectral variability and correspondence features.
Finally, several works have attempted to faithfully fuse the scores/responses from iris and periocular data. Wang and Kumar (Wang and Kumar, 2021) used periocular features to adaptively match iris data acquired in less constrained conditions. Their framework incorporates such discriminative information using a multilayer perceptron network. Zhang et al. (Zhang et al., 2018) described a DL model that exploits complementary information from the iris and the periocular regions, applying maxout units to obtain compact representations for each modality and then fusing the discriminative features of the modalities through weighted concatenation (see the sketch below). In the opposite direction, Proença and Neves (Proença and Neves, 2018) argued that periocular recognition performance is optimized when the components inside the ocular globe (the iris and the sclera) are simply discarded.
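As a simple illustration of the weighted-concatenation idea, the following sketch fuses fixed-length iris and periocular embeddings with learnable modality weights before a classification head. The embedding sizes, the softmax weighting and the number of identities are assumptions, not the configuration of any cited method.

```python
import torch
import torch.nn as nn

class WeightedConcatFusion(nn.Module):
    """Fuse iris and periocular embeddings by weighted concatenation."""
    def __init__(self, iris_dim=256, peri_dim=256, num_ids=1000):
        super().__init__()
        self.modality_logits = nn.Parameter(torch.zeros(2))   # learnable modality weights
        self.classifier = nn.Linear(iris_dim + peri_dim, num_ids)

    def forward(self, iris_emb, peri_emb):
        w = torch.softmax(self.modality_logits, dim=0)
        fused = torch.cat([w[0] * iris_emb, w[1] * peri_emb], dim=1)
        return self.classifier(fused)

fusion = WeightedConcatFusion()
logits = fusion(torch.randn(8, 256), torch.randn(8, 256))     # batch of 8 samples
print(logits.shape)                                           # torch.Size([8, 1000])
```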
Name | Data | Size | # IDs | # Samples | # Sessions | Features |
---|---|---|---|---|---|---|
BATH (Monro et al., 2007) | NIR | 1280×960 | 1600 | 16000 | 1 | High quality images |
BioSec (Fierrez et al., 2007) | NIR | 640×480 | 400 | 3200 | 2 | Office environment |
Biosecure (Ortega et al., 2010) | NIR | 640×480 | 1334 | 2668 | 2 | Office environment |
CASIA-Cross-Sensor (Xiao et al., 2013) | NIR | n/a | 700 | 21000 | 1 | Multi-sensor, multi-distance (12-30cm, 3-5m) |
CASIA-Iris-Distance (Dong et al., 2009) | NIR | 2352×1728 | 284 | 2567 | 1 | Distant acquisition |
CASIA-Iris-Interval (Ma et al., 2003) | NIR | 320×280 | 395 | 2639 | 2 | High quality images |
CASIA-Iris-Lamp (Wei et al., 2007) | NIR | 640×480 | 819 | 16212 | 1 | Non-linear deformation |
CASIA-Iris-M1-S1 (Zhang et al., 2015) | NIR | 1920×1080 | 140 | 1400 | 1 | Mobile device |
CASIA-Iris-M1-S2 (Zhang et al., 2016b) | NIR | 1968×1024 | 400 | 6000 | 1 | Mobile device, multi-distance (20, 25, 30cm) |
CASIA-Iris-M1-S3 (Zhang et al., 2018) | NIR | 1920×1920 | 720 | 3600 | 1 | Mobile device |
CASIA-Iris-Thousand (Zhang et al., 2010) | NIR | 640×480 | 2000 | 20000 | 1 | High quality images |
DCME01 (Boyd et al., 2020b) | NIR, VW | n/a | 254 | 621 | 1-9 | - |
DCME02 (Kuehlkamp et al., 2022) | NIR | n/a | 259 | 5770 | 1-53 | - |
IITD (Kumar and Passi, 2010) | NIR | 320×240 | 224 | 1120 | 1 | Varying quality |
Iris-Mobile (Odinokikh et al., 2019) | NIR | n/a | 750 | 22966 | n/a | Mobile device, indoor & outdoor |
JluIrisV3.1 (Zhao et al., 2019) | NIR | 640×480 | 120 | 1780 | n/a | - |
JluIrisV4 (Zhao et al., 2019) | NIR | 640×480 | 172 | 114904 | n/a | - |
LivDet-2013-Warsaw (Czajka, 2013) | NIR | 640×480 | 284 | 1667 | 1 | High quality images |
MICHE-I (De Marsico et al., 2015) | VW | var. | 184 | 3732 | 2 | Three mobile devices |
MMU | NIR | 320×240 | 92 | 460 | 1 | High quality images |
MobBIOfake (Sequeira et al., 2014) | VW | 300×200 | 200 | 1600 | 1 | With a handheld device |
ND-CrossSensor-2013 (Xiao et al., 2013) | NIR | 640×480 | 1352 | 146550 | 27 | Multi-sensor |
ND-Iris-0405 (Phillips et al., 2010) | NIR | 640×480 | 712 | 64980 | 1 | Varying quality |
ND-TWINS-2009-2010 | VW | n/a | 435 | 24050 | n/a | Facial pictures: frontal, 3/4 and side views. Indoor & outdoor |
OpenEDS (Garbin et al., 2019) | NIR | 640×400 | 304 | 356649 | 1 | From head-mounted VR glasses |
Q-FIRE (Johnson et al., 2010) | NIR | var. | 390 | 586560 | 2 | Iris/face videos, various distances and quality |
UBIRIS.v1 (Proença and Alexandre, 2005) | VW | 800×600 | 241 | 1877 | 2 | Several noise factors |
UBIRIS.v2 (Proenca et al., 2010) | VW | 400×300 | 522 | 11102 | 2 | Distant acquisition, on the move |
Warsaw (Boyd et al., 2020b) | NIR, VW | n/a | 157 | 4866 | 1-13 | - |
Warsaw-Post-Mortem v1.0 (Trokielewicz et al., 2016a) | NIR, VW | var. | 34 | 1330 | 2-3 | Deceased persons, 5-7h to 17 days post-mortem |
Warsaw-Post-Mortem v2.0 (Trokielewicz et al., 2019) | NIR, VW | var. | 73 | 2987 | 1-13 | Deceased persons |
8. Open-Source Deep Learning-Based Iris Recognition Tools
Here we summarize the main properties of the datasets employed by the methods of the previous sections for DL-based iris segmentation, recognition and PAD. We also describe available open-source software code for these tasks, and other relevant tools.
8.1. Data Sources
Table 4 gives the technical details of the datasets used in the segmentation and recognition methods of Tables 1 and 2. Table 5 does the same for the iris PAD methods of Table 3. We show the main properties (spectrum, image size, identities, images, sessions) and relevant features. Only the datasets of the methods reported in the previous sections are presented. Since we focus on the most recent developments, we consider that such an approach covers the most relevant datasets for each task. Of course, the list of available datasets after decades of iris research is much longer (Omelina et al., 2021).
A first observation is the dominance of near infrared (NIR) over the visible (VW) spectrum, which should not be surprising, since NIR is regarded as the most suitable for iris analysis. Research-wise, however, many segmentation and recognition studies (Tables 1, 2) use VW images, pushed by the success of challenging databases such as MICHE and UBIRIS. On the contrary, the VW modality in iris PAD research is marginal (Table 3), a tendency also observed in pre-DL research (Czajka and Bowyer, 2018; Boyd et al., 2020a).
When it comes to the types of Presentation Attack Instruments (PAIs) employed in iris PAD databases, they can be categorized into:
- PP: paper printout of a real iris image, i.e. from a live person
- PPD: paper printout of a real iris image with a transparent 3D plastic eye dome on top
- CLL: textured contact lenses worn by a live person
- CLP: textured contact lenses on printout (either a printout of a CLL image, or a printout of a real iris image with a textured contact lens placed on top)
- RA: replay attack, i.e. a real iris image shown on a display
- AE: artificial eyeball (plastic eyes of two different types: Van Dyke Eyes, with higher iris quality details, and Scary Eyes, plastic fake eyes with a simple pattern on the iris region)
- AEC: artificial eyeball with a textured contact lens on top
- SY: synthetic iris, i.e. an image created via generative methods
- PM: post-mortem iris, i.e. an image acquired from cadaver eyes
These PAIs mostly entail presenting the mentioned instrument to the iris sensor, which then captures an image of the artifact. An exception is “SY”, which directly produces a synthetic digital image, although such an image could be used as the basis for, for example, PP, PPD, RA, or AE attacks. In Table 5, it can be seen that CLL (textured lenses live) and PP (paper printouts) largely dominate as the most popular PAIs in the existing databases and, consequently, in the related research (Table 3). CLP (textured lenses on printout) also appears in many studies, driven by the wide use of the LivDet-2017-IIITD-WVU set, which includes such a PAI. CASIA-Iris-Fake, which contains AE (artificial eyes) and SY (synthetic irises), also appears in a few studies. Other attacks that one may expect in the digital era, such as RA (replay), however, remain rare in datasets and recent studies.
Name | PAIs | Data | Size | # IDs live | # IDs fake | # Samples live | # Samples fake | # Samples total | TTP | Features |
---|---|---|---|---|---|---|---|---|---|---|
CASIA-Iris-Fake (Sun et al., 2014) | PP, CLL, AE, SY | NIR | 640×480 | 1000 | 815 | 6000 | 4120 | 10240 | | |
IF-VE (Zhang et al., 2021) | CLL | NIR | n/a | 200 | 200 | 25000 | 25000 | 50000 | ✓ | MS, ME |
IPITRT (Peng et al., 2020) | PP | NIR | var. | 58 | n/a | 1800 | 551 | 2351 | | ME |
IIITD-CLI (Kohli et al., 2013) | CLL | NIR | 640×480 | 202 | n/a | n/a | n/a | 6570 | ✓ | MS |
IIITD-IS³ (Gupta et al., 2014) | PP, CLP | NIR | 640×480 | 202 | n/a | 0 | 4848 | 4848 | | MS |
LivDet-2017 (Yambay et al., 2017) | | | | | | | | | | |
- Clarkson | PP, CLL | NIR | 640×480 | 50 | n/a | 3954 | 4141 | 8095 | ✓ | UPAI (additional patterned lenses) |
- IIITD-WVU¹ | PP, CLL, CLP | NIR | 640×480 | n/a | n/a | 2952 | 4507 | 7459 | ✓ | MS, ME, UPAI (additional patterned lenses) |
- ND-CLD² | CLL | NIR | 640×480 | n/a | n/a | 2400 | 2400 | 4800 | ✓ | UPAI (additional patterned lenses) |
- Warsaw | PP | NIR | 640×480 | 457 | 446 | 5168 | 6845 | 12013 | ✓ | MS |
LivDet-2020 (Das et al., 2020) | PP, PPD, CLL, CLP, RA, AE, AEC, PM | NIR | 640×480 | n/a | n/a | 5331 | 7101 | 12432 | | MS |
Iris-CL1 (Tapia et al., 2022) | PP | NIR | var. | n/a | n/a | n/a | 1800 | n/a | | MS |
JHU-APL (Chen and Ross, 2021) | CLL, AE | NIR | n/a | n/a | n/a | 7191 | 7214 | 14405 | | ME |
MSU-IrisPA-01 (Yadav et al., 2019a) | PP, CLL, RA, AE | NIR | 640×480 | n/a | n/a | 1343 | 2523 | | | |
MUIPA (Yadav et al., 2018) | PP, CLL | NIR | 640×480 | 70 | 70 | n/a | n/a | 10296 | | ME |
ND-CLD-13 (Doyle et al., 2013) | CLL | NIR | 640×480 | 330 | n/a | 3400 | 1700 | 5100 | ✓ | MS |
ND-CLD-15² (Doyle and Bowyer, 2015) | CLL | NIR | 640×480 | n/a | n/a | 4800 | 2500 | 7300 | ✓ | MS |
NDIris3D (Fang et al., 2021a) | CLL | NIR | 640×480 | 176 | 176 | 3458 | 3392 | 6850 | | MS |
ND-PSID⁴ (Czajka et al., 2019) | CLL | NIR | 640×480 | 238 | 238 | 3132 | 2664 | 5796 | | |
UnMIPA (Yadav et al., 2019b) | CLL | NIR | 640×480 | 162 | 162 | 9319 | 9387 | 18706 | | MS, ME |
Warsaw-Post-Mortem v3.0 (Trokielewicz et al., 2020) | PM | NIR, VW | var. | 0 | 79 | 0 | 1879 | 1879 | | MS |
¹ Contains IIITD-CLI and IIITD-IS
² Iris-LivDet-2017-ND-CLD is a subset of ND-CLD-15
³ IIITD-IS images are printouts of IIITD-CLI captured with an iris scanner and a flatbed scanner
⁴ ND-PSID is a subset of ND-CLD-15
8.2. Software Tools
The availability of DL-based tools for iris biometrics has been scarce for years, especially for PAD (Fang et al., 2021a). In the following, we provide a short description of peer-reviewed references with associated source code (link included in the paper, or easily found on the websites of the authors or dedicated sites such as www.paperswithcode.com). We describe (in this order) tools for segmentation, recognition and PAD. For each type, the references are presented in chronological order.
8.2.1. Segmentation
Lozej et al. (Lozej et al., 2018) released their end-to-end DL model based on the U-Net architecture (Ronneberger et al., 2015). The model was trained and evaluated with a small set of 200 annotated iris images from CASIA database. The authors also explored the impact of the model depth and the use of batch normalization layers.
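For reference, a compact U-Net-style segmenter producing a binary iris mask might look like the following sketch. The channel widths and depth are assumptions and much smaller than the released models; it only illustrates the encoder-decoder-with-skip-connections structure.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Two 3x3 convolutions with batch normalization, as in U-Net stages.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, 16), conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = conv_block(64, 32)
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.head = nn.Conv2d(16, 1, 1)           # per-pixel iris/background logit

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))    # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))   # skip connection
        return self.head(d1)

mask_logits = TinyUNet()(torch.rand(1, 1, 240, 320))  # single-channel NIR image
print(mask_logits.shape)                               # torch.Size([1, 1, 240, 320])
```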
Kerrigan et al. (Kerrigan et al., 2019) released the code and models of Iris-recognition-OTS-DNN, a set of four architectures based on off-the-shelf CNNs trained for iris segmentation (two VGG-16 with dilated convolutions, one ResNet with dilated kernels, and one SegNet encoder/decoder). Training databases included CASIA-Irisv4-Interval, ND-Iris-0405, Warsaw-Post-Mortem v2.0 and ND-TWINS-2009-2010, whereas testing data came from ND-Iris-0405 (disjoint subjects), BioSec and UBIRIS.v2. Results showed that the evaluated DL solutions outperform traditional segmentation techniques, e.g. Hough transform or integro-differential operators. It was also seen that each test dataset had a different method that performs best, with UBIRIS obtaining the worst performance. This should not come as a surprise, since it contains VW images with high variability taken at a distance with a digital camera, whereas the other two are from close-up NIR iris sensors in controlled environments.
Wang et al. (Wang et al., 2020a) released the code and models of their high-efficiency segmentation approach, IrisParseNet. A multi-task attention network was first applied to simultaneously predict the iris mask, pupil mask and iris outer boundary. Then, from the predicted masks and outer boundary, a parameterization of the iris boundaries was calculated. The solution is complete, in the sense that the mask (including light reflections and occlusions) and the parameterized inner and outer iris boundaries are jointly achieved.
More recently, authors from the same group presented IrisSegBenchmark (Wang and Sun, 2020), an open iris segmentation evaluation benchmark where they implemented six different CNN architectures, including Fully Convolutional Networks (FCN) (Long et al., 2015), Deeplab V1, V2, V3 (Chen et al., 2017), ParseNet (Liu et al., 2016a), PSPNet (Zhao et al., 2017), SegNet (Badrinarayanan et al., 2017), and U-Net (Ronneberger et al., 2015). The methods were evaluated on CASIA-Irisv4-Distance, MICHE-I and UBIRIS.v2. As in (Kerrigan et al., 2019), results showed that the best method depends on the database, being: ParseNet for CASIA (NIR data), DeeplabV3 for MICHE (VW images from mobile devices), and U-Net for UBIRIS (VW images from a digital camera). In this case, however, the three test databases behaved approximately equally, since they all contain difficult distant data. CASIA showed a slightly better accuracy, suggesting that NIR data may be easier to segment. Traditional, non-DL methods were also evaluated, concluding that DL-based segmentation achieves superior accuracy.
Banerjee et al. (Banerjee et al., 2022) released the code of their V-Net architecture, designed to overcome some drawbacks of U-Net, such as instability in iris segmentation or a tendency to overfit. A pre-processing stage in the YCrCb and HSV color spaces was also added to detect salient regions and aid the detection of iris boundaries. The method was evaluated on the difficult UBIRIS.v2 VW dataset.
8.2.2. Recognition
The code of the DL method ThirdEye was released by Ahmad and Fuller (Ahmad and Fuller, 2019), based on a ResNet-50 trained with triplet loss. The authors directly used segmented images without normalization to a rectangular 2D representation, arguing that such a step may be counterproductive for unconstrained images. The model was evaluated on the ND-0405, IITD and UBIRIS.v2 datasets.
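The triplet-loss setup used by such recognition models can be summarized as in the hedged sketch below; the backbone, the 128-dimensional embedding and the margin value are illustrative assumptions rather than the released configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

# Embedding network: ResNet-50 with a 128-D output head, trained so that
# images of the same iris are closer than images of different irises.
backbone = models.resnet50(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 128)

def embed(x):
    return F.normalize(backbone(x), dim=1)        # unit-length embeddings

triplet = nn.TripletMarginLoss(margin=0.2)
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)

# anchor/positive: same iris; negative: a different iris (placeholder tensors).
anchor = torch.rand(4, 3, 224, 224)
positive = torch.rand(4, 3, 224, 224)
negative = torch.rand(4, 3, 224, 224)

loss = triplet(embed(anchor), embed(positive), embed(negative))
loss.backward()
optimizer.step()
print(float(loss))
```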
The models of Boyd et al. (Boyd et al., 2019) for recognition have also been released, based on a ResNet-50 with different weight initialization techniques: from scratch (random), off-the-shelf ImageNet (general-purpose vision weights), off-the-shelf VGGFace2 (face recognition weights), fine-tuned ImageNet weights, and fine-tuned VGGFace2 weights. Both ImageNet and VGGFace2 are very large datasets with millions of images, and face images contain the iris region. Thus, using these datasets as initialization may be beneficial for iris recognition, where available training data is only in the order of hundreds of thousands of images. This strategy has been followed in ocular soft-biometrics as well (Alonso-Fernandez et al., 2021). The observed optimal strategy is indeed to fine-tune an off-the-shelf set of weights, be they general-purpose or face recognition weights, to the iris recognition domain.
8.2.3. Segmentation and Recognition Packages
A complete package comprising segmentation and feature encoding was provided by Tann et al. (Tann et al., 2019). The segmenter is based on a Fully Convolutional Network (FCN), but encoding is based on hand-crafted Gabor filters (Daugman, 2007). Evaluation was done on CASIA-Irisv4-Interval and IITD.
For forensic investigation of diseased eyes and post-mortem samples, Czajka (Czajka, 2021) also released a complete package combining segmentation and feature encoding. The models are based on previous efforts of the author and co-workers, comprising SegNet (Trokielewicz et al., 2020) and CCNet (Mishra et al., 2019) DL segmenters, while the feature encoder is based on hand-crafted BSIF filters.
Another complete segmentation and recognition package was released by Kuehlkamp et al. (Kuehlkamp et al., 2022). The segmenter is based on a fine-tuned Mask R-CNN architecture, with the cropped iris region fed directly into a ResNet-50 pre-trained for face recognition on the very large VGGFace2 dataset and fine-tuned for iris recognition using triplet loss. The paper is oriented towards post-mortem iris analysis, so the methods use a mixture of live and post-mortem images for training and evaluation.
Parzianello and Czajka (Parzianello and Czajka, 2022) also released the models and annotated data for their textured contact lens-aware iris recognition method. The rationale is that such lenses may be worn normally for cosmetic purposes, without the intention of fooling the biometric system. Therefore, they proposed to detect and match portions of live iris tissue still visible in order to enable recognition even when a person wears textured contact lenses. To do so, they applied a Mask R-CNN as a segmentation backbone, trained to detect authentic-looking parts of the iris using manually segmented samples from the NDIris3D dataset. Non-iris information is then removed from the training images by blurring it or replacing it with random noise to guide the subsequent recognition network (based on ResNet-18) towards salient, non-occluded regions that should be used for matching.
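The masking step described above can be approximated as in the sketch below, where pixels outside the predicted authentic-iris mask are replaced by blur or random noise before being fed to the recognition network. The mask source, the box-blur stand-in and the noise statistics are assumptions, not the released implementation.

```python
import numpy as np

def suppress_non_iris(image, authentic_mask, mode="noise", rng=None):
    """image: (H, W) uint8; authentic_mask: (H, W) bool, True on live tissue."""
    rng = rng or np.random.default_rng(0)
    out = image.copy()
    if mode == "noise":
        filler = rng.integers(0, 256, size=image.shape, dtype=np.uint8)
    else:  # crude 7x7 box blur as a stand-in for Gaussian blurring
        pad = np.pad(image.astype(np.float32), 3, mode="edge")
        filler = np.zeros_like(image, dtype=np.float32)
        for dy in range(-3, 4):
            for dx in range(-3, 4):
                filler += pad[3 + dy: 3 + dy + image.shape[0],
                              3 + dx: 3 + dx + image.shape[1]]
        filler = (filler / 49).astype(np.uint8)
    out[~authentic_mask] = filler[~authentic_mask]   # keep only authentic tissue
    return out

img = np.random.randint(0, 255, (480, 640), dtype=np.uint8)   # placeholder image
mask = np.zeros((480, 640), dtype=bool)
mask[180:300, 260:380] = True                                  # placeholder authentic region
masked_input = suppress_non_iris(img, mask, mode="noise")      # fed to the recognition CNN
```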
8.2.4. Iris PAD
In the iris PAD arena, Gragnaniello et al. (Gragnaniello et al., 2016) proposed a CNN that incorporates domain-specific knowledge. Based on the assumption that PAD relies on residual artifacts left mostly in the high frequencies, a regularization term was added to the loss function that forces the first layer to behave as a high-pass filter. The method, which is available on the website of the first author, could be applied to PAD in multiple modalities, including iris and face.
The code and model of the method of Sharma and Ross (Sharma and Ross, 2020) (D-NetPAD) are also available. It is based on DenseNet121 and trained for a variety of PAIs (printouts, artificial eyes, cosmetic contacts, Kindle replay, and transparent dome on print), with a script to retrain the method also available.
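A minimal DenseNet121-based PAD classifier in that spirit could be set up as follows. This is a hedged sketch, not the released D-NetPAD weights or training script; the binary head, optimizer and input size are assumptions (ImageNet initialization would be loaded in practice).

```python
import torch
import torch.nn as nn
from torchvision import models

# DenseNet121 backbone with a two-class (bona fide vs. attack) head.
# weights=None keeps the sketch self-contained; ImageNet weights would
# normally be used as initialization.
pad_net = models.densenet121(weights=None)
pad_net.classifier = nn.Linear(pad_net.classifier.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(pad_net.parameters(), lr=1e-3, momentum=0.9)

images = torch.rand(8, 3, 224, 224)          # batch of cropped iris images
labels = torch.randint(0, 2, (8,))           # 0 = bona fide, 1 = attack
loss = criterion(pad_net(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```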
8.3. Other Tools: Iris Image Quality Assessment
Several image properties considered to potentially influence the accuracy of iris biometrics have been defined in support of the standard ISO/IEC 29794-6 (Information technology — Biometric sample quality — Part 6: Iris image data, 2015). They include: grayscale spread (dynamic range), iris size (pixels across the iris radius when the boundaries are modeled by a circle), dilation (ratio of the pupil to iris radius), usable iris area (percentage of non-occluded iris, whether by eyelashes, eyelids or reflections), contrast of the pupil and sclera boundaries, shape (irregularity) of the pupil and sclera boundaries, margin (distance between the iris boundary and the closest image edge), sharpness (absence of defocus blur), motion blur, signal-to-noise ratio, gaze (deviation of the optical axis of the eye from the optical axis of the camera), and interlace of the acquisition device.
Low-quality iris images, which can potentially appear in uncontrolled or non-cooperative environments, are known to reduce the performance of iris location, segmentation and recognition. Thus, an accurate quality assessment can be a valuable tool in support of the overall pipeline, either by dropping low-quality images or by invoking specialized processing (Alonso-Fernandez et al., 2012). One possibility is to quantify the properties mentioned above and place thresholds on each of them, as illustrated in the sketch below. A more elaborate alternative is to combine them according to some rule and produce an overall quality score. However, it is difficult to provide metrics that cover all types of quality distortions (Tabassi et al., 2011), and computing some of them indeed entails segmenting the iris.
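For illustration, two of the listed properties, the pupil-to-iris dilation ratio and the usable iris area, can be computed directly from segmentation outputs as in the sketch below. The circular boundary model and the mask convention are simplifying assumptions, not the normative ISO computation.

```python
import numpy as np

def dilation_ratio(r_pupil, r_iris):
    # ISO-style dilation measure: pupil radius over iris radius.
    return r_pupil / r_iris

def usable_iris_area(mask, center, r_pupil, r_iris):
    """Fraction of the iris annulus that is visible. mask: (H, W) bool, True on visible iris."""
    h, w = mask.shape
    ys, xs = np.ogrid[:h, :w]
    dist = np.sqrt((xs - center[0]) ** 2 + (ys - center[1]) ** 2)
    annulus = (dist >= r_pupil) & (dist <= r_iris)
    return mask[annulus].mean() if annulus.any() else 0.0

mask = np.zeros((480, 640), dtype=bool)
mask[200:300, 250:420] = True                        # placeholder visible-iris mask
print(dilation_ratio(45, 110))                       # e.g. ~0.41
print(usable_iris_area(mask, (320, 240), 45, 110))   # fraction in [0, 1]
```

A simple quality gate could then reject samples whose dilation or usable area falls outside chosen thresholds, before invoking the full recognition pipeline.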
Broadly, a biometric sample is of good quality if it is suitable for recognition, so quality should correlate with recognition performance (Grother and Tabassi, 2007). As such, quality assessment can be viewed as a regression problem. Wang et al. (Wang et al., 2020c) considered that a non-ideal eye image will pivot in the feature space around the embedding of an ideal image. They defined quality as the distance to the embedding of such an “ideal” image, which is regarded as a registration sample collected under a highly controlled environment. They used a model to learn the mapping between images and the Distance in Feature Space (DFS) directly from a given dataset. Quality is computed via attention-based pooling that combines a heatmap coming from a coarse segmentation based on U-Net and the feature map of an extraction network based on MobileNetv2, pre-trained on CASIA-Iris-V4 and NDIRIS-0405.
9. Emerging Research Directions
In this section, we discuss the most relevant open challenges and hypothesize about emerging research directions that could become hot topics in the biometrics literature in the near future.
9.1. Resource-aware designs of iris recognition networks
Application-wise, iris recognition can be performed on a wide range of hardware, ranging from high-end computers to low-end embedded devices, or from large computer clusters to personal devices such as mobile phones. Performing recognition on resource-limited hardware poses new challenges for deep learning-based iris networks, which usually contain hundreds of layers and millions of parameters. Therefore, the design of these deep learning networks necessarily needs to be aware of the hardware platforms on which they will run.
Lightweight models: Lightweight CNNs employ advanced techniques to efficiently trade off resources against accuracy, minimising their model size and computations in terms of the number of floating point operations (FLOPs), while retaining high accuracy. Specialized lightweight CNN architectures include MobileNets (Howard et al., 2019) and U-Net (Ronneberger et al., 2015). There are a few lightweight deep learning-based models for both segmentation and feature extraction. Fang et al. (Fang and Czajka, 2020) adapted the lightweight CC-Net (Mishra et al., 2019) for iris segmentation. CC-Net has a U-Net structure (Ronneberger et al., 2015) and is able to retain up to 95% accuracy using only 0.1% of the trainable parameters. Boutros et al. (Boutros et al., 2020) benchmarked MobileNet-V3 against deeper networks for iris recognition and showed that the MobileNet-based model can achieve a similar EER with 85% fewer parameters and 80% less inference time.
Model compression: Studies have found that most large deep learning models tend to be overparameterized, leading to many redundant parameters and operations in the network. This becomes more severe considering that iris texture images are different from generic object-based images. This has motivated a strong trend looking to remove these redundancies from the models, including pruning, quantization and low-rank factorization (Liang et al., 2021). In the iris recognition literature, a few works have applied such compression techniques. Tann et al. (Tann et al., 2019) quantized the 64-bit floating-point weights and activations of the full FCN-based iris segmentation model into an 8-bit dynamic fixed-point (DFP) format, which provides an 8× memory saving as well as a speed-up due to the reduced complexity of lower-precision operations. A modern counterpart of this idea is sketched below.
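The sketch applies post-training dynamic quantization of the linear layers of a toy encoder to 8-bit integers; the layer types quantized and the encoder itself are assumptions and differ from the original DFP scheme, but the compression principle is the same.

```python
import torch
import torch.nn as nn

# Toy encoder taking a 64x512 normalized iris strip and producing a 256-D code.
encoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(64 * 512, 1024), nn.ReLU(),
    nn.Linear(1024, 256),
)

# Post-training dynamic quantization: Linear weights are stored as int8 and
# activations are quantized on the fly, reducing memory roughly 4x vs. float32.
quantized = torch.ao.quantization.quantize_dynamic(
    encoder, {nn.Linear}, dtype=torch.qint8)

x = torch.rand(1, 1, 64, 512)
print(quantized(x).shape)                    # torch.Size([1, 256])

def size_mb(m):
    return sum(p.numel() * p.element_size() for p in m.parameters()) / 1e6
print(size_mb(encoder))                      # float32 parameter size in MB, for comparison
```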
Neural Architecture Search: Neural Architecture Search (NAS) automates the process of architecture design of neural networks by iteratively sampling a population of child networks, evaluating the child models' performance metrics as rewards, and learning to generate high-performance architecture candidates (Elsken et al., 2019). In the iris recognition literature, Nguyen et al. (Nguyen et al., 2020) showed that computation and memory can be incorporated into the NAS formulation to enable the resource-constrained design of deep iris networks.
9.2. Human-interpretable methods and XAI
With hundreds of layers and millions of parameters, deep learning networks are usually opaque “black boxes” whose predictions humans struggle to understand. This necessitates approaches that make deep learning methods more interpretable and understandable to humans. Interestingly, the need for human-interpretable methods was raised even in the handcrafted era. For example, Shen et al. published a series of works (Chen et al., 2016; Shen and Flynn, 2013) on using iris crypts for iris matching. Iris crypts are clearly visible to humans, in a similar way to fingerprint minutiae. Another example is the macro-features approach (Sunder and Ross, 2010), which uses SIFT to detect keypoints and performs iris matching based on these keypoints (Quinn, G. and Matey, J. and Grother, P. and Watters, E., 2022). Another notable work is by Proença et al. (Proença and Neves, 2017), who proposed a deformation field to represent the correspondence between two iris images.
From a deep learning perspective, researchers have also attempted to visualize the matching. Kuehlkamp et al. (Kuehlkamp et al., 2022) argued that existing iris recognition methods offer limited and non-standard means of visualization to let human examiners interpret the model output. They applied Class Activation Maps (CAM) (Zhou et al., 2016) to visualize the level of contribution of each iris region to the overall matching score. Similarly, Nguyen et al. (Nguyen et al., 2022) decomposed the final matching score to the pixel level to visualize the contribution of each pixel to the overall matching score.
9.3. Deep learning-based synthetic iris generation
Data synthesis provides an alternative to time- and resource-consuming database collection. One could create as many images as desired, with new textures that do not even match any existing identity, which would also avoid privacy problems. On the other hand, fake irises that are indistinguishable from real ones can be used for identity concealment attacks (if the image does not match any identity) or impersonation attacks (if the image resembles an existing identity) (Czajka and Bowyer, 2018). Indeed, synthetic irises are present in databases employed for iris PAD, such as CASIA-Iris-Fake (Table 5).
Regardless of the purpose or the ability to detect whether an image is synthetic, Generative Adversarial Networks (GANs) (Goodfellow et al., 2014) have shown impressive photo-realistic generation capabilities in many domains. GANs learn to model image distributions through an adversarial process, where a discriminator assesses the realism of images synthesized by a generator. At the end, the generator has learned the distribution of the training data and is able to synthesize new images with the same characteristics.
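The adversarial process can be summarized by the minimal training step below. The tiny fully-connected generator and discriminator, the latent size and the 32×32 iris-patch size are illustrative assumptions, far simpler than RaSGAN or CIT-GAN; the point is only the alternating discriminator/generator updates.

```python
import torch
import torch.nn as nn

latent_dim, patch = 64, 32 * 32
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, patch), nn.Tanh())
D = nn.Sequential(nn.Linear(patch, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.rand(16, patch) * 2 - 1              # placeholder real iris patches in [-1, 1]

# Discriminator step: push real samples towards 1 and generated samples towards 0.
fake = G(torch.randn(16, latent_dim)).detach()
d_loss = bce(D(real), torch.ones(16, 1)) + bce(D(fake), torch.zeros(16, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to fool the discriminator into predicting 1 for generated samples.
fake = G(torch.randn(16, latent_dim))
g_loss = bce(D(fake), torch.ones(16, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(float(d_loss), float(g_loss))
```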
For iris generation, some methods by Yadav et al. (Yadav and Ross, 2021; Yadav et al., 2020, 2019a) were mentioned in iris PAD contexts (Section 4.4). RaSGAN (Yadav et al., 2020, 2019a) followed the traditional approach of driving the generation/discrimination training by randomly sampling so-called latent vectors from a probabilistic distribution. As training progresses, the generator learns to associate features of the latent vectors with semantically meaningful attributes that naturally vary in the images. However, this does not impose any restriction on the relationship between features in the latent space and factors of variation in the image domain, making it difficult to decode what the latent vectors represent. As a result, the image characteristics (eye color, eyelid shape, eyelashes, gender, age…) are generated randomly. Kohli et al. (Kohli et al., 2017) presented iDCGAN for iris PAD, which also followed the latent vector sampling concept. To counteract this issue, researchers have tried to incorporate constraints or mechanisms that guide the generation process towards a desired characteristic. For example, CIT-GAN (Yadav and Ross, 2021) employed a Styling Network that learns the style characteristics of each given domain, while taking as input a domain label that drives the network to embed a desired style into the generated data.
In a similar direction, Kaur and Manduchi (Kaur and Manduchi, 2021, 2020) proposed to synthesize eye images with a desired style (skin color, texture, iris color, identity) using an encoder-decoder ResNet. The method is aimed at manipulating gaze, so the generator receives a segmentation mask with the desired gaze, and an image with the style that will see its gaze modified. To achieve cross-spectral recognition, Hernandez-Diaz et al. (Hernandez-Diaz et al., 2020) used CGANs to convert ocular images between VW and NIR spectra while keeping identity, so comparisons are done in the same spectrum. This allows the use of existing feature methods, which are typically optimized to operate in a single spectrum.
Despite great advances in DL-based synthetic image generation, one open problem is the possible identity leakage from the training set when creating data of non-existing identities, resulting in privacy issues. This has recently been revealed in face generation (Tinsley et al., 2021). Another issue, in the opposite direction, is the difficulty of preserving identity in the generation process when the target is precisely to create images of an existing identity with different properties. This issue is being addressed in face generation methods [reference under review], but is lacking in iris synthesis research.
9.4. Deep learning-based iris super-resolution
One of the main constraints of existing iris recognition systems is the short image acquisition distance, which usually requires a subject to stay still at less than 60 cm from the iris camera. This is due to the requirement of a high-resolution iris region, e.g. 120 pixels across the iris diameter according to the European and NIST standards, despite the small physical size of the eye. The lack of resolution of imaging systems has critically adverse impacts on the performance of biometric systems, especially in less constrained conditions and long-range surveillance applications (Nguyen et al., 2018).
Super-resolution, one of the core innovations in computer vision, has been an attractive but challenging solution to address the low-resolution problem in both general imaging systems and biometric systems. Deep learning-based super-resolution approaches have appeared in multiple works on iris recognition. Ribeiro et al. (Ribeiro et al., 2017; Ribeiro et al., 2019) experimented with two deep learning single-image super-resolution approaches: Stacked Auto-Encoders (SAE) and Convolutional Neural Networks (CNN). Both approaches learn an encoder to map high-resolution iris images to the low-resolution domain, and a decoder to reconstruct the original high-resolution images from the low-resolution ones. Zhang et al. (Zhang et al., 2016a) trained a single CNN to learn a non-linear mapping function from LR images to HR images for mobile iris recognition. Wang et al. (Wang et al., 2019a) extended the single CNN to two CNNs: one generator CNN and one discriminator CNN, as in the GAN architecture. The generator functions similarly to the single LR-HR mapping CNN. Adding the discriminator CNN allows them to force the generator to produce HR images that are not just of visually higher resolution but also preserve the identity of the iris. Mostofa et al. (Mostofa et al., 2021) incorporated a GAN-based photo-realistic super-resolution approach (Ledig et al., 2017) to improve the resolution of LR iris images from the NIR domain before cross-matching the HR outputs with HR images from the RGB domain. While these approaches showed improved performance, dealing with noisy data in cases such as iris at a distance and on the move could require the quality of the input iris image to be included in the super-resolution process (Nguyen et al., 2011). In addition, Nguyen et al. argued that a fundamental difference exists between conventional super-resolution motivations and those required for biometrics, hence proposing to perform super-resolution at the feature level, explicitly targeting the representation used by recognition (Nguyen et al., 2012).
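A bare-bones single-image super-resolution CNN of the kind described above (an SRCNN-style mapping from an upscaled LR image to its HR counterpart, not any of the cited models) might look like the sketch below; the layer widths, kernel sizes and the ×2 factor are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IrisSRCNN(nn.Module):
    """Map a bicubically upscaled low-resolution iris image to a sharper estimate."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 9, padding=4), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 5, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 5, padding=2))

    def forward(self, lr_image):
        upscaled = F.interpolate(lr_image, scale_factor=2, mode="bicubic",
                                 align_corners=False)
        return upscaled + self.body(upscaled)      # residual refinement of the upscaled image

model = IrisSRCNN()
lr = torch.rand(1, 1, 60, 80)                      # low-resolution iris crop
hr_pred = model(lr)                                # (1, 1, 120, 160)
mse = F.mse_loss(hr_pred, torch.rand(1, 1, 120, 160))   # placeholder HR training target
print(hr_pred.shape, float(mse))
```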
9.5. Privacy in deep learning-based iris recognition
Privacy is becoming a key issue in computer vision and machine learning domains. In particular, it is accepted that the accuracy attained by deep learning models depends on the availability of large amounts of visual data, which stresses the need for privacy-preserving recognition solutions.
In short, the goal of privacy-preserving deep learning is to appropriately train models while preserving the privacy of the training datasets. While the utility of this kind of solution is obvious, there are certain concerns about the training data that supported the model creation, as the collection of images from a large number of individuals comes with significant privacy risks. In particular, it should be considered that the subjects from whom the data were collected can neither delete nor control what will actually be learned from their data.
Like most existing biometric technologies, DL-based iris recognition poses challenges to privacy, which are even more concerning considering the data-driven nature of such systems. Particular attention should be paid to avoiding function creep, guaranteeing that a system resulting from a set of data is not used for a purpose different from the one originally communicated to the individuals at the time they provided their information. Covert collection is another major concern, which is particularly important for the iris trait, given the possibility of imaging it from large distances and in a surreptitious way.
Particular attention has been paid to the development of fair recognition systems, in the sense that these systems should attain similar effectiveness across different subgroups of the population, with respect to features such as gender, age, race or ethnicity. For data-driven systems, this might be a relevant challenge, considering that most of the existing datasets that support the learned systems have evident biases with regard to the subjects' characteristics above.
Lastly, from a more general machine learning perspective, potential attacks on the learned models have concerned the research community and have been the scope of various recent works, attempting to provide defense mechanisms against: i) model inversion attacks, which aim to reconstruct the training data from the model parameters (e.g., (Khosravy et al., 2022) and (He et al., 2022)); ii) membership inference attacks, which attempt to infer whether an individual was part of a training set (e.g., (Hu et al., 2022) and (Song et al., 2019)); and iii) training data extraction attacks, which aim to recover individual training samples by querying the models (e.g., (Khalid et al., 2019) and (Ding et al., 2022)).
9.6. Deep learning-based iris segmentation
Being one of the earliest phases of the recognition process, segmentation is known as one of the most challenging, as it is on the front line for facing the dynamics of the data acquisition environments. This is particularly true in the case of less constrained data acquisition protocols, where the resulting data have highly varying features and the particular conditions of each environment strongly determine the most likely data covariates.
In the segmentation context, the main challenge remains the development of methods robust to cross-domain settings, i.e., able to segment the iris region for a broad range of image features, e.g., in terms of: 1) illumination, 2) scale, 3) gaze, 4) occlusions, 5) rotation and 6) pose, corresponding to acquisition in very different environments. Over the past decades, many research groups have devoted their attention to improving the robustness of iris segmentation, which is known to be a primary factor in the final effectiveness of the recognition process. In this timeline, the proposed segmentation methods can be roughly grouped into three categories: 1) boundary-based methods (using the integro-differential operator or Hough transform); 2) methods based on handcrafted features (particularly suited for non-cooperative recognition, e.g., (Tan et al., 2010) and (Tan and Kumar, 2012)); and 3) DL-based solutions.
For the latter family of methods, the emerging trends are closely related to the general challenges of DL-based segmentation frameworks, namely obtaining interpretable models that allow us to perceive what exactly these systems are learning, or the minimal neural architecture that guarantees a predefined level of accuracy. Also, the development of weakly supervised or even unsupervised frameworks is another grand challenge, as it is accepted that such systems will likely adapt better to previously unseen data acquisition conditions. Finally, the computational cost of segmentation (both in terms of space and time) is another concern, with special impact on the deployment of this kind of framework in mobile and IoT settings (Saleh, 2018).
9.7. Deep learning-based iris recognition in visible wavelengths
Having been a topic of study for over a decade (e.g. (Liu et al., 2019) and (Proença, 2013)), iris recognition in visible wavelengths remains essentially an interesting possibility for delivering biometric recognition from large distances (in conditions typically associated with visual surveillance settings) and in handheld commercial devices, such as smartphones.
The emerging trends in this scope concern the development of alternative ways to analyze the multi-spectral information available in visible-light data (typically RGB), i.e., by developing deep learning architectures optimized for fusion, either at the data, feature, score or decision level (Bigdeli et al., 2021).
In the visual surveillance setting, the main challenge concerns the development of optimized data acquisition settings, profiting from advances in remote sensing technologies, which should be able to augment the quality (e.g., resolution and sharpness) of the obtained irises. In this scope, research on active data acquisition technologies (based on PTZ devices, or similar) might also be an interesting emerging possibility (Han, 2021).
10. Conclusions
Motivated by the tremendous success of DL-based solutions for many everyday problems, machine learning is entering one of its golden eras, attracting growing interest from the research, commercial and governmental communities. In short, deep learning uses multiple layers to represent abstractions of data in order to build computational models that - even in a somewhat surprising way - typically surpass the previous generation of handcrafted approaches. However, being extremely data-driven, the effectiveness of DL-based solutions is typically constrained by the availability of massive amounts of data, annotated in a consistent way.
As in most computer vision topics, a myriad of DL-based techniques has been proposed over the last years to perform biometric recognition, and - in particular - iris recognition. Nowadays, the existing methods cover all phases of the typical processing chain, from preprocessing, segmentation and feature extraction up to the matching and recognition steps.
Accordingly, this article provides the first comprehensive review of the historical and state-of-the-art approaches in DL-based techniques for iris recognition, followed by an in-depth analysis of pivotal and groundbreaking advances in each phase of the processing chain. We summarize and critically compare the most relevant methods for the iris acquisition, segmentation, quality assessment, feature encoding, matching and recognition problems, also presenting the most relevant open problems for each phase.
Finally, we review the typical issues faced by DL-based methods in this domain of expertise, such as unsupervised learning, black-box models, and online learning, and illustrate how these challenges can open prolific future research paths and solutions.
Acknowledgements.
We would like to thank Adam Czajka from the University of Notre Dame, USA for his contribution to an early version of this survey paper in Sections 1, 5 and 6. The work of Hugo Proença was funded by FCT/MEC through national funds and co-funded by FEDER - PT2020 partnership agreement under the projects UIDB/50008/2020 and POCI-01-0247-FEDER-033395. Author Alonso-Fernandez thanks the Swedish Innovation Agency VINNOVA (projects MIDAS and DIFFUSE) and the Swedish Research Council (project 2021-05110) for funding his research.

References
- Agarwal et al. (2022a) A. Agarwal, A. Noore, M. Vatsa, and R. Singh. 2022a. Enhanced iris presentation attack detection via contraction-expansion CNN. Pattern Recognition Letters 159 (2022), 61–69.
- Agarwal et al. (2022b) A. Agarwal, A. Noore, M. Vatsa, and R. Singh. 2022b. Generalized Contact Lens Iris Presentation Attack Detection. IEEE Transactions on Biometrics, Behavior, and Identity Science (2022), 1–1.
- Ahmad and Fuller (2019) S. Ahmad and B. Fuller. 2019. ThirdEye: Triplet Based Iris Recognition without Normalization. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–9.
- Alonso-Fernandez and Bigun (2016) F. Alonso-Fernandez and J. Bigun. 2016. A survey on periocular biometrics research. Pattern Recognition Letters 82 (2016), 92–105.
- Alonso-Fernandez et al. (2012) F. Alonso-Fernandez, J. Fierrez, and J. Ortega-Garcia. 2012. Quality Measures in Biometric Systems. IEEE Security and Privacy 10, 6 (2012), 52–62.
- Alonso-Fernandez et al. (2021) F. Alonso-Fernandez, K. Hernandez-Diaz, S. Ramis, F. J. Perales, and J. Bigun. 2021. Facial masks and soft-biometrics: Leveraging face recognition CNNs for age and gender prediction on mobile ocular images. IET Biometrics 10, 5 (2021).
- Anisetti et al. (2019) M. Anisetti, Y.-H. Li, P.-J. Huang, and Y. Juan. 2019. An Efficient and Robust Iris Segmentation Algorithm Using Deep Learning. Mobile Information Systems (2019), 4568929.
- Arora and Bhatia (2020) S. Arora and M.P.S. Bhatia. 2020. Presentation attack detection for iris recognition using deep learning. Int J Syst Assur Eng Manag 11 (2020), 232–238.
- Badrinarayanan et al. (2017) V. Badrinarayanan, A. Kendall, and R. Cipolla. 2017. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 12 (2017), 2481–2495.
- Banerjee et al. (2022) A. Banerjee, C. Ghosh, and S. N. Mandal. 2022. Analysis of V-Net Architecture for Iris Segmentation in Unconstrained Scenarios. SN Computer Science 3 (2022).
- Bigdeli et al. (2021) B. Bigdeli, P. Pahlavani, and H. A. Amirkolaee. 2021. An ensemble deep learning method as data fusion system for remote sensing multisensor classification. Applied Soft Computing 110 (2021), 107563.
- Bolme et al. (2016) D. S. Bolme, R. A. Tokola, C. B. Boehnen, T. B. Saul, K. A. Sauerwein, and D. W. Steadman. 2016. Impact of environmental factors on biometric matching during human decomposition. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). IEEE, USA, 1–8.
- Boutros et al. (2020) F. Boutros, N. Damer, K. Raja, R. Ramachandra, F. Kirchbuchner, and A. Kuijper. 2020. On Benchmarking Iris Recognition within a Head-mounted Display for AR/VR Applications. In IEEE Int. Joint Conf. on Biometrics (IJCB). USA.
- Boyd et al. (2022) A. Boyd, K. Bowyer, and A. Czajka. 2022. Human-Aided Saliency Maps Improve Generalization of Deep Learning. In IEEE Winter Conference on Applications of Computer Vision (WACV).
- Boyd et al. (2019) A. Boyd, A. Czajka, and K. Bowyer. 2019. DL-Based Feature Extraction in Iris Recognition: Use Existing Models, Fine-tune or Train From Scratch?. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–9.
- Boyd et al. (2020a) A. Boyd, Z. Fang, A. Czajka, and K. Bowyer. 2020a. Iris presentation attack detection: Where are we now? Pattern Recognition Letters 138 (2020), 483–489.
- Boyd et al. (2021) A. Boyd, P. Tinsley, K. Bowyer, and A. Czajka. 2021. CYBORG: Blending Human Saliency Into the Loss Improves Deep Learning. arXiv:2112.00686 [cs.CV]
- Boyd et al. (2020b) A. Boyd, S. Yadav, T. Swearingen, A. Kuehlkamp, M. Trokielewicz, E. Benjamin, P. Maciejewicz, D. Chute, A. Ross, P. Flynn, K. Bowyer, and A. Czajka. 2020b. Post-Mortem Iris Recognition—A Survey and Assessment of the State of the Art. IEEE Access 8 (2020), 136570–136593.
- Chen and Ross (2021) C. Chen and A. Ross. 2021. An Explainable Attention-Guided Iris Presentation Attack Detector. In IEEE Workshop on Applications of Computer Vision (WACV). 97–106.
- Chen et al. (2016) J. Chen, F. Shen, D. Z. Chen, and P. Flynn. 2016. Iris Recognition Based on Human-Interpretable Features. IEEE Transactions on Information Forensics and Security 11, 7 (2016), 1476–1485.
- Chen et al. (2017) L. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. Yuille. 2017. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 4 (2017).
- Chen et al. (2019a) Y. Chen, W. Wang, Z. Zeng, and Y. Wang. 2019a. An Adaptive CNNs Technology for Robust Iris Segmentation. IEEE Access 7 (2019), 64517–64532.
- Chen et al. (2020) Y. Chen, C. Wu, and Y. Wang. 2020. T-Center: A Novel Feature Extraction Approach Towards Large-Scale Iris Recognition. IEEE Access 8 (2020), 32365–32375.
- Chen et al. (2019b) Y. Chen, Z. Zeng, and F. Hu. 2019b. End to End Robust Recognition Method for Iris Using a Dense Deep Convolutional Neural Network. In Biometric Recognition. Springer, Cham, 364–375.
- Choudhary et al. (2021) M. Choudhary, V. Tiwari, and Venkanna U. 2021. Ensuring Secured Iris Authentication for Mobile Devices. In IEEE International Conference on Consumer Electronics (ICCE). 1–5.
- Choudhary et al. (2022a) M. Choudhary, V. Tiwari, and U. Venkanna. 2022a. Iris Liveness Detection Using Fusion of Domain-Specific Multiple BSIF and DenseNet Features. IEEE Transactions on Cybernetics 52, 4 (2022), 2370–2381.
- Choudhary et al. (2022b) M. Choudhary, V. Tiwari, and U. Venkanna. 2022b. Identifying discriminatory feature-vectors for fusion-based iris liveness detection. Journal of Ambient Intelligence and Humanized Computing (2022).
- Czajka (2013) A. Czajka. 2013. Database of iris printouts and its application: Development of liveness detection method for iris recognition. In International Conference on Methods and Models in Automation and Robotics (MMAR). 28–33.
- Czajka (2021) A. Czajka. 2021. Iris recognition designed for post-mortem and diseased eyes. (2021). https://github.com/aczajka/iris-recognition---pm-diseased-human-driven-bsif
- Czajka and Bowyer (2018) A. Czajka and K. Bowyer. 2018. Presentation Attack Detection for Iris Recognition: An Assessment of the State-of-the-Art. ACM Comput. Surv. 51, 4, Article 86 (Jul 2018), 35 pages.
- Czajka et al. (2019) A. Czajka, Z. Fang, and K. Bowyer. 2019. Iris Presentation Attack Detection Based on Photometric Stereo Features. In IEEE Winter Conference on Applications of Computer Vision (WACV). 877–885.
- Damer et al. (2019) N. Damer, K. Dimitrov, A. Braun, and A. Kuijper. 2019. On Learning Joint Multi-biometric Representations by Deep Fusion. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–8.
- Das et al. (2020) P. Das, J. McGrath, Z. Fang, A. Boyd, G. Jang, A. Mohammadi, S. Purnapatra, D. Yambay, S. Marcel, M. Trokielewicz, P. Maciejewicz, K. Bowyer, A. Czajka, S. Schuckers, J. Tapia, S. Gonzalez, M. Fang, N. Damer, F. Boutros, A. Kuijper, R. Sharma, C. Chen, and A. Ross. 2020. Iris Liveness Detection Competition (LivDet-Iris) - The 2020 Edition. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–9.
- Daugman (1993) J. Daugman. 1993. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans. Pattern Anal. Mach. Intell. 15, 11 (1993), 1148–1161.
- Daugman (2007) J. Daugman. 2007. New methods in iris recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 37, 5 (2007), 1167–1175.
- Daugman (2016) J. Daugman. 2016. Information Theory and the IrisCode. IEEE Transactions on Information Forensics and Security 11 (2016), 400–409.
- Daugman (2021) J. Daugman. 2021. Collision Avoidance on National and Global Scales: Understanding and Using Big Biometric Entropy. TechRxiv (2021).
- De Marsico et al. (2015) M. De Marsico, M. Nappi, D. Riccio, and H. Wechsler. 2015. Mobile Iris Challenge Evaluation (MICHE)-I, biometric iris dataset and protocols. Pattern Recognition Letters 57 (2015), 17–23.
- Ding et al. (2022) X. Ding, H. Fang, Z. Zhang, K.-K. R. Choo, and H. Jin. 2022. Privacy-Preserving Feature Extraction via Adversarial Training. IEEE Transactions on Knowledge and Data Engineering 34, 4 (2022), 1967–1979.
- Dong et al. (2009) W. Dong, Z. Sun, and T. Tan. 2009. A Design of Iris Recognition System at a Distance. In Chinese Conference on Pattern Recognition. 1–5.
- Doyle and Bowyer (2015) J. S. Doyle and K. Bowyer. 2015. Robust Detection of Textured Contact Lenses in Iris Recognition Using BSIF. IEEE Access 3 (2015), 1672–1683.
- Doyle et al. (2013) J. S. Doyle, K. Bowyer, and P. Flynn. 2013. Variation in accuracy of textured contact lens detection based on sensor and lens pattern. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–7.
- Elsken et al. (2019) T. Elsken, J. H. Metzen, and F. Hutter. 2019. Neural architecture search: A survey. J. Mach. Learn. Res. 20, 55 (2019).
- Fang et al. (2020a) M. Fang, N. Damer, F. Boutros, F. Kirchbuchner, and A. Kuijper. 2020a. Deep Learning Multi-layer Fusion for an Accurate Iris Presentation Attack Detection. In IEEE International Conference on Information Fusion (FUSION). 1–8.
- Fang et al. (2021b) M. Fang, N. Damer, F. Boutros, F. Kirchbuchner, and A. Kuijper. 2021b. Cross-database and cross-attack Iris presentation attack detection using micro stripes analyses. Image and Vision Computing 105 (2021), 104057.
- Fang et al. (2021c) M. Fang, N. Damer, F. Boutros, F. Kirchbuchner, and A. Kuijper. 2021c. Iris Presentation Attack Detection by Attention-based and Deep Pixel-wise Binary Supervision Network. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–8.
- Fang et al. (2021d) M. Fang, N. Damer, F. Boutros, F. Kirchbuchner, and A. Kuijper. 2021d. The overlapping effect and fusion protocols of data augmentation techniques in iris PAD. Machine Vision and Applications 33 (2021).
- Fang et al. (2020b) M. Fang, N. Damer, F. Kirchbuchner, and A. Kuijper. 2020b. Micro Stripes Analyses for Iris Presentation Attack Detection. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–10.
- Fang et al. (2021e) M. Fang, N. Damer, F. Kirchbuchner, and A. Kuijper. 2021e. Demographic Bias in Presentation Attack Detection of Iris Recognition Systems. In 28th European Signal Processing Conference (EUSIPCO). 835–839.
- Fang and Czajka (2020) Z. Fang and A. Czajka. 2020. Open Source Iris Recognition Hardware and Software with Presentation Attack Detection. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–8.
- Fang et al. (2021a) Z. Fang, A. Czajka, and K. Bowyer. 2021a. Robust Iris Presentation Attack Detection Fusing 2D and 3D Information. IEEE Transactions on Information Forensics and Security 16 (2021), 510–520.
- FBI Criminal Justice Information Services (CJIS) Division (2021) FBI Criminal Justice Information Services (CJIS) Division. 2021. Next Generation Identification (NGI). Retrieved 2021 from https://www.fbi.gov/services/cjis/fingerprints-and-other-biometrics/ngi
- Fierrez et al. (2007) J. Fierrez, J. Ortega-Garcia, D. Torre Toledano, and J. Gonzalez-Rodriguez. 2007. Biosec baseline corpus: A multimodal biometric database. Pattern Recognition 40, 4 (2007), 1389–1392.
- Galbally and Gomez-Barrero (2016) J. Galbally and M. Gomez-Barrero. 2016. A review of iris anti-spoofing. In 4th International Conference on Biometrics and Forensics (IWBF). 1–6.
- Ganeeva and Myasnikov (2020) Y. Ganeeva and E. Myasnikov. 2020. Using Convolutional Neural Networks for Segmentation of Iris Images. In International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon). 1–4.
- Gangwar and Joshi (2016) A. Gangwar and A. Joshi. 2016. DeepIrisNet: Deep iris representation with applications in iris recognition and cross-sensor iris recognition. In IEEE International Conference on Image Processing. 2301–2305.
- Gangwar et al. (2019) A. Gangwar, A. Joshi, P. Joshi, and R. Raghavendra. 2019. DeepIrisNet2: Learning Deep-IrisCodes from Scratch for Segmentation-Robust Visible Wavelength and Near Infrared Iris Recognition. CoRR abs/1902.05390 (2019).
- Garbin et al. (2019) S. J. Garbin, Y. Shen, I. Schuetz, R. Cavin, G. Hughes, and S. S. Talathi. 2019. OpenEDS: Open Eye Dataset. CoRR abs/1905.03702 (2019). arXiv:1905.03702
- Gautam et al. (2022) G. Gautam, A. Raj, and S. Mukhopadhyay. 2022. Deep supervised class encoding for iris presentation attack detection. Digital Signal Processing 121 (2022), 103329.
- Goodfellow et al. (2014) I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. Generative Adversarial Nets. In International Conference on Neural Information Processing Systems (NIPS), Canada. 9 pages.
- Gragnaniello et al. (2016) D. Gragnaniello, C. Sansone, G. Poggi, and L. Verdoliva. 2016. Biometric Spoofing Detection by a Domain-Aware Convolutional Neural Network. In International Conference on Signal-Image Technology and Internet-Based Systems (SITIS).
- Grother and Tabassi (2007) P. Grother and E. Tabassi. 2007. Performance of Biometric Quality Measures. IEEE Trans. Pattern Anal. Mach. Intell. 29 (2007), 531–543.
- Gupta et al. (2021) M. Gupta, S. Singh, A. Agarwal, M. Vatsa, and R. Singh. 2021. Generalized Iris Presentation Attack Detection Algorithm under Cross-Database Settings. In Int. Conf. on Pattern Recognition (ICPR). 5318–5325.
- Gupta et al. (2014) P. Gupta, S. Behera, M. Vatsa, and R. Singh. 2014. On Iris Spoofing Using Print Attack. In Int. Conf. on Pattern Recognition (ICPR). 1681–1686.
- Hafner et al. (2021) A. Hafner, P. Peer, Ž. Emeršič, and M. Vitek. 2021. Deep Iris Feature Extraction. In International Conference on Artificial Intelligence in Information and Communication (ICAIIC). 258–262.
- Han (2021) Y. Han. 2021. Design of An Active Infrared Iris Recognition Device. In IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC). 435–437.
- He et al. (2022) Y. He, G. Meng, K. Chen, X. Hu, and J. He. 2022. Towards Security Threats of Deep Learning Systems: A Survey. IEEE Transactions on Software Engineering 48, 5 (2022), 1743–1770.
- He et al. (2009) Z. He, T. Tan, Z. Sun, and X. Qiu. 2009. Toward Accurate and Fast Iris Segmentation for Iris Biometrics. IEEE Trans. Pattern Anal. Mach. Intell. 31, 9 (2009), 1670–1684.
- Hedman et al. (2021) P. Hedman, V. Skepetzis, K. Hernandez-Diaz, J. Bigün, and F. Alonso-Fernandez. 2021. On the Effect of Selfie Beautification Filters on Face Detection and Recognition. CoRR abs/2110.08934 (2021). arXiv:2110.08934
- Hernandez-Diaz et al. (2018) K. Hernandez-Diaz, F. Alonso-Fernandez, and J. Bigun. 2018. Periocular Recognition Using CNN Features Off-the-Shelf. In International Conference of the Biometrics Special Interest Group (BIOSIG). 1–5.
- Hernandez-Diaz et al. (2020) K. Hernandez-Diaz, F. Alonso-Fernandez, and J. Bigun. 2020. Cross-Spectral Periocular Recognition with Conditional Adversarial Networks. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–9.
- Hofbauer et al. (2019) H. Hofbauer, E. Jalilian, and A. Uhl. 2019. Exploiting superior CNN-based iris segmentation for better recognition accuracy. Pattern Recognition Letters 120 (2019), 17–23.
- Howard et al. (2019) A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, Q. V. Le, and H. Adam. 2019. Searching for MobileNetV3. In IEEE Int. Conf. on Computer Vision (ICCV).
- Hsiao and Fan (2021) C.-S. Hsiao and C.-P. Fan. 2021. EfficientNet Based Iris Biometric Recognition Methods with Pupil Positioning by U-Net. In 3rd International Conference on Computer Communication and the Internet (ICCCI). 1–5.
- Hu et al. (2022) L. Hu, J. Li, G. Lin, S. Peng, Z. Zhang, Y. Zhang, and C. Dong. 2022. Defending against Membership Inference Attacks with High Utility by GAN. IEEE Transactions on Dependable and Secure Computing (2022), 1–1.
- Huynh et al. (2019) V. T. Huynh, S.-H. Kim, G.-S. Lee, and H.-J. Yang. 2019. Eye Semantic Segmentation with A Lightweight Model. In IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). 3694–3697.
- Hwang and Lee (2020) H. Hwang and E. C. Lee. 2020. Near-Infrared Image-Based Periocular Biometric Method Using Convolutional Neural Network. IEEE Access 8 (2020), 158612–158621.
- Jain et al. (2021) A. Jain, D. Deb, and J. Engelsma. 2021. Biometrics: Trust, but Verify. IEEE Transactions on Biometrics, Behavior, and Identity Science (2021), 1–1.
- Jalilian et al. (2020) E. Jalilian, M. Karakaya, and A. Uhl. 2020. End-to-end Off-angle Iris Recognition Using CNN Based Iris Segmentation. In International Conference of the Biometrics Special Interest Group (BIOSIG). 1–7.
- Jalilian et al. (2022) E. Jalilian, G. Wimmer, A. Uhl, and M. Karakaya. 2022. Deep Learning Based Off-Angle Iris Recognition. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). 4048–4052.
- Johnson et al. (2010) P. A. Johnson, P. Lopez-Meyer, N. Sazonova, F. Hua, and S. Schuckers. 2010. Quality in face and iris research ensemble (Q-FIRE). In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–6.
- Jung et al. (2020) Y. G. Jung, C. Y. Low, J. Park, and A. B. J. Teoh. 2020. Periocular Recognition in the Wild With Generalized Label Smoothing Regularization. IEEE Signal Processing Letters 27 (2020), 1455–1459.
- Kaur and Manduchi (2020) H. Kaur and R. Manduchi. 2020. EyeGAN: Gaze–Preserving, Mask–Mediated Eye Image Synthesis. In IEEE Winter Conference on Applications of Computer Vision (WACV). 299–308.
- Kaur and Manduchi (2021) H. Kaur and R. Manduchi. 2021. Subject Guided Eye Image Synthesis with Application to Gaze Redirection. In IEEE Winter Conference on Applications of Computer Vision (WACV). 11–20.
- Kerrigan et al. (2019) D. Kerrigan, M. Trokielewicz, A. Czajka, and K. Bowyer. 2019. Iris Recognition with Image Segmentation Employing Retrained Off-the-Shelf Deep Neural Networks. In IEEE Int. Conf. on Biometrics (ICB). 1–7.
- Khalid et al. (2019) F. Khalid, M. A. Hanif, S. Rehman, R. Ahmed, and M. Shafique. 2019. TrISec: Training Data-Unaware Imperceptible Security Attacks on Deep Neural Networks. In IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS). 188–193.
- Khosravy et al. (2022) M. Khosravy, K. Nakamura, Y. Hirose, N. Nitta, and N. Babaguchi. 2022. Model Inversion Attack by Integration of Deep Generative Models: Privacy-Sensitive Face Generation From a Face Recognition System. IEEE Transactions on Information Forensics and Security 17 (2022), 357–372.
- Kim et al. (2018) M. C. Kim, J. H. Koo, S. W. Cho, N. R. Baek, and K. R. Park. 2018. Convolutional Neural Network-Based Periocular Recognition in Surveillance Environments. IEEE Access 6 (2018), 57291–57310.
- Kohli et al. (2013) N. Kohli, D. Yadav, M. Vatsa, and R. Singh. 2013. Revisiting iris recognition with color cosmetic contact lenses. In IEEE Int. Conf. on Biometrics (ICB). 1–7.
- Kohli et al. (2017) N. Kohli, D. Yadav, M. Vatsa, R. Singh, and A. Noore. 2017. Synthetic iris presentation attack using iDCGAN. In IEEE Int. Joint Conf. on Biometrics (IJCB). 674–680.
- Kuehlkamp et al. (2022) A. Kuehlkamp, A. Boyd, A. Czajka, K. Bowyer, P. Flynn, D. Chute, and E. Benjamin. 2022. Interpretable Deep Learning-Based Forensic Iris Segmentation and Recognition. In 2nd WACV Workshop on Explainable and Interpretable Artificial Intelligence for Biometrics (XAI4B). IEEE, USA, 1–8.
- Kumar and Passi (2010) A. Kumar and A. Passi. 2010. Comparison and combination of iris matchers for reliable personal authentication. Pattern Recognition 43, 3 (2010), 1016–1026.
- Ledig et al. (2017) C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi. 2017. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 105–114.
- Li et al. (2021) Y.-H. Li, W. R. Putri, M. S. Aslam, and C.-C. Chang. 2021. Robust Iris Segmentation Algorithm in Non-Cooperative Environments Using Interleaved Residual U-Net. Sensors 21, 4 (2021).
- Liang et al. (2021) T. Liang, J. Glossner, L. Wang, S. Shi, and X. Zhang. 2021. Pruning and quantization for deep neural network acceleration: A survey. Neurocomputing 461 (2021), 370–403.
- Lin et al. (2016) G. Lin, A. Milan, C. Shen, and I. Reid. 2016. RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation. arXiv:1611.06612 [cs.CV]
- Liu et al. (2016b) N. Liu, M. Zhang, H. Li, Z. Sun, and T. Tan. 2016b. DeepIris: Learning pairwise filter bank for heterogeneous iris verification. Pattern Recognition Letters 82 (2016), 154–161.
- Liu et al. (2016a) W. Liu, A. Rabinovich, and A. Berg. 2016a. ParseNet: Looking Wider to See Better. In International Conference on Learning Representations (ICLR).
- Liu et al. (2019) X. Liu, Y. Bai, Y. Luo, Z. Yang, and Y. Liu. 2019. Iris recognition in visible spectrum based on multi-layer analogous convolution and collaborative representation. Pattern Recognition Letters 117 (2019), 66–73.
- Long et al. (2015) J. Long, E. Shelhamer, and T. Darrell. 2015. Fully convolutional networks for semantic segmentation. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 3431–3440.
- Lozej et al. (2018) J. Lozej, B. Meden, V. Struc, and P. Peer. 2018. End-to-End Iris Segmentation Using U-Net. In IEEE International Work Conference on Bioinspired Intelligence (IWOBI). 1–6.
- Luo et al. (2021) Z. Luo, J. Li, and Y. Zhu. 2021. A Deep Feature Fusion Network Based on Multiple Attention Mechanisms for Joint Iris-Periocular Biometric Recognition. IEEE Signal Processing Letters 28 (2021), 1060–1064.
- Ma et al. (2003) L. Ma, T. Tan, Y. Wang, and D. Zhang. 2003. Personal identification based on iris texture analysis. IEEE Trans. Pattern Anal. Mach. Intell. 25, 12 (2003), 1519–1533.
- Menotti et al. (2015) D. Menotti, G. Chiachia, A. Pinto, W. R. Schwartz, H. Pedrini, A. X. Falcão, and A. Rocha. 2015. Deep Representations for Iris, Face, and Fingerprint Spoofing Detection. IEEE Transactions on Information Forensics and Security 10, 4 (2015).
- Minaee et al. (2016) S. Minaee, A. Abdolrashidi, and Y. Wang. 2016. An experimental study of deep convolutional features for iris recognition. In IEEE Signal Processing in Medicine and Biology Symposium (SPMB). 1–6.
- Mishra et al. (2019) S. Mishra, P. Liang, A. Czajka, Danny Z. Chen, and X. Hu. 2019. CC-NET: Image Complexity Guided Network Compression for Biomedical Image Segmentation. In IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019). 57–60.
- Monro et al. (2007) D. Monro, S. Rakshit, and D. Zhang. 2007. DCT-Based Iris Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 29, 4 (2007), 586–595.
- Moreira et al. (2019) D. Moreira, M. Trokielewicz, A. Czajka, K. Bowyer, and P. Flynn. 2019. Performance of Humans in Iris Recognition: The Impact of Iris Condition and Annotation-Driven Verification. In IEEE Winter Conference on Applications of Computer Vision (WACV). 941–949.
- Mostofa et al. (2021) M. Mostofa, S. Mohamadi, J. Dawson, and N. Nasrabadi. 2021. Deep GAN-Based Cross-Spectral Cross-Resolution Iris Recognition. IEEE Transactions on Biometrics, Behavior, and Identity Science 3, 4 (2021), 443–463.
- Muron and Pospisil (2000) A. Muron and J. Pospisil. 2000. The human iris structure and its usages. In Acta Univ. Palacki. Physica, Vol. 39. Acta Universitatis, 87–95.
- Nguyen et al. (2017a) K. Nguyen, C. Fookes, R. Jillela, S. Sridharan, and A. Ross. 2017a. Long range iris recognition: A survey. Pattern Recognition 72 (2017), 123–143.
- Nguyen et al. (2017b) K. Nguyen, C. Fookes, A. Ross, and S. Sridharan. 2017b. Iris recognition with Off-the-Shelf CNN Features: A deep learning perspective. IEEE Access 6 (2017), 18848–18855. Invited Paper.
- Nguyen et al. (2020) K. Nguyen, C. Fookes, and S. Sridharan. 2020. Constrained Design of Deep Iris Networks. IEEE Transactions on Image Processing 29 (2020), 7166–7175.
- Nguyen et al. (2011) K. Nguyen, C. Fookes, S. Sridharan, and S. Denman. 2011. Quality-Driven Super-Resolution for Less Constrained Iris Recognition at a Distance and on the Move. IEEE Transactions on Information Forensics and Security 6, 4 (2011).
- Nguyen et al. (2022) K. Nguyen, C. Fookes, S. Sridharan, and A. Ross. 2022. Complex-valued Iris Recognition Network. IEEE Trans. Pattern Anal. Mach. Intell. (2022).
- Nguyen et al. (2018) K. Nguyen, C. Fookes, S. Sridharan, M. Tistarelli, and M. Nixon. 2018. Super-resolution for biometrics: A comprehensive survey. Pattern Recognition 78 (2018), 23–42.
- Nguyen et al. (2012) K. Nguyen, S. Sridharan, S. Denman, and C. Fookes. 2012. Feature-domain super-resolution framework for Gabor-based face and iris recognition. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 2642–2649.
- Nie et al. (2014) L. Nie, A. Kumar, and S. Zhan. 2014. Periocular Recognition Using Unsupervised Convolutional RBM Feature Learning. In Int. Conf. on Pattern Recognition (ICPR). 399–404.
- Nigam et al. (2015) I. Nigam, M. Vatsa, and R. Singh. 2015. Ocular biometrics: A survey of modalities and fusion approaches. Information Fusion 26 (2015), 1–35.
- NIST (2021) NIST. 2021. IEG: Iris Examiner Training Discussion: https://www.nist.gov/itl/iad/image-group/ieg-iris-examiner-training-discussion.
- Odinokikh et al. (2019) G. Odinokikh, M. Korobkin, I. Solomatin, I. Efimov, and A. Fartukov. 2019. Iris Feature Extraction and Matching Method for Mobile Biometric Applications. In IEEE Int. Conf. on Biometrics (ICB). 1–6.
- Omelina et al. (2021) L. Omelina, J. Goga, J. Pavlovicova, M. Oravec, and B. Jansen. 2021. A survey of iris datasets. Image and Vision Computing 108 (2021), 104109.
- Ortega et al. (2010) J. Ortega, J. Fierrez, F. Alonso, J. Galbally, M. R. Freire, J. Gonzalez, C. Garcia, J. Alba, E. Gonzalez-Agulla, E. Otero, S. Garcia-Salicetti, L. Allano, B. Ly-Van, B. Dorizzi, J. Kittler, T. Bourlai, N. Poh, F. Deravi, Ming N. R. Ng, M. Fairhurst, J. Hennebert, A. Humm, M. Tistarelli, L. Brodo, J. Richiardi, A. Drygajlo, H. Ganster, F. M. Sukno, S. Pavani, A. Frangi, L. Akarun, and A. Savran. 2010. The Multiscenario Multienvironment BioSecure Multimodal Database (BMDB). IEEE Trans. Pattern Anal. Mach. Intell. 32, 6 (2010), 1097–1111.
- Park et al. (2011) U. Park, R. Jillela, A. Ross, and A. Jain. 2011. Periocular Biometrics in the Visible Spectrum. IEEE Transactions on Information Forensics and Security 6, 1 (2011), 96–106.
- Parzianello and Czajka (2022) L. Parzianello and A. Czajka. 2022. Saliency-Guided Textured Contact Lens-Aware Iris Recognition. In IEEE Workshop on Applications of Computer Vision (WACV). 330–337.
- Peng et al. (2020) H. Peng, B. Li, D. He, and J. Wang. 2020. End-to-End Anti-Attack Iris Location Based on Lightweight Network. In IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA). 821–827.
- Phillips et al. (2010) P. J. Phillips, W. T. Scruggs, A. J. O’Toole, P. Flynn, K. Bowyer, C. L. Schott, and M. Sharpe. 2010. FRVT 2006 and ICE 2006 Large-Scale Experimental Results. IEEE Trans. Pattern Anal. Mach. Intell. 32, 5 (2010), 831–846.
- Planet Biometrics (2017) Planet Biometrics. 2017. Homeland Advanced Recognition Technology (HART). http://www.planetbiometrics.com/article-details/i/5598/desc/dhs-launches-rfp-for-hart/
- Proença (2013) H. Proença. 2013. Iris Recognition in the Visible Wavelength. Springer, UK, 151–169.
- Proença and Alexandre (2005) H. Proença and L. Alexandre. 2005. UBIRIS: A Noisy Iris Image Database. In Image Analysis and Processing – ICIAP 2005. Springer, Germany, 970–977.
- Proença et al. (2010) H. Proença, S. Filipe, R. Santos, J. Oliveira, and L. Alexandre. 2010. The UBIRIS.v2: A Database of Visible Wavelength Iris Images Captured On-the-Move and At-a-Distance. IEEE Trans. Pattern Anal. Mach. Intell. 32, 8 (2010), 1529–1535.
- Proença and Neves (2017) H. Proença and J. C. Neves. 2017. IRINA: Iris Recognition (Even) in Inaccurately Segmented Data. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 6747–6756.
- Proença (2010) H. Proença. 2010. Iris Recognition: On the Segmentation of Degraded Images Acquired in the Visible Wavelength. IEEE Trans. Pattern Anal. Mach. Intell. 32, 8 (2010), 1502–1516.
- Proença and Neves (2018) H. Proença and J. Neves. 2018. Deep-PRWIS: Periocular Recognition Without the Iris and Sclera Using Deep Learning Frameworks. IEEE Transactions on Information Forensics and Security 13, 4 (2018), 888–896.
- Proença and Neves (2019) H. Proença and J. C. Neves. 2019. Segmentation-Less and Non-Holistic Deep-Learning Frameworks for Iris Recognition. In IEEE Int. Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW). 2296–2305.
- Quinn et al. (2022) G. Quinn, J. Matey, P. Grother, and E. Watters. 2022. Statistics of Visual Features in the Human Iris.
- Ribeiro et al. (2019) E. Ribeiro, A. Uhl, and F. Alonso-Fernandez. 2019. Iris super-resolution using CNNs: is photo-realism important to iris recognition? IET Biometrics 8 (2019), 69–78.
- Ribeiro et al. (2017) E. Ribeiro, A. Uhl, F. Alonso-Fernandez, and R. A. Farrugia. 2017. Exploring deep learning image super-resolution for iris recognition. In 25th European Signal Processing Conference (EUSIPCO). 2176–2180.
- Ronneberger et al. (2015) O. Ronneberger, P. Fischer, and T. Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI) (LNCS, Vol. 9351). Springer, 234–241.
- Saleh (2018) I. Saleh. 2018. Internet of Things (IoT): Concepts, Issues, Challenges and Perspectives. 1–26.
- Sansola (2015) A. Sansola. 2015. Postmortem iris recognition and its application in human identification. Master’s thesis. Boston University, USA.
- Sardar et al. (2020) M. Sardar, S. Banerjee, and S. Mitra. 2020. Iris Segmentation Using Interactive Deep Learning. IEEE Access 8 (2020), 219322–219330.
- Sauerwein et al. (2017) K. Sauerwein, T. B. Saul, D. W. Steadman, and C. B. Boehnen. 2017. The Effect of Decomposition on the Efficacy of Biometrics for Positive Identification. Journal of Forensic Sciences 62, 6 (2017), 1599–1602.
- Schlett et al. (2018) T. Schlett, C. Rathgeb, and C. Busch. 2018. Multi-spectral Iris Segmentation in Visible Wavelengths. In IEEE Int. Conf. on Biometrics (ICB). 190–194.
- Schroff et al. (2015) F. Schroff, D. Kalenichenko, and J. Philbin. 2015. FaceNet: A unified embedding for face recognition and clustering. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 815–823.
- Sequeira et al. (2014) A. Sequeira, H. Oliveira, J. Monteiro, J. Monteiro, and J. Cardoso. 2014. MobILive 2014 - Mobile Iris Liveness Detection Competition. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–6.
- Shah and Ross (2009) S. Shah and A. Ross. 2009. Iris Segmentation Using Geodesic Active Contours. IEEE Transactions on Information Forensics and Security 4, 4 (2009), 824–836.
- Sharma et al. (2014) A. Sharma, S. Verma, M. Vatsa, and R. Singh. 2014. On cross spectral periocular recognition. In IEEE Int. Conf. on Image Processing. 5007–5011.
- Sharma and Selwal (2021) D. Sharma and A. Selwal. 2021. On Data-Driven Approaches for Presentation Attack Detection in Iris Recognition Systems. In Recent Innovations in Computing. Springer, Singapore, 463–473.
- Sharma and Ross (2020) R. Sharma and A. Ross. 2020. D-NetPAD: An Explainable and Interpretable Iris Presentation Attack Detector. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–10.
- Sharma and Ross (2021) R. Sharma and A. Ross. 2021. Viability of Optical Coherence Tomography for Iris Presentation Attack Detection. In Int. Conf. on Pattern Recognition (ICPR). 6165–6172.
- Shen and Flynn (2013) F. Shen and P. Flynn. 2013. Using crypts as iris minutiae. In Biometric and Surveillance Technology for Human and Activity Identification X, Vol. 8712. 87120B.
- Song et al. (2019) L. Song, R. Shokri, and P. Mittal. 2019. Membership Inference Attacks Against Adversarially Robust Deep Learning Models. In IEEE Security and Privacy Workshops (SPW). 50–56.
- Stark et al. (2010) L. Stark, K. Bowyer, and S. Siena. 2010. Human perceptual categorization of iris texture patterns. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–7.
- Sun et al. (2014) Z. Sun, H. Zhang, T. Tan, and J. Wang. 2014. Iris Image Classification Based on Hierarchical Visual Codebook. IEEE Trans. Pattern Anal. Mach. Intell. 36, 6 (2014), 1120–1133.
- Sunder and Ross (2010) M. S. Sunder and A. Ross. 2010. Iris Image Retrieval Based on Macro-features. In Int. Conf. on Pattern Recognition (ICPR). 1318–1321.
- Tabassi et al. (2011) E. Tabassi, P. Grother, and W. Salamon. 2011. IREX II - IQCE - Iris Quality Calibration and Evaluation. Performance of Iris Image Quality Assessment Algorithms. NISTIR 7296 - http://iris.nist.gov/irex/ (2011).
- Talreja et al. (2022) V. Talreja, N. M. Nasrabadi, and M. C. Valenti. 2022. Attribute-Based Deep Periocular Recognition: Leveraging Soft Biometrics to Improve Periocular Recognition. In IEEE Winter Conference on Applications of Computer Vision (WACV).
- Tan and Kumar (2012) C.-W. Tan and A. Kumar. 2012. Unified Framework for Automated Iris Segmentation Using Distantly Acquired Face Images. IEEE Transactions on Image Processing 21, 9 (2012), 4068–4079.
- Tan et al. (2010) T. Tan, Z. He, and Z. Sun. 2010. Efficient and robust segmentation of noisy iris images for non-cooperative iris recognition. Image and Vision Computing 28, 2 (2010), 223–230.
- Tann et al. (2019) H. Tann, H. Zhao, and S. Reda. 2019. A Resource-Efficient Embedded Iris Recognition System Using Fully Convolutional Networks. ACM Journal on Emerging Technologies in Computing Systems 16, 1 (2019), 1–23.
- Tapia et al. (2022) J. Tapia, S. Gonzalez, and C. Busch. 2022. Iris Liveness Detection Using a Cascade of Dedicated Deep Learning Networks. IEEE Transactions on Information Forensics and Security 17 (2022), 42–52.
- ISO/IEC 30107-1 (2016) ISO/IEC 30107-1. 2016. Information technology — Biometric presentation attack detection — Part 1: Framework.
- ISO/IEC 29794-6 (2015) ISO/IEC 29794-6. 2015. Information technology — Biometric sample quality — Part 6: Iris image data.
- Tinsley et al. (2021) P. Tinsley, A. Czajka, and P. Flynn. 2021. This Face Does Not Exist… But It Might Be Yours! Identity Leakage in Generative Models. In IEEE Winter Conference on Applications of Computer Vision (WACV). 1319–1327.
- Trokielewicz and Czajka (2018) M. Trokielewicz and A. Czajka. 2018. Data-driven segmentation of post-mortem iris images. In International Workshop on Biometrics and Forensics (IWBF). IEEE, Italy, 1–7.
- Trokielewicz et al. (2016a) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2016a. Human iris recognition in post-mortem subjects: Study and database. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). IEEE, USA, 1–6.
- Trokielewicz et al. (2016b) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2016b. Post-mortem human iris recognition. In IEEE Int. Conf. on Biometrics (ICB). IEEE, Sweden, 1–6.
- Trokielewicz et al. (2018) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2018. Presentation Attack Detection for Cadaver Iris. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–10.
- Trokielewicz et al. (2019) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2019. Iris Recognition After Death. IEEE Transactions on Information Forensics and Security 14, 6 (June 2019), 1501–1514.
- Trokielewicz et al. (2020) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2020. Post-Mortem Iris Recognition Resistant to Biological Eye Decay Processes. In IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, USA, 1–8.
- Trokielewicz et al. (2020) M. Trokielewicz, A. Czajka, and P. Maciejewicz. 2020. Post-mortem iris recognition with deep-learning-based image segmentation. Image and Vision Computing 94 (2020), 103866.
- Tygert et al. (2016) M. Tygert, J. Bruna, S. Chintala, Y. LeCun, S. Piantino, and A. Szlam. 2016. A Mathematical Motivation for Complex-valued Convolutional Networks. Neural Computation 28 (2016), 815–825.
- Unique Identification Authority of India (2021) Unique Identification Authority of India. 2021. AADHAAR: http://uidai.gov.in.
- Vatsa et al. (2008) M. Vatsa, R. Singh, and A. Noore. 2008. Improving Iris Recognition Performance Using Segmentation, Quality Enhancement, Match Score Fusion, and Indexing. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 38, 4 (2008), 1021–1035.
- Wang et al. (2020a) C. Wang, J. Muhammad, Y. Wang, Z. He, and Z. Sun. 2020a. Towards Complete and Accurate Iris Segmentation Using Deep Multi-Task Attention Network for Non-Cooperative Iris Recognition. IEEE Transactions on Information Forensics and Security 15 (2020), 2944–2959.
- Wang and Sun (2020) C. Wang and Z. Sun. 2020. A Benchmark for Iris Segmentation. Journal of Computer Research and Development 57, 2 (2020), 395–412.
- Wang et al. (2020b) C. Wang, Y. Wang, B. Xu, Y. He, Z. Dong, and Z. Sun. 2020b. A Lightweight Multi-Label Segmentation Network for Mobile Iris Biometrics. In IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP). 1006–1010.
- Wang et al. (2019b) C. Wang, Y. Zhu, Y. Liu, R. He, and Z. Sun. 2019b. Joint Iris Segmentation and Localization Using Deep Multi-task Learning Framework. arXiv:1901.11195 [cs.CV]
- Wang and Kumar (2019) K. Wang and A. Kumar. 2019. Toward More Accurate Iris Recognition Using Dilated Residual Features. IEEE Transactions on Information Forensics and Security 14, 12 (2019), 3233–3245.
- Wang and Kumar (2021) K. Wang and A. Kumar. 2021. Periocular-Assisted Multi-Feature Collaboration for Dynamic Iris Recognition. IEEE Transactions on Information Forensics and Security 16 (2021), 866–879.
- Wang et al. (2020c) L. Wang, K. Zhang, M. Ren, Y. Wang, and Z. Sun. 2020c. Recognition Oriented Iris Image Quality Assessment in the Feature Space. In IEEE Int. Joint Conf. on Biometrics (IJCB). 1–9.
- Wang et al. (2019a) X. Wang, H. Zhang, J. Liu, L. Xiao, Z. He, L. Liu, and P. Duan. 2019a. Iris Image Super Resolution Based on GANs with Adversarial Triplets. In Chinese Conference on Biometric Recognition (LNCS). Switzerland, 346–353.
- Wang et al. (2021) Z. Wang, J. Chai, and S. Xia. 2021. Realtime and Accurate 3D Eye Gaze Capture with DCNN-Based Iris and Pupil Segmentation. IEEE Transactions on Visualization and Computer Graphics 27, 1 (2021), 190–203.
- Wei et al. (2007) Z. Wei, T. Tan, and Z. Sun. 2007. Nonlinear Iris Deformation Correction Based on Gaussian Model. In Advances in Biometrics. Springer, Germany, 780–789.
- Wu and Zhao (2019) X. Wu and L. Zhao. 2019. Study on Iris Segmentation Algorithm Based on Dense U-Net. IEEE Access 7 (2019), 123959–123968.
- Xiao et al. (2013) L. Xiao, Z. Sun, R. He, and T. Tan. 2013. Coupled feature selection for cross-sensor iris recognition. In IEEE Int. Conf. on Biometrics: Theory Applications and Systems (BTAS). 1–6.
- Yu and Koltun (2016) F. Yu and V. Koltun. 2016. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv:1511.07122 [cs.CV]
- Yadav et al. (2019b) D. Yadav, N. Kohli, M. Vatsa, R. Singh, and A. Noore. 2019b. Detecting Textured Contact Lens in Uncontrolled Environment Using DensePAD. In IEEE Int. Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW).
- Yadav et al. (2018) D. Yadav, N. Kohli, S. Yadav, M. Vatsa, R. Singh, and A. Noore. 2018. Iris Presentation Attack via Textured Contact Lens in Unconstrained Environment. In IEEE Winter Conference on Applications of Computer Vision (WACV). 503–511.
- Yadav et al. (2019a) S. Yadav, C. Chen, and A. Ross. 2019a. Synthesizing Iris Images Using RaSGAN With Application in Presentation Attack Detection. In IEEE Int. Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW). 2422–2430.
- Yadav et al. (2020) S. Yadav, C. Chen, and A. Ross. 2020. Relativistic Discriminator: A One-Class Classifier for Generalized Iris Presentation Attack Detection. In IEEE Winter Conference on Applications of Computer Vision (WACV). 2624–2633.
- Yadav and Ross (2021) S. Yadav and A. Ross. 2021. CIT-GAN: Cyclic Image Translation Generative Adversarial Network With Application in Iris Presentation Attack Detection. In IEEE Winter Conference on Applications of Computer Vision (WACV). 2411–2420.
- Yambay et al. (2017) D. Yambay, B. Becker, N. Kohli, D. Yadav, A. Czajka, K. Bowyer, S. Schuckers, R. Singh, M. Vatsa, A. Noore, D. Gragnaniello, C. Sansone, L. Verdoliva, L. He, Y. Ru, H. Li, N. Liu, Z. Sun, and T. Tan. 2017. LivDet iris 2017 — Iris liveness detection competition 2017. In IEEE Int. Joint Conf. on Biometrics (IJCB). 733–741.
- Yan et al. (2021) Z. Yan, L. He, Y. Wang, Z. Sun, and T. Tan. 2021. Flexible Iris Matching Based on Spatial Feature Reconstruction. IEEE Transactions on Biometrics, Behavior, and Identity Science (2021), 1–1.
- Yang et al. (2021) K. Yang, Z. Xu, and J. Fei. 2021. DualSANet: Dual Spatial Attention Network for Iris Recognition. In IEEE Winter Conference on Applications of Computer Vision (WACV). 888–896.
- Yuan et al. (2020) Y. Yuan, W. Chen, Y. Yang, and Z. Wang. 2020. In Defense of the Triplet Loss Again: Learning Robust Person Re-Identification with Fast Approximated Triplet Loss and Label Distillation. In IEEE Int. Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW). 1454–1463.
- Zanlorensi et al. (2020) L. A. Zanlorensi, H. Proença, and D. Menotti. 2020. Unconstrained Periocular Recognition: Using Generative Deep Learning Frameworks for Attribute Normalization. In IEEE Int. Conf. on Image Processing. 1361–1365.
- Zhang et al. (2021) H. Zhang, Y. Bai, H. Zhang, J. Liu, X. Li, and Z. He. 2021. Local Attention and Global Representation Collaborating for Fine-grained Classification. In Int. Conf. on Pattern Recognition (ICPR). 10658–10665.
- Zhang et al. (2010) H. Zhang, Z. Sun, and T. Tan. 2010. Contact Lens Detection Based on Weighted LBP. In Int. Conf. on Pattern Recognition (ICPR). 4279–4282.
- Zhang et al. (2016a) Q. Zhang, H. Li, Z. He, and Z. Sun. 2016a. Image Super-Resolution for Mobile Iris Recognition. In Biometric Recognition. Springer, Cham, 399–406.
- Zhang et al. (2016b) Q. Zhang, H. Li, Z. Sun, Z. He, and T. Tan. 2016b. Exploring complementary features for iris recognition on mobile devices. In IEEE Int. Conf. on Biometrics (ICB). 1–8.
- Zhang et al. (2018) Q. Zhang, H. Li, Z. Sun, and T. Tan. 2018. Deep Feature Fusion for Iris and Periocular Biometrics on Mobile Devices. IEEE Transactions on Information Forensics and Security 13, 11 (2018), 2897–2912.
- Zhang et al. (2015) Q. Zhang, H. Li, M. Zhang, Z. He, Z. Sun, and T. Tan. 2015. Fusion of Face and Iris Biometrics on Mobile Devices Using Near-infrared Images. In Biometric Recognition. Springer, Cham, 569–578.
- Zhang et al. (2019) W. Zhang, X. Lu, Y. Gu, Y. Liu, X. Meng, and J. Li. 2019. A Robust Iris Segmentation Scheme Based on Improved U-Net. IEEE Access 7 (2019), 85082–85089.
- Zhao et al. (2017) H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. 2017. Pyramid Scene Parsing Network. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 6230–6239.
- Zhao et al. (2019) T. Zhao, Y. Liu, G. Huo, and X. Zhu. 2019. A Deep Learning Iris Recognition Method Based on Capsule Network Architecture. IEEE Access 7 (2019), 49691–49701.
- Zhao and Kumar (2017a) Z. Zhao and A. Kumar. 2017a. Accurate Periocular Recognition Under Less Constrained Environment Using Semantics-Assisted CNN. IEEE Transactions on Information Forensics and Security 12, 5 (2017), 1017–1030.
- Zhao and Kumar (2017b) Z. Zhao and A. Kumar. 2017b. Towards More Accurate Iris Recognition Using Deeply Learned Spatially Corresponding Features. In IEEE Int. Conf. on Computer Vision (ICCV). 3829–3838.
- Zhao and Kumar (2018) Z. Zhao and A. Kumar. 2018. Improving Periocular Recognition by Explicit Attention to Critical Regions in Deep Neural Network. IEEE Transactions on Information Forensics and Security 13, 12 (2018), 2937–2952.
- Zhao and Kumar (2019) Z. Zhao and A. Kumar. 2019. A deep learning based unified framework to detect, segment and recognize irises using spatially corresponding features. Pattern Recognition 93 (2019), 546 – 557.
- Zhou et al. (2016) B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. 2016. Learning Deep Features for Discriminative Localization. In IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR). 2921–2929.