TY - JOUR
T1 - Joint prototype and metric learning for image set classification
T2 - Application to video face identification
AU - Leng, Mengjun
AU - Moutafis, Panagiotis
AU - Kakadiaris, Ioannis A.
N1 - Funding Information:
This research was funded in part by the U.S. Army Research Laboratory (W911NF-13-1-0127) and the UH Hugh Roy and Lillie Cranz Cullen Endowment Fund. All statements of fact, opinion or conclusions contained herein are those of the authors and should not be construed as representing the official views or policies of the sponsors.
Publisher Copyright:
© 2016 Elsevier B.V.
PY - 2017/2/1
Y1 - 2017/2/1
AB - In this paper, we address the problem of image set classification, where each set contains a different number of images acquired from the same subject. In most of the existing literature, each image set is modeled using all of its available samples. As a result, the corresponding time and storage costs are high. To address this problem, we propose a joint prototype and metric learning approach. The prototypes are learned to represent each gallery image set using fewer samples without affecting the recognition performance. A Mahalanobis metric is learned simultaneously to measure the similarity between sets more accurately. In particular, each gallery set is represented as a regularized affine hull spanned by the learned prototypes. The set-to-set distance is optimized by updating the prototypes and the Mahalanobis metric in an alternating manner. To highlight the importance of representing image sets using fewer samples, we analyze the test time complexity with respect to the number of images used per set. Experimental results on the YouTube Celebrity, YouTube Faces, and ETH-80 datasets illustrate the efficiency of the proposed approach on the tasks of video face recognition and object categorization.
KW - Image set classification
KW - Metric learning
KW - Prototype learning
KW - Video face recognition
UR - http://www.scopus.com/inward/record.url?scp=84978910790&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978910790&partnerID=8YFLogxK
U2 - 10.1016/j.imavis.2016.06.005
DO - 10.1016/j.imavis.2016.06.005
M3 - Article
AN - SCOPUS:84978910790
VL - 58
SP - 204
EP - 213
JO - Image and Vision Computing
JF - Image and Vision Computing
SN - 0262-8856
ER -