TY - GEN
T1 - Webly Supervised Learning Meets Zero-shot Learning
T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
AU - Niu, Li
AU - Veeraraghavan, Ashok
AU - Sabharwal, Ashu
N1 - Funding Information:
This work is supported by NIH 7000000356 and NSF IIS-1652633. This work is also supported by the NGA NHARP program HM0476-15-1-0007.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/14
Y1 - 2018/12/14
N2 - Fine-grained image classification, which aims to distinguish subtle differences among subordinate categories, remains a very difficult task due to the high cost of annotating the enormous number of fine-grained categories. To cope with the scarcity of well-labeled training images, existing works mainly follow two research directions: 1) utilize freely available web images without human annotation; 2) annotate only some fine-grained categories and transfer the knowledge to other fine-grained categories, which falls into the scope of zero-shot learning (ZSL). However, both directions have their own drawbacks. For the first direction, the labels of web images are very noisy and the data distributions of web images and test images are considerably different. For the second direction, the performance gap between ZSL and traditional supervised learning is still very large. These drawbacks motivate us to design a new framework that jointly leverages both web data and auxiliary labeled categories to predict test categories that are not associated with any well-labeled training images. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed framework.
AB - Fine-grained image classification, which aims to distinguish subtle differences among subordinate categories, remains a very difficult task due to the high cost of annotating the enormous number of fine-grained categories. To cope with the scarcity of well-labeled training images, existing works mainly follow two research directions: 1) utilize freely available web images without human annotation; 2) annotate only some fine-grained categories and transfer the knowledge to other fine-grained categories, which falls into the scope of zero-shot learning (ZSL). However, both directions have their own drawbacks. For the first direction, the labels of web images are very noisy and the data distributions of web images and test images are considerably different. For the second direction, the performance gap between ZSL and traditional supervised learning is still very large. These drawbacks motivate us to design a new framework that jointly leverages both web data and auxiliary labeled categories to predict test categories that are not associated with any well-labeled training images. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed framework.
UR - http://www.scopus.com/inward/record.url?scp=85061186992&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061186992&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2018.00749
DO - 10.1109/CVPR.2018.00749
M3 - Conference contribution
AN - SCOPUS:85061186992
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 7171
EP - 7180
BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 June 2018 through 22 June 2018
ER -