TY - JOUR
T1 - Detecting multi-scale faces using attention-based feature fusion and smoothed context enhancement
AU - Shi, Lei
AU - Xu, Xiang
AU - Kakadiaris, Ioannis A.
N1 - Funding Information:
This work was supported by the U.S. Department of Homeland Security under Grant 2017-ST-BTI-0001-0201. This grant is awarded to the Borders, Trade, and Immigration (BTI) Institute: A DHS Center of Excellence led by the University of Houston, and includes support for the project EDGE. This article was recommended for publication by Associate Editor Y. Wang upon evaluation of the reviewers' comments.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Though tremendous strides have been made in face detection, it remains a challenging problem due to scale variance. In this paper, we propose a smoothed attention network, dubbed SANet, that performs scale-invariant face detection by taking advantage of feature fusion and context enhancement. To reduce the noise in the fused features at different levels, an Attention-guided Feature Fusion Module (AFFM) is designed. In addition, an exhaustive analysis of the effect of the attention mechanism on performance is conducted, considering channel-wise attention, spatial-wise attention, and their combinations. To enrich contextual information using dilated convolution while avoiding the gridding artifacts that dilated convolution produces, we propose a Smoothed Context Enhancement Module (SCEM). Our method achieves state-of-the-art results on the UFDD dataset and comparable performance on the WIDER FACE, FDDB, PASCAL Faces, and AFW datasets.
AB - Though tremendous strides have been made in face detection, it remains a challenging problem due to scale variance. In this paper, we propose a smoothed attention network, dubbed SANet, that performs scale-invariant face detection by taking advantage of feature fusion and context enhancement. To reduce the noise in the fused features at different levels, an Attention-guided Feature Fusion Module (AFFM) is designed. In addition, an exhaustive analysis of the effect of the attention mechanism on performance is conducted, considering channel-wise attention, spatial-wise attention, and their combinations. To enrich contextual information using dilated convolution while avoiding the gridding artifacts that dilated convolution produces, we propose a Smoothed Context Enhancement Module (SCEM). Our method achieves state-of-the-art results on the UFDD dataset and comparable performance on the WIDER FACE, FDDB, PASCAL Faces, and AFW datasets.
KW - Context enhancement
KW - Face detection
KW - Feature fusion
KW - Multi-scale
UR - http://www.scopus.com/inward/record.url?scp=85119184138&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119184138&partnerID=8YFLogxK
U2 - 10.1109/TBIOM.2020.2993242
DO - 10.1109/TBIOM.2020.2993242
M3 - Article
AN - SCOPUS:85119184138
VL - 2
SP - 235
EP - 244
JO - IEEE Transactions on Biometrics, Behavior, and Identity Science
JF - IEEE Transactions on Biometrics, Behavior, and Identity Science
SN - 2637-6407
IS - 3
M1 - 9091528
ER -