Detecting multi-scale faces using attention-based feature fusion and smoothed context enhancement

Lei Shi, Xiang Xu, Ioannis A. Kakadiaris

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Though tremendous strides have been made, face detection remains a challenging problem due to scale variance. In this paper, we propose a smoothed attention network, dubbed SANet, for performing scale-invariant face detection by taking advantage of feature fusion and context enhancement. To reduce the noise in the fused features at different levels, an Attention-guided Feature Fusion Module (AFFM) is designed. In addition, an exhaustive analysis of the role of attention mechanisms on performance is conducted, considering channel-wise attention, spatial-wise attention, and their combinations. To enrich the contextual information using dilated convolution while avoiding the gridding artifacts that dilated convolution produces, we propose a Smoothed Context Enhancement Module (SCEM). Our method achieves state-of-the-art results on the UFDD dataset and comparable performance on the WIDER FACE, FDDB, PASCAL Faces, and AFW datasets.
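To illustrate the attention-guided fusion idea described above, the following is a minimal numpy sketch (not the authors' SANet implementation): a channel-wise gate computed from pooled activations and a spatial saliency gate are applied to the sum of two feature-pyramid levels, so noisy responses in the fused map are suppressed. All function names, shapes, and the direct-gating simplification (a real module would learn a small MLP and convolution for the gates) are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Squeeze spatial dims to a per-channel statistic,
    # then produce a per-channel gate in (0, 1). A learned module would
    # pass the pooled vector through a small MLP; we gate on the pooled
    # activations directly for brevity.
    pooled = feat.mean(axis=(1, 2))           # (C,)
    weights = sigmoid(pooled)                 # (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat):
    # Collapse channels to a single spatial saliency map in (0, 1)
    # and rescale every channel by it.
    saliency = sigmoid(feat.mean(axis=0))     # (H, W)
    return feat * saliency[None, :, :]

def attention_guided_fusion(shallow, deep):
    # Fuse two pyramid levels, then apply channel-wise and spatial
    # attention in sequence, loosely mirroring the AFFM idea of using
    # attention to reduce noise in the fused features.
    fused = shallow + deep                    # assume deep is already upsampled
    return spatial_attention(channel_attention(fused))

rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 16, 16))
deep = rng.standard_normal((8, 16, 16))
out = attention_guided_fusion(shallow, deep)
print(out.shape)  # (8, 16, 16)
```

Because both gates lie in (0, 1), every activation in the output is attenuated relative to the raw fused map, which is the noise-suppression effect the fusion module is after.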

Original language: English (US)
Article number: 9091528
Pages (from-to): 235-244
Number of pages: 10
Journal: IEEE Transactions on Biometrics, Behavior, and Identity Science
Issue number: 3
State: Published - Jul 2020


Keywords

  • Context enhancement
  • Face detection
  • Feature fusion
  • Multi-scale

ASJC Scopus subject areas

  • Instrumentation
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Artificial Intelligence


