Hand gesture recognition using high-density surface electromyography (HD-sEMG) has gained increasing attention recently due to its high spatio-temporal resolution. Convolutional neural networks (CNNs) have also recently been used to learn spatio-temporal features from instantaneous samples of HD-sEMG signals. Although a CNN learns features directly from the input signal, it has not been examined whether certain pre-processing techniques can further improve the classification accuracies reported in previous studies. Therefore, common pre-processing techniques were applied to a benchmark HD-sEMG dataset (CapgMyo DB-a), and their validation accuracies were compared. Monopolar, bipolar, rectified, common-average-referenced, and Laplacian spatially filtered configurations of the HD-sEMG signals were evaluated. The baseline monopolar HD-sEMG signals yielded higher prediction accuracies than the other signal configurations. These results discourage the use of extra pre-processing steps when using convolutional networks to classify instantaneous samples of HD-sEMG for gesture recognition.
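
For orientation, the signal configurations compared here can be sketched on a single instantaneous HD-sEMG frame, treated as a 2-D array of electrode values. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the frame is assumed to be a rows-by-columns electrode grid, the bipolar derivation is assumed to difference adjacent electrodes along one grid axis, and the Laplacian is assumed to be the four-neighbour form evaluated on interior electrodes only.

```python
import numpy as np


def rectify(frame):
    """Full-wave rectification: absolute value of every electrode sample."""
    return np.abs(frame)


def common_average_reference(frame):
    """Subtract the mean over all electrodes from each electrode."""
    return frame - frame.mean()


def bipolar(frame, axis=0):
    """Difference between adjacent electrodes along one grid axis.

    The choice of axis (i.e. the direction relative to the muscle
    fibres) is an assumption of this sketch.
    """
    return np.diff(frame, axis=axis)


def laplacian(frame):
    """Four-neighbour spatial Laplacian on interior electrodes.

    Each interior electrode has the mean of its up, down, left and
    right neighbours subtracted. Border electrodes are dropped, so the
    output is smaller than the input by two rows and two columns.
    """
    center = frame[1:-1, 1:-1]
    neighbours = (
        frame[:-2, 1:-1]    # electrode above
        + frame[2:, 1:-1]   # electrode below
        + frame[1:-1, :-2]  # electrode to the left
        + frame[1:-1, 2:]   # electrode to the right
    )
    return center - neighbours / 4.0


# Hypothetical example frame: a small 3 x 4 grid standing in for one
# monopolar instantaneous sample (the baseline configuration).
monopolar_frame = np.arange(12, dtype=float).reshape(3, 4) - 5.0
configurations = {
    "monopolar": monopolar_frame,
    "rectified": rectify(monopolar_frame),
    "car": common_average_reference(monopolar_frame),
    "bipolar": bipolar(monopolar_frame, axis=0),
    "laplacian": laplacian(monopolar_frame),
}
```

Note that the bipolar and Laplacian configurations shrink the grid, so a network fed these inputs would need its input shape adjusted accordingly.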