The goal of this work is to identify principles of designing a deep neural network for 3D face reconstruction from a single image. To make the evaluation simple, we generated a synthetic dataset and used it for evaluation. We conducted extensive experiments using an end-to-end face reconstruction algorithm using E2FAR and its variants, and analyzed the reason why it can be successfully applied for 3D face reconstruction. From the comparative studies, we conclude that feature aggregation from different layers is a key point to training better neural networks for 3D face reconstruction. Based on these observations, a face reconstruction feature aggregation network (FR-FAN) is proposed, which obtains significant improvements compared with baselines on the synthetic validation set. We evaluate our model on existing popular indoor and in-the-wild 2D-3D datasets. Extensive experiments demonstrate that FR-FAN performs 16.50% and 9.54% better than E2FAR on BU-3DFE and JNU-3D, respectively. Finally, the sensitivity analysis we performed on controlled datasets demonstrates that our designed network is robust to large variations of pose, illumination, and expressions.