Weakly Supervised 3D Face Reconstruction with Joint Spatial and Frequency Domain Information
DOI:
https://doi.org/10.54691/39t56g21
Keywords:
Deep Learning; 3D Deformable Model; 3D Face Reconstruction; Weakly Supervised Method; Discrete Fourier Transform.
Abstract
3D face reconstruction is an important research direction in computer vision; its goal is to recover a 3D face model from a single face image. How to reconstruct a highly realistic 3D face in the absence of real 3D face data has become an active research topic in recent years. Existing reconstruction algorithms usually rely on 3D labels generated from large numbers of 2D face images as training data, but inaccurate labels seriously degrade reconstruction quality. To address this, the paper proposes a weakly supervised 3D face reconstruction method that jointly uses decoupled spatial-domain and frequency-domain information. The main idea is to construct a multi-level loss function from weakly supervised information extracted in the spatial domain and, in the frequency domain, to separate the frequency-domain information of the input image and the rendered image and minimize the difference between the two. The method combines deep learning with a 3D deformable model to reconstruct 3D models with high-quality texture and shape from only a single face image. Quantitative experiments on the AFLW2000-3D and MICC Florence datasets show that the normalized average error in the small-pose interval is as low as 2.42%, and the face reconstruction accuracy in the outdoor scene is 0.98 ± 0.22 mm. Qualitative experiments on the MoFa-test and MICA datasets show that our method outperforms other state-of-the-art reconstruction methods under varying poses, lighting, and expressions.
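The abstract does not give the exact form of the loss, so the following is only a minimal sketch of how a spatial-domain term and a frequency-domain term between an input image and a rendered image could be combined. It assumes small grayscale images stored as nested lists, uses a naive discrete Fourier transform for self-containment (a practical implementation would use an FFT library), and the names `dft2`, `spatial_loss`, `frequency_loss`, `joint_loss`, and the weight `w_freq` are hypothetical, not taken from the paper.

```python
import cmath


def dft2(img):
    """Naive 2D discrete Fourier transform with orthonormal scaling.

    `img` is a list of rows of real numbers (a small grayscale image).
    """
    h, w = len(img), len(img[0])
    scale = (h * w) ** 0.5
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            out[u][v] = total / scale
    return out


def spatial_loss(input_img, rendered_img):
    """Mean absolute pixel difference (an assumed spatial-domain term)."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(input_img, rendered_img)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def frequency_loss(input_img, rendered_img):
    """Mean magnitude of the difference between the two spectra."""
    f_in, f_re = dft2(input_img), dft2(rendered_img)
    diffs = [abs(a - b)
             for row_a, row_b in zip(f_in, f_re)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def joint_loss(input_img, rendered_img, w_freq=0.1):
    """Weighted sum of both terms; w_freq is an illustrative weight only."""
    return (spatial_loss(input_img, rendered_img)
            + w_freq * frequency_loss(input_img, rendered_img))


if __name__ == "__main__":
    target = [[0.0, 1.0], [1.0, 0.0]]    # stands in for the input photo
    rendered = [[0.0, 0.0], [0.0, 0.0]]  # stands in for the rendered face
    print(joint_loss(target, rendered))
```

Both terms are zero when the rendered image matches the input exactly. The frequency term compares complex spectra, so it penalizes mismatches in both amplitude and phase.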
License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.