FairFaceGen
FairFaceGen is a dataset of synthetic faces created to study how synthetic data generation affects the performance and bias of face recognition (FR) models.
Description
The dataset contains about 22K identities (11K + 11K), built with prompt-based generation balanced across age, race, and gender attributes using the Flux.1-dev and Stable Diffusion v3.5 generators. Multiple variations per identity are then produced with Arc2Face and with IP-Adapter variants (SD15 and SDXL backbones, with FaceID or CLIP embeddings).
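As a rough illustration of what prompt-based balanced generation can look like, the sketch below enumerates every combination of age, race, and gender and assigns the same number of prompts to each. It is not the authors' pipeline: the attribute lists, the prompt template, and the function names `balanced_prompts` and `group_counts` are all assumptions made for this example.

```python
from itertools import product

# Hypothetical attribute vocabularies, chosen only for illustration.
# The categories actually used for FairFaceGen are not listed in this description.
AGES = ["young", "middle-aged", "elderly"]
RACES = ["Asian", "Black", "Indian", "White"]
GENDERS = ["man", "woman"]


def balanced_prompts(per_group):
    """Return `per_group` copies of a text prompt for every attribute combination."""
    prompts = []
    for age, race, gender in product(AGES, RACES, GENDERS):
        # Hypothetical prompt template, not the one used by the dataset authors.
        text = f"portrait photo of a person: {age}, {race}, {gender}"
        prompts.extend([text] * per_group)
    return prompts


def group_counts(prompts):
    """Count how many prompts fall into each demographic group."""
    counts = {}
    for text in prompts:
        counts[text] = counts.get(text, 0) + 1
    return counts


if __name__ == "__main__":
    prompts = balanced_prompts(per_group=2)
    print(len(prompts), "prompts over", len(group_counts(prompts)), "groups")
```

Because every group receives the same number of prompts, the resulting identity set is balanced by construction, at least with respect to the requested attributes. In practice each prompt would also need a distinct random seed so that repeated prompts yield different identities.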
Reference
@INPROCEEDINGS{fairfacegen_ijcb2025,
title = {Investigation of accuracy and bias in face recognition trained with synthetic data},
author = {Korshunov, Pavel and Kotwal, Ketan and Ecabert, Christophe and Vidit, Vidit and Mohammadi, Amir and Marcel, Sebastien},
booktitle = {2025 IEEE International Joint Conference on Biometrics (IJCB)},
pages = {1--10},
year = {2025},
organization = {IEEE}
}
Link: Investigation of accuracy and bias in face recognition trained with synthetic data