The following table compares FaceLLM with other MLLMs on the FaceXBench benchmark. The best-performing model in each category is shown in bold, and the best model among all MLLMs is highlighted in purple.
The performance of the FaceLLM models (FaceLLM-1B, FaceLLM-8B, and FaceLLM-38B) is compared in the following figures across different sub-tasks, including age estimation, gender prediction, race estimation, high-resolution face recognition, low-resolution face recognition, celebrity identification, face anti-spoofing, deepfake detection, attributes prediction, facial expression recognition, head pose estimation, face localization, crowd counting, face parsing, and face tools retrieval.
The source code of our experiments as well as the pretrained FaceLLM models are publicly available. We also release the FairFaceGPT dataset.
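For convenience, the snippet below sketches how a released checkpoint could be loaded in Python. It is only a sketch under assumptions: it presumes the checkpoints are distributed in a Hugging Face Transformers format with custom remote code, and the repository ID "idiap/FaceLLM-8B" is a placeholder rather than the official identifier; please refer to the released code and model cards for the actual usage.

# Minimal loading sketch (not the official usage). Assumes a Hugging Face
# Transformers layout with custom remote code; the repo ID is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "idiap/FaceLLM-8B"  # hypothetical repository ID; check the project page

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce GPU memory
    trust_remote_code=True,
).eval()
# Image-plus-question inference then follows the chat interface documented
# in the released code and on the model card.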
@article{facellm2025,
  author  = {Hatef Otroshi Shahreza and S{\'e}bastien Marcel},
  title   = {FaceLLM: A Multimodal Large Language Model for Face Understanding},
  journal = {arXiv preprint arXiv:2507.10300},
  year    = {2025}
}