In FaceRecBench, we evaluate MLLMs on popular face recognition datasets using a similar protocol. As with a face recognition model in the verification scenario, the MLLM's task is to decide from two face images whether they belong to the same person. To this end, we design a prompt template that feeds the MLLM two face images and asks whether they belong to the same person.
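As a rough illustration of this verification protocol, the Python sketch below scores a model on labeled image pairs. It is not the FaceRecBench implementation: the prompt wording, the `ask_mllm` callable (standing in for whatever client sends a prompt and two images to an MLLM), and the yes/no parsing rule are all assumptions made for the example.

```python
import re
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical prompt wording; the template actually used in FaceRecBench may differ.
PROMPT = (
    "You are given two face images. "
    "Do they belong to the same person? Answer with 'yes' or 'no'."
)

# (path to image A, path to image B, ground-truth same-person label)
Pair = Tuple[str, str, bool]


def parse_answer(reply: str) -> Optional[bool]:
    """Map a free-text reply to True (same), False (different), or None (unparseable)."""
    match = re.search(r"\b(yes|no)\b", reply.lower())
    if match is None:
        return None
    return match.group(1) == "yes"


def verification_accuracy(
    pairs: Iterable[Pair],
    ask_mllm: Callable[[str, str, str], str],
) -> float:
    """Fraction of pairs answered correctly; unparseable replies count as errors."""
    total = 0
    correct = 0
    for img_a, img_b, same in pairs:
        # ask_mllm is an assumed interface: (prompt, image A, image B) -> reply text.
        reply = ask_mllm(PROMPT, img_a, img_b)
        total += 1
        if parse_answer(reply) == same:
            correct += 1
    return correct / total if total else 0.0
```

Counting unparseable replies as errors is one possible design choice; an evaluation could instead report them separately, since a refusal to answer is different from a wrong answer.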
The following table shows the benchmark results of several MLLMs on face recognition datasets, as reported in our paper:
We also compare the performance of different MLLMs on the Racial Faces in-the-Wild (RFW) dataset in the following table:
[Source Code] The source code of our FaceRecBench is publicly available: https://github.com/idiap/facerecbench
@article{facerecbench2025,
author = {Hatef Otroshi Shahreza and S{\'e}bastien Marcel},
title = {Benchmarking Multimodal Large Language Models for Face Recognition},
journal = {arXiv preprint arXiv:2510.14866},
year = {2025}
}