FantasyID: A dataset for detecting digital manipulations of ID-documents

¹Idiap Research Institute, ²UNIL

[Figure: sample IDs and their templates for Arabic, Persian, French, and Russian]

Examples of different fantasy templates in the dataset. FantasyID contains 13 different templates in ten languages: Arabic, Chinese, Hindi, French, Persian, Portuguese, Russian, Turkish, Ukrainian, and English. The faces used are under permissive licenses, and all the textual information is fictional. We print the IDs on PVC cards and capture them with three different devices.

Abstract

Advancements in image generation have led to the availability of easy-to-use tools that malicious actors can use to create forged images. These tools pose a serious threat to the widespread Know Your Customer (KYC) applications, which therefore require robust systems for detecting forged Identity Documents (IDs). To facilitate the development of detection algorithms, in this paper we propose a novel publicly available (including for commercial use) dataset, FantasyID, which mimics real-world IDs without tampering with legal documents and which, unlike previous public datasets, does not contain generated faces or specimen watermarks. FantasyID contains ID cards with diverse design styles, languages, and faces of real people. To simulate a realistic KYC scenario, the cards from FantasyID were printed and captured with three different devices, constituting the bonafide class. We have emulated digital forgery/injection attacks that a malicious actor could perform to tamper with the IDs using existing generative tools. The current state-of-the-art forgery detection algorithms, such as TruFor, MMFusion, UniFD, and FatFormer, are challenged by the FantasyID dataset. This is especially evident in evaluation conditions close to practical use: with the operational threshold set on the validation set so that the false positive rate is 10%, the false negative rates on the test set are close to 50% across the board. The evaluation experiments demonstrate that the FantasyID dataset is complex enough to be used as an evaluation benchmark for detection algorithms.

Dataset

Our dataset is split into a train-val set and a test set. Train-val consists of 786 bonafide images and 1572 manipulated images. Test comprises 300 bonafide and 1085 manipulated images. Both the bonafide cards and the kinds of manipulation differ between train-val and test. Manipulations are applied to faces and text fields. The table below gives the finer details of our dataset.

| Category | Sub-Category | Count | Devices/Sources | Description and Purpose |
|---|---|---|---|---|
| Bonafide Cards | Generation | 362 | AMFD, Face London, WMCA, and Flickr datasets | Thirteen unique design styles in ten languages. Random but realistic personal info. |
| Bonafide Cards | Print | 362 | Evolis Primacy 2 printer | Printed on physical plastic cards. |
| Bonafide Cards | Capture | 1086 | iPhone 15 Pro, Huawei Mate 30, office scanner | Plastic cards were captured using three devices. |
| Forged Cards | Digital manipulation | 786 | InSwapper | Each face in captured IDs from train-val set is swapped with another face. |
| Forged Cards | Digital manipulation | 786 | Facedancer | Each face in captured IDs from train-val set is swapped with another face. |
| Forged Cards | Digital manipulation (Attack-3) | 150 | Facedancer | Faces in captured IDs from subset of test set are swapped. |
| Forged Cards | Digital manipulation | 786 | DiffSTE | Parts of personal info in captured IDs from train-val set were replaced by another text. |
| Forged Cards | Digital manipulation | 786 | Textdiffuser2 | Parts of personal info in captured IDs from train-val set were replaced by another text. |
| Forged Cards | Digital manipulation (Attack-1) | 786 | Finetuned-Textdiffuser2 | Parts of personal info in captured IDs from train-val set were replaced by another text. |
| Forged Cards | Digital manipulation (Attack-3) | 149 | Finetuned-Textdiffuser2 | Parts of personal info in captured IDs from test set were replaced by another text. |
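
As a quick sanity check on the bonafide counts, the following Python sketch recomputes them from the figures given in this section. It assumes that every printed card was captured exactly once by each of the three devices, which the table suggests but does not state outright. The per-split card counts it derives are an inference from that assumption, not figures reported by the authors.

```python
# Figures taken from the dataset table and the split description.
num_cards = 362                       # generated and printed bonafide cards
devices = ["iPhone 15 Pro", "Huawei Mate 30", "office scanner"]
train_val_bonafide = 786              # bonafide images in train-val
test_bonafide = 300                   # bonafide images in test

# Assumption: each card is captured exactly once per device.
total_captures = num_cards * len(devices)
print(total_captures)                          # 1086, the "Capture" row
print(train_val_bonafide + test_bonafide)      # 1086, train-val plus test

# Under the same assumption, the implied number of distinct cards per split.
train_val_cards = train_val_bonafide // len(devices)
test_cards = test_bonafide // len(devices)
print(train_val_cards, test_cards, train_val_cards + test_cards)  # 262 100 362
```

Both totals agree with the 1086 captured images listed in the table.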

Digital Manipulations

Example of digital manipulations in FantasyID. Faces are swapped and textual details are digitally altered.

[Figure: original ID captured with iPhone 15 Pro]
[Figure: ID with digitally manipulated face and text regions]

Benchmarking

We benchmark our dataset with the following image manipulation detection algorithms: TruFor, MMFusion, UniFD, and FatFormer. With the operational threshold set on the validation set so that the false positive rate is 10%, the false negative rates on the test set are close to 50% across the board for all the baselines, showing the difficulty of the FantasyID dataset. The results are shown in the table below.

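| Model | Protocol | ACC | AUC | F1 | FPR | FNR | HTER |
|---|---|---|---|---|---|---|---|
| TruFor | all | 65.9 | 93.5 | 80.7 | 4.9 | 62.0 | 33.4 |
| TruFor | Attack-1 | 67.8 | 99.5 | 79.0 | 0.1 | 62.0 | 31.1 |
| TruFor | Attack-2 | 53.5 | 55.8 | 47.6 | 34.7 | 62.0 | 48.3 |
| TruFor | Attack-3 | 67.8 | 99.9 | 55.3 | 0.0 | 62.0 | 31.0 |
| MMFusion | all | 55.1 | 94.4 | 73.7 | 4.0 | 47.7 | 25.8 |
| MMFusion | Attack-1 | 55.2 | 99.8 | 67.0 | 0.1 | 47.7 | 23.9 |
| MMFusion | Attack-2 | 54.5 | 61.5 | 29.8 | 28.0 | 47.7 | 37.8 |
| MMFusion | Attack-3 | 55.2 | 99.1 | 30.0 | 0.0 | 47.7 | 23.8 |
| UniFD | all | 50.0 | 52.0 | 7.7 | 8.3 | 92.7 | 50.5 |
| UniFD | Attack-1 | 50.0 | 54.3 | 12.0 | 7.4 | 92.7 | 50.0 |
| UniFD | Attack-2 | 50.0 | 48.4 | 53.3 | 7.3 | 92.7 | 50.0 |
| UniFD | Attack-3 | 50.0 | 43.5 | 53.5 | 14.1 | 92.7 | 53.4 |
| FatFormer | all | 48.8 | 53.5 | 15.6 | 6.5 | 92.3 | 49.4 |
| FatFormer | Attack-1 | 49.3 | 57.6 | 20.4 | 1.1 | 92.3 | 46.7 |
| FatFormer | Attack-2 | 48.3 | 43.6 | 53.8 | 24.0 | 92.3 | 58.2 |
| FatFormer | Attack-3 | 46.7 | 42.3 | 51.8 | 16.8 | 92.3 | 54.6 |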
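
The evaluation protocol described above (fix the threshold on the validation set at a target false positive rate, then report error rates on the test set) can be sketched in Python as follows. This is a minimal illustration with NumPy, not the authors' evaluation code. The function names `threshold_at_fpr` and `error_rates` are hypothetical, the scores are synthetic, and the sketch assumes that a higher score means "more likely positive". Which class (bonafide or manipulated) counts as positive is not stated on this page, so the code speaks only of generic negatives and positives. HTER is computed with its standard definition, the mean of FPR and FNR.

```python
import numpy as np

def threshold_at_fpr(val_neg_scores, target_fpr=0.10):
    """Threshold such that about target_fpr of validation negatives score above it."""
    val_neg_scores = np.asarray(val_neg_scores, dtype=float)
    # Samples scoring strictly above the threshold are classified as positive.
    return float(np.quantile(val_neg_scores, 1.0 - target_fpr))

def error_rates(neg_scores, pos_scores, threshold):
    """Return (FPR, FNR, HTER) as fractions for a fixed threshold."""
    neg_scores = np.asarray(neg_scores, dtype=float)
    pos_scores = np.asarray(pos_scores, dtype=float)
    fpr = float(np.mean(neg_scores > threshold))    # negatives classified as positive
    fnr = float(np.mean(pos_scores <= threshold))   # positives classified as negative
    hter = 0.5 * (fpr + fnr)                        # half total error rate
    return fpr, fnr, hter

# Synthetic scores standing in for a detector's outputs (assumption, not real data).
rng = np.random.default_rng(0)
val_neg = rng.normal(0.0, 1.0, 1000)
test_neg = rng.normal(0.0, 1.0, 1000)
test_pos = rng.normal(1.0, 1.0, 1000)

thr = threshold_at_fpr(val_neg, target_fpr=0.10)   # chosen on validation only
fpr, fnr, hter = error_rates(test_neg, test_pos, thr)
print(f"threshold={thr:.3f} FPR={fpr:.3f} FNR={fnr:.3f} HTER={hter:.3f}")
```

The threshold is chosen once on validation data and then frozen, so the test-set FPR can drift away from the 10% target, as the FPR column of the table shows. The reported HTER values are consistent with the mean of the FPR and FNR columns: for the TruFor "all" row, (4.9 + 62.0) / 2 = 33.45, which matches the reported 33.4 up to rounding.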

BibTeX


@misc{korshunov2025fantasyiddatasetdetectingdigital,
  title={FantasyID: A dataset for detecting digital manipulations of ID-documents},
  author={Pavel Korshunov and Amir Mohammadi and Vidit Vidit and Christophe Ecabert and Sébastien Marcel},
  year={2025},
  eprint={2507.20808},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2507.20808},
}