EdgeDoc: Hybrid CNN-Transformer Model for Accurate Forgery Detection and Localization in ID Documents

1Idiap Research Institute, 2UNIL

Idiap Research Report 2025.

Model architecture of the proposed EdgeDoc framework. The model takes the NoisePrint features and the original image as input.

EdgeDoc Video

Summary

The widespread availability of tools for manipulating images and documents has made it increasingly easy to forge digital documents, posing a serious threat to Know Your Customer (KYC) processes and remote onboarding systems. Detecting such forgeries is essential to preserving the integrity and security of these services. In this work, we present EdgeDoc, a novel approach for the detection and localization of document forgeries. Our architecture combines a lightweight convolutional transformer with auxiliary NoisePrint features extracted from the images, enhancing its ability to detect subtle manipulations. EdgeDoc achieved third place in the ICCV 2025 DeepID Challenge, demonstrating its competitiveness. Experimental results on the FantasyID dataset show that our method outperforms baseline approaches, highlighting its effectiveness in real-world scenarios.

Proposed Pipeline

Our proposed architecture, EdgeDoc, is based on the XXS variant of the EdgeNeXt backbone. Multi-scale feature maps are extracted from several stages of the backbone and fed into a custom U-Net-style decoder. The architecture of EdgeDoc is shown in the figure above. The decoder is composed of upsampling blocks, each consisting of depthwise separable 2D convolutions followed by 2D Layer Normalization and ReLU activations. For classification, we use a bottleneck head comprising global average pooling and fully connected layers. The final segmentation mask is generated by a pointwise (1×1) convolution applied to the decoder output.
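As a rough illustration of why depthwise separable convolutions help keep such a decoder lightweight, the sketch below compares the parameter count of a standard 2D convolution with that of a depthwise convolution followed by a pointwise (1×1) convolution. The channel widths and kernel size are hypothetical values chosen for the example; they are not taken from the EdgeDoc implementation.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Parameter count of a standard k x k 2D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=True):
    """Parameter count of a depthwise k x k convolution (one filter per
    input channel) followed by a pointwise 1 x 1 convolution."""
    depthwise = c_in * k * k + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical decoder block: 128 input channels, 64 output channels, 3 x 3 kernel.
c_in, c_out, k = 128, 64, 3
standard = conv2d_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

print(f"standard conv:            {standard} parameters")
print(f"depthwise separable conv: {separable} parameters")
print(f"reduction factor:         {standard / separable:.1f}x")
```

For these assumed sizes the separable variant needs several times fewer parameters than the standard convolution, and the gap widens as the number of output channels or the kernel size grows.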

Images from FantasyID Dataset

The FantasyID dataset contains multiple types of attacks.

ROC on FantasyID dataset

Sample and ground truth from the FantasyID dataset, together with the NoisePrint++, confidence, and localization results from TruFor.


Performance on FantasyID Dataset

EdgeDoc achieves high performance on the validation set of the FantasyID dataset. Fusion with TruFor improves generalization to other datasets.

ROC on FantasyID dataset

EdgeDoc achieves high performance on the validation set of the FantasyID dataset.
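For readers less familiar with ROC curves, the following minimal sketch shows one way such a curve and its area (AUC) can be computed from per-image forgery scores. The labels and scores are made-up toy values, not results from the FantasyID evaluation, and the function names are illustrative only.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping the decision threshold
    from the highest score to the lowest.
    labels: 1 = forged, 0 = bona fide; a higher score means "more likely forged"."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    fpr, tpr = [0.0], [0.0]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve using the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
        for i in range(1, len(fpr))
    )


# Toy example (hypothetical data): three forged and three bona fide samples.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_fpr, toy_tpr = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_fpr, toy_tpr)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 1.0 corresponds to perfect separation of forged and bona fide samples, while 0.5 corresponds to chance-level ranking.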



We trained the proposed EdgeDoc model on the training set of the FantasyID dataset and evaluated its performance on the corresponding validation set. For comparison, we also assessed several off-the-shelf baseline methods, including TruFor. The results of these evaluations are summarized in the table below, where EdgeDoc outperforms all other methods. Furthermore, we explored a fusion of EdgeDoc and TruFor using a weighted combination, which also yielded competitive results.
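A weighted combination of two detectors can be sketched as a convex combination of their scores. The snippet below is only an illustration under stated assumptions: both scores are assumed to lie in [0, 1] with higher values indicating a more likely forgery, and the function name and the weight value are hypothetical, since the weights used in the actual fusion are not given here.

```python
def fuse_scores(edgedoc_score, trufor_score, weight=0.5):
    """Convex combination of two forgery scores.
    `weight` is the share given to the EdgeDoc score; the remainder goes to TruFor.
    The default of 0.5 is an arbitrary placeholder, not a value from the report."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * edgedoc_score + (1.0 - weight) * trufor_score


# Hypothetical scores for a single image.
fused = fuse_scores(edgedoc_score=0.9, trufor_score=0.3, weight=0.75)
print(f"fused score = {fused:.3f}")
```

In practice the weight would be tuned on held-out data, and the two scores may need calibration or normalization first if they are not on a comparable scale.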


Performance on FantasyID Dataset.


BibTeX

@article{george2025edgedoc,
  title={EdgeDoc: Hybrid CNN-Transformer Model for Accurate Forgery Detection and Localization in ID Documents},
  author={George, Anjith and Marcel, Sebastien},
  journal={Idiap Research Report},
  year={2025}
}