Traceable and Authenticable Image Tagging for Fake News Detection
Abstract
To prevent fake news images from misleading the public, it is desirable not only to verify the authenticity of news images but also to trace the source of fake news, so as to provide a complete forensic chain for reliable fake news detection. To simultaneously achieve the goals of authenticity verification and source tracing, we propose a traceable and authenticable image tagging approach that is based on a design of Decoupled Invertible Neural Network (DINN). The designed DINN can simultaneously embed the dual-tags, i.e., authenticable tag and traceable tag, into each news image before publishing, and then separately extract them for authenticity verification and source tracing. Moreover, to improve the accuracy of dual-tags extraction, we design a parallel Feature Aware Projection Model (FAPM) to help the DINN preserve essential tag information. In addition, we define a Distance Metric-Guided Module (DMGM) that learns asymmetric one-class representations to enable the dual-tags to achieve different robustness performances under malicious manipulations. Extensive experiments, on diverse datasets and unseen manipulations, demonstrate that the proposed tagging approach achieves excellent performance in the aspects of both authenticity verification and source tracing for reliable fake news detection and outperforms the prior works.
1 Introduction

In the We-media era, news content with malicious manipulation, i.e., fake news, is easily produced and distributed on social media. A large amount of fake news, especially provocative fake news images, is undermining public credibility and influencing social stability, resulting in many serious social security issues [24]. To detect fake news images, existing methods usually design detectors that capture the traces of malicious manipulations from textual information [19, 23, 32], visual features [11, 31, 4], or multi-modal fusion of features [27, 9] in news images. Although these methods achieve desirable performance for verifying the authenticity of news images, they do not take source tracing into consideration. Source traceability is essential because it tells the public where a news item was published, enhancing trust in the news. Thus, authenticity verification and source traceability can jointly provide a complete forensic chain for reliable fake news detection.
In the existing deep watermarking methods, robust watermarking [41, 8, 35] or fragile watermarking [22, 2, 38] can either achieve copyright traceability or content integrity authentication. To achieve the two purposes simultaneously, some dual-watermarking methods [21, 26, 16, 6] have been proposed to manually embed the two types of watermarks, and thus they suffer from two common issues. 1) Once the two types of watermarks are manually embedded into an image, they would affect each other, which makes them hard to decouple and extract; 2) It is hard to guarantee the two types of embedded watermarks achieve different robustness performances to malicious manipulations for different purposes.
Therefore, it is an interesting and challenging task to achieve the goals of news authenticity verification and source tracing simultaneously. To this end, we propose an image tagging approach based on a design of Decoupled Invertible Neural Network (DINN). In this approach, the dual-tags, i.e., traceable tag and authenticable tag, can be invisibly embedded into the news images before publishing, and be separately extracted for authenticity verification and source tracing, respectively. The framework of the proposed approach is shown in Fig. 1. In summary, the contributions of this paper include:
- We propose a novel proactive image tagging scheme to achieve both news image authenticity verification and source tracing for reliable fake news detection.
- We design a Decoupled Invertible Neural Network (DINN) to simultaneously embed two types of invisible tags, i.e., authenticable tag and traceable tag, into each news image, and to extract the two tags separately from manipulated news images without interference between them.
- We design a double Feature Aware Projection Model (FAPM) and a Distance Metric-Guided Module (DMGM) to improve the recovery accuracy of the embedded tags and enable the dual-tags to achieve different robustness performances under malicious manipulations.
2 Related Works

2.1 Fake News Detection
Generally, existing fake news detection methods try to capture unusual patterns by extracting text information [18, 19, 23, 32], visual features [24, 11, 31, 4], and multi-modal fusion of features [33, 27, 40, 9] from news content. As one main component of news, textual content [18, 32] is widely used as the main domain for detecting fake news. Ma et al. [18] adopted a 2-layer Gated Recurrent Unit model to extract hidden representations, which can capture the variation of contextual information of relevant posts over a period. Besides, many algorithms utilize different NLP models [37, 30], autoencoders [20], and GANs [5] to learn more comprehensive representations of textual content to detect fake news. Moreover, several algorithms take stance classification [19, 32] and writing style [23] into account to detect fake news.
In addition, as another component of news, visual content such as news images or videos is more easily disseminated and maliciously manipulated. Some methods [11, 31, 36] analyze basic statistical features such as image amount, image type, and image popularity, while other methods extract forensic features for fake image detection [4, 3]. Besides, Qi et al. [24] proposed the Multi-domain Visual Neural Network (MVNN) to capture the complex patterns of fake news images in the frequency domain and extract visual features in the pixel domain, which improves the performance of fake news image detection. Since the representation capability of any single modality is limited, some multi-modal methods [33, 27, 40, 9] combine visual and textual features and explore their consistency to verify fake news.
All the above content-based fake news detection methods detect fake news passively by analyzing textural features and/or visual features of news. Although these methods improve fake news detection from different levels, these passive detectors do not consider the news traceability to achieve reliable fake news detection. Hence, to provide a reliable forensic chain for fake news detection, we propose a proactive tagging scheme by embedding traceable and authenticable tags into original news images to detect fake images and trace the news source.
2.2 Watermarks
Digital image watermarking can achieve traceability and content integrity authentication. To achieve source traceability, robust watermarking [35, 41, 29, 8] is proposed to resist various attacks. Zhu et al. [41] proposed an end-to-end trainable autoencoder-based framework, i.e., HiDDeN, which inserts noise layers between encoding and decoding to simulate image distortion. Subsequently, Jia et al. [8] proposed Mini-Batch of Real and Simulated JPEG compression (MBRS) to enhance JPEG robustness based on an autoencoder. However, the two sub-networks in autoencoder-based methods have two separate sets of parameters, which causes color distortion and low invisibility. HiNet [12] and ISN [17], based on INNs, integrate the embedding and extraction processes into a single invertible model to improve image quality and security. Besides, RIIS [35] introduced a conditional normalizing flow to model the distribution of high-frequency components as lost information and added a distortion model to simulate distortions for robust watermarking.
Besides, to achieve content integrity authentication, fragile watermark-based methods [22, 2, 38] have been proposed. Neekhara et al. [22] proposed to proactively embed a semi-fragile neural watermark into real images for digital media authentication. Asnani et al. [2] proposed a template estimation scheme that proactively detects manipulated images by embedding orthogonal and learnable templates into real images, achieving better image manipulation detection performance on unseen generative models.
Some dual-watermarking methods [21, 15, 26, 16, 6] have been proposed to embed different watermarks to achieve multiple purposes of image protection. Lu and Liao [16] embedded robust and fragile watermarks in an image for copyright protection and image authentication. For the same purposes, Liu et al. [26] embedded one watermark in the YCbCr color space using DWT and another watermark in the RGB components. Besides, TRLH [6] embeds fragile and blind dual-watermarks for image tamper detection and self-recovery based on lifting wavelet and halftoning techniques. However, these methods embed watermarks based on manually extracted features and thus cannot flexibly handle different attacks.
For fake news detection, the existing methods cannot be directly used to achieve good performance on both news traceability and authentication simultaneously. Therefore, we propose to embed the dual-tags, i.e., traceable tags and authenticable tags, into each news image to achieve both authenticity identification and source tracing to build a complete forensic chain in fake news detection.
3 Proposed Tagging Approach
In this section, we propose a novel proactive tagging scheme that embeds dual-tags into each news image. The two embedded tags achieve different robustness performances under malicious manipulations. They can then be decoupled and extracted to achieve news authentication and traceability simultaneously for reliable fake news detection.
3.1 Network Architecture
The architecture of our proposed approach is shown in Fig. 2. It contains three main stages: 1) the forward embedding process, 2) the simulation of practical manipulation attacks, and 3) the reverse revealing process. Note that Fig. 2 shows the training process of the proposed scheme; the inference process is the same as the training process.
First, the traceable tags and authenticable tags are embedded into news images through three branches and are then extracted from the news images under a variety of manipulations. In this process, guided by the conditional features extracted from the tagged images, the lost information of the dual-tags is encoded into two sets of latent variables by the Feature Aware Projection Model (FAPM) to improve the extraction accuracy of the embedded tags.
With the consideration of news characteristics, we simulate the manipulations that news images may undergo, including moderate manipulations and malicious manipulations. The former do not affect news authenticity (e.g., compression, contrast adjustment, blur), while the latter tamper with the content of news images to mislead the public (e.g., splicing, copy-move). Therefore, the images under moderate and malicious manipulations are defined as real images and fake images, respectively.
In the revealing process, with the help of Distance Metric-Guided Module (DMGM), the pre-embedded traceable tags and authenticable tags are extracted with different accuracy from the manipulated news images for authenticity verification and source tracing.
To facilitate the embedding process, we normalize the tags, such as logos, news information, etc., as bitstreams. We adopt the normalization strategy similar to that in IWN [34].
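The normalization details of IWN [34] are not reproduced here; as a hypothetical minimal sketch, a tag such as a source identifier can be serialized into a fixed-length bitstream, centered to {-0.5, +0.5} before embedding, and recovered by hard thresholding (the tag string and length below are illustrative, not from the paper):

```python
import numpy as np

def tag_to_bits(tag: str, length: int = 256) -> np.ndarray:
    """Serialize a tag string into a fixed-length, zero-padded bitstream,
    centered to {-0.5, +0.5} so the network embeds a zero-mean signal."""
    raw = np.frombuffer(tag.encode("utf-8"), dtype=np.uint8)
    bits = np.unpackbits(raw)[:length]
    bits = np.pad(bits, (0, length - len(bits)))
    return bits.astype(np.float32) - 0.5

def bits_to_tag(bits: np.ndarray) -> str:
    """Invert tag_to_bits: threshold at 0, repack to bytes, strip padding."""
    hard = (bits > 0).astype(np.uint8)
    return np.packbits(hard).tobytes().rstrip(b"\x00").decode("utf-8")

recovered = bits_to_tag(tag_to_bits("news-source:CNA/2023-04-01"))
```

Any fixed-length invertible serialization works at this step; the embedding network only ever sees the bitstream.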

3.2 Decoupled Invertible Neural Network (DINN)
Different from the existing INN-based methods [12, 17, 35, 34], we design the network to embed dual-tags, i.e., a traceable tag and an authenticable tag, which should show different robustness performances under image manipulations. In our design, the authenticable tag should be robust to moderate manipulations while fragile to malicious manipulations, and the traceable tag should be robust to all manipulations.
In a classical INN, the two branches share the same network parameters, and thus it is hard for such INNs to achieve different robustness performances for the two embedded tags.
To achieve our goals, we propose the Decoupled Invertible Neural Network (DINN) to decouple the embedding and revealing processes of the traceable tag and the authenticable tag, so that the dual-tags do not interfere with each other. Before training the proposed network, each original news image is first decomposed by DWT with the Haar wavelet kernel to obtain the cover branch x, and the traceable tag and authenticable tag are normalized to obtain t and a. Then, x, t, and a serve as the inputs of the forward embedding process, which outputs the tagged news image I_tag and the corresponding lost information r_t and r_a. Here, we encode r_t and r_a into sets of latent variables following a pre-defined distribution via conditional INNs to preserve more tag information. Meanwhile, the tagged news images undergo moderate manipulations and malicious manipulations, resulting in real news images I_real and fake news images I_fake, respectively. For the reverse revealing process of DINN, I_real, I_fake, and the re-sampled lost information of the two tags are used as inputs, and t and a are extracted from both I_real and I_fake. The extraction accuracy of the tags is then used to verify the news source and authenticity.
As shown in Fig. 2, DINN consists of several decoupled invertible blocks with the same structure. For the i-th decoupled invertible block, the input is (x_i, t_i, a_i) and the output is (x_{i+1}, t_{i+1}, a_{i+1}), where x, t, and a denote the cover-image branch, the traceable-tag branch, and the authenticable-tag branch, respectively. Formally, the forward process is calculated as follows:

x_{i+1} = x_i + φ_t(t_i) + φ_a(a_i)    (1)
t_{i+1} = t_i ⊙ exp(ρ_t(x_{i+1})) + η_t(x_{i+1})    (2)
a_{i+1} = a_i ⊙ exp(ρ_a(x_{i+1})) + η_a(x_{i+1})    (3)

where φ_t, φ_a, ρ_t, η_t, ρ_a, and η_a are sub-modules with convolution operations, exp(·) is the exponential function, and ⊙ is the Hadamard product operation.
In the reverse revealing process of DINN, the concatenation of I_real and I_fake, together with the revealed lost information, is fed in, and four tags, i.e., t and a extracted from both I_real and I_fake, are obtained. Specifically, in the revealing process, the information flows in the inverse direction, from the N-th decoupled invertible block back to the 1st one, as shown in Fig. 2. For the N-th (last) block, the inputs are the revealed lost information z_t and z_a, which are randomly re-sampled from the pre-defined distributions, together with the manipulated news images x_{N+1}, i.e., the concatenation of the real news images I_real and the fake news images I_fake. Accordingly, the reverse process is calculated as follows:

t_i = (t_{i+1} - η_t(x_{i+1})) ⊙ exp(-ρ_t(x_{i+1}))    (4)
a_i = (a_{i+1} - η_a(x_{i+1})) ⊙ exp(-ρ_a(x_{i+1}))    (5)
x_i = x_{i+1} - φ_t(t_i) - φ_a(a_i)    (6)
After the last revealing block, both the embedded traceable tag and the authenticable tag are extracted from each news image. It is noted that the authenticable tag extracted from fake news images should be quite different from the original one, while all other tags should be extracted correctly from their news images.
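The behavior of a decoupled invertible block can be sketched as a toy affine coupling in NumPy: the cover branch absorbs both tags additively, each tag branch is scaled and shifted by the updated cover through its own sub-modules, and the reverse pass undoes the forward pass exactly. The random maps below are stand-ins for the paper's convolutional sub-modules; only the coupling structure matters for invertibility:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the six sub-modules; any functions work here because
# invertibility comes from the coupling structure, not from the modules.
W = {k: rng.standard_normal((8, 8)) * 0.1
     for k in ("phi_t", "phi_a", "rho_t", "eta_t", "rho_a", "eta_a")}
f = lambda k, v: np.tanh(v @ W[k])

def forward(x, t, a):
    """One decoupled coupling step: the cover absorbs both tags additively;
    each tag branch is scaled/shifted by the updated cover only."""
    x1 = x + f("phi_t", t) + f("phi_a", a)
    t1 = t * np.exp(f("rho_t", x1)) + f("eta_t", x1)
    a1 = a * np.exp(f("rho_a", x1)) + f("eta_a", x1)
    return x1, t1, a1

def reverse(x1, t1, a1):
    """Exact algebraic inverse of forward()."""
    t = (t1 - f("eta_t", x1)) * np.exp(-f("rho_t", x1))
    a = (a1 - f("eta_a", x1)) * np.exp(-f("rho_a", x1))
    x = x1 - f("phi_t", t) - f("phi_a", a)
    return x, t, a

x, t, a = rng.standard_normal((3, 4, 8))
assert all(np.allclose(u, v) for u, v in zip(reverse(*forward(x, t, a)), (x, t, a)))
```

Because each tag branch has its own scale/shift modules, the two tags can be trained toward different robustness goals without sharing parameters.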
3.3 Feature Aware Projection Model (FAPM)
Although the proposed DINN can decouple the embedding and extracting of the dual-tags, the desirable extraction accuracy still cannot be guaranteed. To improve the usability of the method, we design the FAPM as an auxiliary module to further support the decoupling process and guarantee the extraction performance.
Inspired by invertible image decolorization [39] and the content-aware noise projection model in RIIS [35], we encode the lost information of the traceable tag and the authenticable tag into two sets of pre-defined distributed latent variables via conditional INNs to improve the extraction performance. The embedded tags can then be efficiently revealed by randomly re-sampling new sets of variables from the pre-defined distribution.

In the embedding process, we expect to preserve more tag information in the lost information r_t and r_a so that the embedded tags can be revealed as faithfully as possible. Besides, to completely decouple the dual-tags and achieve their different extraction goals, we employ conditional INNs to encode the lost information of the two tags. With the conditional features extracted from the tagged images I_tag, the lost information is converted to two sets of latent variables z_t and z_a, which follow a pre-defined uniform distribution.
The specific scheme is shown in Fig. 3. Taking the traceable tag as an example, its corresponding lost information r_t is divided into two groups (r_t^1, r_t^2) that serve as the input to the conditional INN, with the conditional feature extracted from I_tag. We employ the first two layers of ResNet18 [7] as the feature extractors to obtain the conditional features c_t and c_a for the mappings of r_t and r_a, respectively. For the k-th conditional invertible block of FAPM, the forward process of the lost information of the traceable tag is calculated as shown in Eq. 7 and Eq. 8; the operation for the authenticable tag is the same, as shown in Eq. 9 and Eq. 10:

r_t^{1,k+1} = r_t^{1,k} + φ(r_t^{2,k}, c_t)    (7)
r_t^{2,k+1} = r_t^{2,k} ⊙ exp(ρ(r_t^{1,k+1}, c_t)) + η(r_t^{1,k+1}, c_t)    (8)
r_a^{1,k+1} = r_a^{1,k} + φ(r_a^{2,k}, c_a)    (9)
r_a^{2,k+1} = r_a^{2,k} ⊙ exp(ρ(r_a^{1,k+1}, c_a)) + η(r_a^{1,k+1}, c_a)    (10)

where φ, ρ, and η denote separate convolutional sub-modules for each branch.
In the revealing process, the lost information is recovered through FAPM, and then the embedded tags are recovered from news images.
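A conditional coupling of this kind can be sketched as follows: the feature c extracted from the tagged image enters every sub-module, so supplying the same c at revealing time inverts the mapping exactly. The dimensions and stand-in modules below are illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for the conditional sub-modules; each sees one half of the lost
# information concatenated with the conditional feature c from the tagged image.
W = {k: rng.standard_normal((16, 8)) * 0.1 for k in ("phi", "rho", "eta")}
g = lambda k, h, c: np.tanh(np.concatenate([h, c], axis=-1) @ W[k])

def fapm_forward(r1, r2, c):
    """Conditional affine coupling mapping lost information to latents."""
    z1 = r1 + g("phi", r2, c)
    z2 = r2 * np.exp(g("rho", z1, c)) + g("eta", z1, c)
    return z1, z2

def fapm_reverse(z1, z2, c):
    """Exact inverse, valid only when the same condition c is supplied."""
    r2 = (z2 - g("eta", z1, c)) * np.exp(-g("rho", z1, c))
    r1 = z1 - g("phi", r2, c)
    return r1, r2

r1, r2, c = rng.standard_normal((3, 4, 8))
```

Conditioning on the tagged image is what ties the latent variables to a specific image, so re-sampled latents combined with the correct image feature still recover the tag's lost information.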
3.4 Distance Metric-Guided Module (DMGM)
In the revealing process, we extract the traceable tag from all news images to verify the source of each news image. To effectively detect fake images, the authenticable tag should be robust to moderate manipulations while fragile to malicious manipulations, and this robustness can be evaluated by the extraction error rates. Although the extraction performance can be controlled by the designed loss function, the problem is more complex in practical scenarios since the types of manipulations are probably unseen to the model, so the generalization of the proposed model is very important in fake news detection. We therefore design a distance metric-guided module for tag extraction against unseen types of manipulations.
The specific design is shown in Fig. 4. We define the authenticable tag correctly extracted from real news images as Â_real and the wrongly extracted authenticable tag from fake news images as Â_fake. In the target feature space, Â_real should lie near the origin and converge within a hypersphere of radius r to maintain intra-class compactness. On the contrary, Â_fake is expected to stay away from this smaller hypersphere by a pre-defined margin m to ensure inter-class separation between Â_real and Â_fake. Inspired by the detection of unseen face presentation attacks [13], the loss function is designed as follows:

L_DMGM = λ_real · max(d(Â_real, A) - r, 0) + λ_fake · max(r + m - d(Â_fake, A), 0)    (11)

where d(·,·) is the distance between the extracted tag samples and the embedded tag A, calculated as the square of the 1-norm, and λ_real and λ_fake are the parameters that control the weights of the loss contributed by the two types of extracted tag samples.
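Assuming the squared 1-norm distance described above, the asymmetric hypersphere loss can be written as a short NumPy function; the radius r, margin m, and weights passed in below are free hyperparameters for illustration, not values from the paper:

```python
import numpy as np

def dmgm_loss(a_real, a_fake, a_true, r, m, w_real, w_fake):
    """Asymmetric one-class margin loss: squared-L1 distances of extracted
    authenticable tags to the embedded tag a_true. Real-image extractions are
    pulled inside a hypersphere of radius r; fake-image extractions are pushed
    beyond r + m."""
    d_real = np.abs(a_real - a_true).sum(axis=-1) ** 2
    d_fake = np.abs(a_fake - a_true).sum(axis=-1) ** 2
    pull = np.maximum(d_real - r, 0.0).mean()       # penalize real tags outside radius r
    push = np.maximum(r + m - d_fake, 0.0).mean()   # penalize fake tags inside r + m
    return w_real * pull + w_fake * push
```

Note the asymmetry: the loss says nothing about how far beyond r + m a fake-image extraction lands, which is what lets unseen malicious manipulations still fall on the correct side of the margin.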
3.5 Loss Function
The total loss function is composed of four kinds of losses: the embedding and revealing losses of the main model, feature aware projection loss, distance metric-guided loss, and the low-frequency wavelet loss.
Embedding loss. The forward embedding process aims to embed t and a into the original image I_org to generate the tagged news image I_tag. To make I_tag indistinguishable from I_org, the embedding loss is defined by

L_emb = ℓ(I_org, I_tag)    (12)

where ℓ measures the difference between the original and tagged news images, and can be the ℓ1 or ℓ2 norm.
Revealing loss. The revealing process aims to extract the traceable tags from both real and fake news images, and the authenticable tag from real news images:

L_trace^real = ℓ(T̂_real, T)    (13)
L_trace^fake = ℓ(T̂_fake, T)    (14)
L_auth^real = ℓ(Â_real, A)    (15)

Since the authenticable tag extracted from fake news images is expected to be wrong, there is no loss term between Â_fake and A.
Feature aware projection loss. We set the pre-defined distribution to be uniform and push the latent variables toward it:

L_fap^t = ℓ_p(z_t, C)    (16)
L_fap^a = ℓ_p(z_a, C)    (17)

where ℓ_p is the pixel-level distance function, C is a constant matrix sampled from the pre-defined uniform distribution, and z_t and z_a are the latent variables of the corresponding distributions.
Table 1: Performance under different settings of the DMGM parameters r and m (image size 128×128). T̂_real/T̂_fake are the extraction error rates of the traceable tag from real/fake news images; Â_real/Â_fake are those of the authenticable tag.

| Settings (r, m) | T̂_real | T̂_fake | Â_real | Â_fake | PSNR | SSIM |
|---|---|---|---|---|---|---|
|  | 0.1200 | 0.1427 | 0.0700 | 0.0811 | 33.1790 | 0.8751 |
|  | 0.0686 | 0.0939 | 0.0608 | 0.0722 | 34.9058 | 0.9074 |
|  | 0.0275 | 0.0375 | 0.0558 | 0.2367 | 35.2095 | 0.9170 |
Table 2: News source tracing, authenticity verification, and image quality on diverse datasets. The left block of columns is for 128×128 images, the right block for 256×256.

| Datasets | T̂_real | T̂_fake | Â_real | Â_fake | PSNR | SSIM | T̂_real | T̂_fake | Â_real | Â_fake | PSNR | SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DIV2K [1] | 0.0275 | 0.0375 | 0.0558 | 0.2367 | 35.2095 | 0.917 | 0.0017 | 0.0483 | 0.0542 | 0.2392 | 36.4556 | 0.9336 |
| COCO [14] | 0.0407 | 0.0428 | 0.0929 | 0.2462 | 33.4367 | 0.8924 | 0.0262 | 0.0472 | 0.0977 | 0.3292 | 31.7688 | 0.8429 |
| Weibo [10] | 0.0542 | 0.0649 | 0.0918 | 0.1862 | 32.0736 | 0.8424 | 0.0298 | 0.0572 | 0.1107 | 0.1937 | 31.8858 | 0.8056 |
| NewsStories [28] | 0.0704 | 0.0600 | 0.1054 | 0.3373 | 34.0817 | 0.9176 | 0.0253 | 0.0535 | 0.0974 | 0.2334 | 31.4462 | 0.8160 |
| Flickr8k [25] | 0.0658 | 0.0937 | 0.1078 | 0.2985 | 33.5507 | 0.9051 | 0.0130 | 0.0142 | 0.0717 | 0.1970 | 32.0187 | 0.7471 |
The low-frequency wavelet loss. We employ the low-frequency wavelet loss [12] to improve the quality and security of tagged news images. Suppose H(·) is the function that extracts the low-frequency sub-band after wavelet decomposition. The low-frequency wavelet loss is defined by

L_low = ℓ(H(I_org), H(I_tag))    (18)

where ℓ measures the difference between the low-frequency sub-bands of the original and tagged news images.
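For a one-level Haar decomposition, the low-frequency (LL) sub-band is, up to a normalization factor, the mean of each non-overlapping 2x2 block, so the loss above can be sketched as:

```python
import numpy as np

def haar_ll(img):
    """One-level 2D Haar low-frequency (LL) sub-band, computed here as the
    mean of each non-overlapping 2x2 block (the orthonormal Haar LL differs
    only by a constant factor)."""
    return 0.25 * (img[0::2, 0::2] + img[0::2, 1::2]
                   + img[1::2, 0::2] + img[1::2, 1::2])

def low_freq_loss(i_org, i_tag):
    # l2 difference between the LL sub-bands of original and tagged images.
    return float(np.mean((haar_ll(i_org) - haar_ll(i_tag)) ** 2))
```

Penalizing only the LL sub-band pushes the tag energy into the high-frequency sub-bands, where it is less visible to the human eye.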
Total loss function. The total loss function is a weighted sum of the losses in the main model, the feature aware projection losses, the hypersphere margin loss, and the low-frequency wavelet loss:

L_total = λ_1 L_emb + λ_2 (L_trace^real + L_trace^fake + L_auth^real) + λ_3 (L_fap^t + L_fap^a) + λ_4 L_DMGM + λ_5 L_low    (19)

where λ_1 to λ_5 balance the loss terms.
4 Experimental Results and Analysis
4.1 Experimental Setups
Dataset. The DIV2K training dataset [1], which includes 800 high-resolution images, is used to train the proposed approach. Before feeding the training images into the network, we first randomly crop them to the size of 128×128. Next, we use the DIV2K test dataset, which consists of 100 high-resolution images, to evaluate the performance. During testing, we center-crop the test images to 128×128, 256×256, and 1024×1024 to verify that our method can handle different sizes.
To further evaluate the generalization of the proposed approach, we use COCO 2017 [14] validation dataset, and select 1,000 images from Weibo dataset [10], NewsStories dataset [28], and Flickr8k dataset [25] separately to test the proposed method.
Setting the parameters. In the network training, we use the Adam optimizer [13] with exponential decay rates of 0.9 and 0.999. Besides, we set the batch size to 8 and the initial learning rate to 0.0002, and adjust the learning rate at 12,000 iterations with a decay penalty parameter of 0.1. The total number of training iterations is set to 15,000. In the training process, the moderate manipulations include JPEG compression, blur, brightness, and contrast adjustment, and the malicious manipulations include splicing. Besides, we test different settings of the parameters r and m in DMGM with the image size 128×128. As shown in Tab. 1, with the chosen setting of r and m, the difference between Â_real and Â_fake is the most distinguishable, and thus we adopt that setting in our experiments. Our experiments are performed with PyTorch 1.7 on an NVIDIA GeForce RTX 3090 GPU.
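A manipulation layer of this kind can be sketched as below; the parameter ranges are illustrative placeholders, and the paper's exact ranges and its differentiable JPEG simulation are omitted:

```python
import numpy as np

rng = np.random.default_rng(2)

def moderate_manipulate(img):
    """Apply one randomly chosen moderate distortion to an image in [0, 1].
    Ranges below are illustrative, not the paper's training configuration."""
    op = rng.integers(3)
    if op == 0:                                   # brightness shift
        out = img + rng.uniform(-0.1, 0.1)
    elif op == 1:                                 # contrast scaling about the mean
        out = (img - img.mean()) * rng.uniform(0.8, 1.2) + img.mean()
    else:                                         # mild blur via a 3x3 box filter
        pad = np.pad(img, 1, mode="edge")
        out = sum(pad[i:i + img.shape[0], j:j + img.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return np.clip(out, 0.0, 1.0)

def malicious_splice(img, patch, y, x):
    """Splicing: overwrite a region with foreign content (a malicious edit)."""
    out = img.copy()
    out[y:y + patch.shape[0], x:x + patch.shape[1]] = patch
    return out
```

During training, each tagged image passes through one such layer before the revealing pass, so the network learns the desired robust/fragile behavior for each tag.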
4.2 News Images Authenticity Verification and Source Tracing Results
To evaluate the performance of the proposed approach, we conduct the following experiments.
News image authenticity verification. To detect fake news images, i.e., maliciously manipulated news images, the authenticable tag extracted from real news images (Â_real) should be as similar as possible to the original one, while that extracted from fake news images (Â_fake) should be far from the original. Higher extraction error rates of Â_fake together with lower extraction error rates of Â_real mean better detection performance on fake news images. As shown in Tab. 2, the extraction error rates of Â_real and Â_fake differ clearly. The error rates of Â_fake indicate that a substantial fraction of the extracted bits is wrong, while the low error rates of Â_real mean the embedded authenticable tag can be extracted accurately from real news images. The comparison of Â_real and Â_fake shows that the proposed approach achieves the desired detection performance.
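In practice, both decisions reduce to bit error rates (BER) between extracted and embedded tags. A hypothetical decision rule, with an illustrative threshold rather than one from the paper:

```python
import numpy as np

def bit_error_rate(extracted, embedded):
    """Fraction of mismatched bits after hard thresholding at 0
    (tags are embedded as values centered to {-0.5, +0.5})."""
    return float(np.mean((extracted > 0) != (embedded > 0)))

def classify(trace_ber, auth_ber, auth_threshold=0.15):
    """Decision sketch: the traceable tag identifies the source whenever its
    BER is low; a high authenticable-tag BER flags malicious manipulation.
    The 0.15 threshold is an illustrative value, not from the paper."""
    return "fake" if auth_ber > auth_threshold else "real"
```

Any threshold between the Â_real and Â_fake error-rate clusters would separate the two cases; the DMGM margin is what keeps those clusters apart.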
Table 3: Extraction error rates of the traceable tag (T̂) and the authenticable tag (Â) under unseen manipulations.

| Unseen manipulation | Type | T̂ | Â |
|---|---|---|---|
| Gaussian noise (μ=0, σ=0.02) | moderate | 0.0150 | 0.0480 |
| Saturation | moderate | 0.0300 | 0.0450 |
| Removal | malicious | 0.0025 | 0.2558 |
| Copy-move | malicious | 0.0100 | 0.1850 |
Table 4: Comparison with SoTA approaches (256×256 news images).

| Method | T̂_real | T̂_fake | Â_real | Â_fake | PSNR | SSIM |
|---|---|---|---|---|---|---|
| ISN [17] | 0.2651 | 0.5229 | 0.5013 | 0.4537 | 34.6334 | 0.9747 |
| HiNet [12] | 0.2158 | 0.1725 | 0.5333 | 0.5409 | 29.0537 | 0.7984 |
| MBRS [8] w/ preprocess | 0.3366 | 0.5333 | 0.4183 | 0.5333 | 35.2361 | 0.9764 |
| MBRS [8] w/o preprocess | 0.0375 | 0.0458 | 0.0583 | 0.0608 | 36.2282 | 0.9817 |
| TRLH [6] | 0.1983 | 0.1864 | 0.1836 | 0.1751 | 46.3329 | 0.9903 |
| Ours | 0.0017 | 0.0483 | 0.0542 | 0.2392 | 36.4556 | 0.9336 |
Table 5: Ablation study. A ✓ marks an enabled module; the first row of each block is the classical-INN baseline, and the module assignments of the middle rows follow the leave-one-out ablations discussed below.

| Size of news images | DINN | FAPM | DMGM | T̂_real | T̂_fake | Â_real | Â_fake | PSNR | SSIM |
|---|---|---|---|---|---|---|---|---|---|
| 128×128 | ✗ | ✗ | ✗ | 0.4868 | 0.5138 | 0.2942 | 0.3083 | 19.9774 | 0.4711 |
| 128×128 | ✓ | ✗ | ✓ | 0.0741 | 0.1043 | 0.0693 | 0.0938 | 37.4220 | 0.9430 |
| 128×128 | ✓ | ✓ | ✗ | 0.0650 | 0.0870 | 0.0636 | 0.0327 | 36.0256 | 0.9223 |
| 128×128 | ✓ | ✓ | ✓ | 0.0275 | 0.0375 | 0.0558 | 0.2367 | 35.2095 | 0.9170 |
| 256×256 | ✗ | ✗ | ✗ | 0.5043 | 0.5297 | 0.2497 | 0.2647 | 19.4587 | 0.4178 |
| 256×256 | ✓ | ✗ | ✓ | 0.0647 | 0.0997 | 0.0668 | 0.0800 | 36.3945 | 0.9350 |
| 256×256 | ✓ | ✓ | ✗ | 0.0317 | 0.0475 | 0.0517 | 0.0833 | 39.1990 | 0.9620 |
| 256×256 | ✓ | ✓ | ✓ | 0.0017 | 0.0483 | 0.0542 | 0.2392 | 36.4556 | 0.9336 |
| 1024×1024 | ✗ | ✗ | ✗ | 0.4245 | 0.4380 | 0.4413 | 0.4390 | 20.2916 | 0.3883 |
| 1024×1024 | ✓ | ✗ | ✓ | 0.0402 | 0.0480 | 0.0680 | 0.0625 | 38.6636 | 0.9507 |
| 1024×1024 | ✓ | ✓ | ✗ | 0.1367 | 0.1567 | 0.0317 | 0.0375 | 39.7316 | 0.9606 |
| 1024×1024 | ✓ | ✓ | ✓ | 0.0192 | 0.0141 | 0.0342 | 0.1883 | 33.1989 | 0.8691 |
News source traceability. To trace news sources, the traceable tag should be robust to all manipulations; therefore, a lower extraction error rate means better traceability performance. As shown in Tab. 2, the extraction error rate of T̂_fake is slightly higher than that of T̂_real, since the fake news images undergo malicious manipulations. However, all extraction error rates of T̂_real and T̂_fake remain below 0.1. The extraction performance of the traceable tags shows the effectiveness of news traceability.
Visual results of tagged news images. For news images, the quality of the tagged images needs to be maintained. We evaluate the quality of tagged news images by PSNR and SSIM, with results shown in Tab. 2. Across the different sizes of news images, the PSNR values of the tagged news images are above 31 dB and the SSIM values are mostly above 0.8, which indicates that the quality of the tagged images is maintained at a high level.
The generalization ability on diverse datasets. We test the generalization ability of the proposed tagging approach on different datasets. As shown in Tab. 2, the extraction performance of traceable and authenticable tags is consistent with the above analysis, which can achieve news image authenticity verification and source tracing on diverse datasets.
The generalization ability on unseen manipulations. We test the generalization ability of the proposed tagging approach against unseen manipulations. For unseen moderate manipulations, we apply Gaussian noise (μ=0, σ=0.02) and saturation obtained by randomly linearly interpolating between the full RGB image and its grayscale equivalent. For unseen malicious manipulations, we add removal and copy-move. The experimental results are shown in Tab. 3. Although there is a slight increase in the extraction error rates, the error rates of the traceable tag under all manipulations and of the authenticable tag under moderate manipulations remain much lower than those of the authenticable tag under malicious manipulations, and are still desirable. These observations indicate that the proposed approach generalizes well to popular unseen manipulations.
Comparison with SoTA approaches. We compare our approach with State-of-The-Art (SoTA) approaches, including two INN-based hiding algorithms, i.e., ISN [17] and HiNet [12], an autoencoder-based hiding approach, i.e., MBRS [8], and a dual-watermarking approach, i.e., TRLH [6]. Since ISN [17] and HiNet [12] are originally designed to embed images, we preprocess the bit messages into decimal numbers to narrow the distribution gap between the embedded data and images. As MBRS [8] is originally designed to embed binary bit messages, we test it both with the preprocessed decimal numbers and with raw binary bits. For TRLH [6], we keep its original settings.
The experimental results are shown in Tab. 4. The extraction performances of the traceable tags and authenticable tags of the proposed approach outperform those of the SoTA approaches under both moderate and malicious manipulations. Note that only the proposed approach shows a distinguishable difference between the extraction of traceable tags and authenticable tags under malicious attacks. In summary, our approach achieves SoTA performance for both news traceability and authenticity verification in fake news image detection. Note that the image quality of HiNet is inferior because solid-color, image-like hidden information is not considered in its training.
4.3 Ablation Study
Effect of DINN. The classical INN in HiNet [12] is used as the baseline to verify the effect of DINN. As shown in Tab. 5, the baseline's extraction error rates for all embedded tags are very high regardless of the size of the news images, especially for the traceable tag, where they exceed 0.4. Such high extraction error rates make it hard to verify the news source. Besides, the close extraction error rates of Â_real and Â_fake make it hard to detect fake news images. This phenomenon suggests that using the same set of network parameters to embed the dual-tags causes conflicts in the network parameters.
By comparison, with the proposed DINN, the extraction error rates of T̂_real, T̂_fake, and Â_real are low while that of Â_fake is high, which indicates the proposed approach can achieve fake news image detection and source tracing. Besides, the image quality of the tagged news images is also improved.
Effect of FAPM. The comparison with and without FAPM is shown in Tab. 5. Without FAPM, the tagged news images have better image quality and the extraction error rates decrease for all tags. However, the extraction error rates of Â_fake are also very low, which makes it hard to distinguish Â_fake from Â_real, so fake news images cannot be detected. With FAPM, the extraction error rates of Â_fake are much higher than those of Â_real, and thus can be used to verify the news authenticity.
Effect of DMGM. To increase the difference between the extraction error rates of Â_real and Â_fake, we design a margin loss to control the range of the extraction error rates. As shown in Tab. 5, without the distance metric-guided margin loss, the extraction error rates of Â_real and Â_fake are very close, whereas the proposed DMGM creates a large margin between them, making fake news image detection more accurate.
5 Conclusion and Discussion
In this paper, we have presented a novel proactive invisible tagging approach for reliable fake news detection. The main purpose is to extract the pre-embedded dual-tags, i.e., traceable tags and authenticable tags, from news images to verify the news authenticity and trace the news source for reliable fake news detection. To achieve these goals simultaneously, the DINN is designed to decouple the embedding and extracting processes of the dual-tags with no contradiction in the network parameters. Besides, the designed FAPM and DMGM provide the dual-tags with different robustness performances and improve the tag extraction performance. In the future, we will integrate news textual information with the proposed tagging approach for fake news detection and enhance the extraction performance of tags under unseen manipulations.
References
- [1] Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 126–135, 2017.
- [2] Vishal Asnani, Xi Yin, Tal Hassner, Sijia Liu, and Xiaoming Liu. Proactive image manipulation detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15386–15395, 2022.
- [3] Christina Boididou, Stuart Middleton, Symeon Papadopoulos, Duc-Tien Dang-Nguyen, Michael Riegler, Giulia Boato, Andreas Petlund, and Ioannis Kompatsiaris. The vmu participation@ verifying multimedia use 2016. 2016.
- [4] Christina Boididou, Symeon Papadopoulos, Duc-Tien Dang-Nguyen, Giulia Boato, and Yiannis Kompatsiaris. The certh-unitn participation@ verifying multimedia use 2015. In MediaEval, 2015.
- [5] Mingxi Cheng, Shahin Nazarian, and Paul Bogdan. Vroc: Variational autoencoder-aided multi-task rumor classifier based on text. In Proceedings of the web conference 2020, pages 2892–2898, 2020.
- [6] Behrouz Bolourian Haghighi, Amir Hossein Taherinia, and Ahad Harati. Trlh: Fragile and blind dual watermarking for image tamper detection and self-recovery based on lifting wavelet transform and halftoning technique. Journal of Visual Communication and Image Representation, 50:49–64, 2018.
- [7] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- [8] Zhaoyang Jia, Han Fang, and Weiming Zhang. Mbrs: Enhancing robustness of dnn-based watermarking by mini-batch of real and simulated jpeg compression. In Proceedings of the 29th ACM International Conference on Multimedia, pages 41–49, 2021.
- [9] Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, and Jiebo Luo. Multimodal fusion with recurrent neural networks for rumor detection on microblogs. In Proceedings of the 25th ACM international conference on Multimedia, pages 795–816, 2017.
- [10] Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang, and Jiebo Luo. Multimodal fusion with recurrent neural networks for rumor detection on microblogs. In Proceedings of the 25th ACM international conference on Multimedia, pages 795–816, 2017.
- [11] Zhiwei Jin, Juan Cao, Yongdong Zhang, Jianshe Zhou, and Qi Tian. Novel visual and statistical image features for microblogs news verification. IEEE transactions on multimedia, 19(3):598–608, 2016.
- [12] Junpeng Jing, Xin Deng, Mai Xu, Jianyi Wang, and Zhenyu Guan. Hinet: deep image hiding by invertible network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4733–4742, 2021.
- [13] Zhi Li, Haoliang Li, Kwok-Yan Lam, and Alex Chichung Kot. Unseen face presentation attack detection with hypersphere loss. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2852–2856. IEEE, 2020.
- [14] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer, 2014.
- [15] Xiao-Long Liu, Chia-Chen Lin, and Shyan-Ming Yuan. Blind dual watermarking for color images’ authentication and copyright protection. IEEE Transactions on Circuits and Systems for Video Technology, 28(5):1047–1055, 2016.
- [16] Chun-Shien Lu and H-YM Liao. Multipurpose watermarking for image authentication and protection. IEEE transactions on image processing, 10(10):1579–1592, 2001.
- [17] Shao-Ping Lu, Rong Wang, Tao Zhong, and Paul L Rosin. Large-capacity image steganography based on invertible neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10816–10825, 2021.
- [18] Jing Ma, Wei Gao, Prasenjit Mitra, Sejeong Kwon, Bernard J Jansen, Kam-Fai Wong, and Meeyoung Cha. Detecting rumors from microblogs with recurrent neural networks. 2016.
- [19] Jing Ma, Wei Gao, and Kam-Fai Wong. Detect rumor and stance jointly by neural multi-task learning. In Companion proceedings of the the web conference 2018, pages 585–593, 2018.
- [20] Jing Ma, Wei Gao, and Kam-Fai Wong. Detect rumors on twitter by promoting information campaigns with generative adversarial learning. In The world wide Web conference, pages 3049–3055, 2019.
- [21] Saraju P Mohanty, KR Ramakrishnan, and Mohan Kankanhalli. A dual watermarking technique for images. In Proceedings of the seventh ACM international conference on Multimedia (Part 2), pages 49–51, 1999.
- [22] Paarth Neekhara, Shehzeen Hussain, Xinqiao Zhang, Ke Huang, Julian McAuley, and Farinaz Koushanfar. Facesigns: Semi-fragile neural watermarks for media authentication and countering deepfakes. arXiv preprint arXiv:2204.01960, 2022.
- [23] Piotr Przybyla. Capturing the style of fake news. In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 490–497, 2020.
- [24] Peng Qi, Juan Cao, Tianyun Yang, Junbo Guo, and Jintao Li. Exploiting multi-domain visual information for fake news detection. In 2019 IEEE international conference on data mining (ICDM), pages 518–527. IEEE, 2019.
- [25] Cyrus Rashtchian, Peter Young, Micah Hodosh, and Julia Hockenmaier. Collecting image annotations using amazon’s mechanical turk. In Proceedings of the NAACL HLT 2010 workshop on creating speech and language data with Amazon’s Mechanical Turk, pages 139–147, 2010.
- [26] Priyanka Singh and Suneeta Agarwal. A self recoverable dual watermarking scheme for copyright protection and integrity verification. Multimedia Tools and Applications, 76(5):6389–6428, 2017.
- [27] Shivangi Singhal, Anubha Kabra, Mohit Sharma, Rajiv Ratn Shah, Tanmoy Chakraborty, and Ponnurangam Kumaraguru. Spotfake+: A multimodal framework for fake news detection via transfer learning (student abstract). In Proceedings of the AAAI conference on artificial intelligence, volume 34, pages 13915–13916, 2020.
- [28] Reuben Tan, Bryan A Plummer, Kate Saenko, JP Lewis, Avneesh Sud, and Thomas Leung. Newsstories: Illustrating articles with visual summaries. In European Conference on Computer Vision, pages 644–661. Springer, 2022.
- [29] Matthew Tancik, Ben Mildenhall, and Ren Ng. Stegastamp: Invisible hyperlinks in physical photographs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2117–2126, 2020.
- [30] Vaibhav Vaibhav, Raghuram Mandyam Annasamy, and Eduard Hovy. Do sentence interactions matter? leveraging sentence level representations for fake news classification. arXiv preprint arXiv:1910.12203, 2019.
- [31] Ke Wu, Song Yang, and Kenny Q Zhu. False rumors detection on sina weibo by propagation structures. In 2015 IEEE 31st international conference on data engineering, pages 651–662. IEEE, 2015.
- [32] Lianwei Wu, Yuan Rao, Ling Sun, and Wangbo He. Evidence inference networks for interpretable claim verification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14058–14066, 2021.
- [33] Lianwei Wu, Yuan Rao, Ling Sun, and Wangbo He. Evidence inference networks for interpretable claim verification. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 14058–14066, 2021.
- [34] Hong-Bo Xu, Rong Wang, Jia Wei, and Shao-Ping Lu. A compact neural network-based algorithm for robust image watermarking. arXiv preprint arXiv:2112.13491, 2021.
- [35] Youmin Xu, Chong Mou, Yujie Hu, Jingfen Xie, and Jian Zhang. Robust invertible image steganography. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7875–7884, 2022.
- [36] Fan Yang, Yang Liu, Xiaohui Yu, and Min Yang. Automatic detection of rumor on sina weibo. In Proceedings of the ACM SIGKDD workshop on mining data semantics, pages 1–7, 2012.
- [37] Feng Yu, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan, et al. A convolutional approach for misinformation identification. In IJCAI, pages 3901–3907, 2017.
- [38] Xiaoyan Yu, Chengyou Wang, and Xiao Zhou. Review on semi-fragile watermarking algorithms for content authentication of digital images. Future Internet, 9(4):56, 2017.
- [39] Rui Zhao, Tianshan Liu, Jun Xiao, Daniel PK Lun, and Kin-Man Lam. Invertible image decolorization. IEEE Transactions on Image Processing, 30:6081–6095, 2021.
- [40] Xinyi Zhou, Jindi Wu, and Reza Zafarani. Safe: Similarity-aware multi-modal fake news detection. arXiv preprint arXiv:2003.04981, 2020.
- [41] Jiren Zhu, Russell Kaplan, Justin Johnson, and Li Fei-Fei. Hidden: Hiding data with deep networks. In Proceedings of the European conference on computer vision (ECCV), pages 657–672, 2018.