
Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement

Zhangkai Ni1, Wenhan Yang2, Hanli Wang1,∗, Shiqi Wang4, Lin Ma3, Sam Kwong4,∗
1Department of Computer Science and Technology, Tongji University, Shanghai, China; 2School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore; 3Meituan, Beijing, China; 4Department of Computer Science, City University of Hong Kong, Hong Kong
zkni, [email protected], [email protected], [email protected], shiqwang, [email protected]
(2022)
Abstract.

Free from the fundamental limitation of fitting to paired training data, recent unsupervised low-light enhancement methods excel at adjusting the illumination and contrast of images. However, for unsupervised low-light enhancement, the remaining issue of noise suppression, caused by the lack of supervision on detailed signals, largely impedes the wide deployment of these methods in real-world applications. Herein, we propose a novel Cycle-Interactive Generative Adversarial Network (CIGAN) for unsupervised low-light image enhancement, which is capable of not only better transferring illumination distributions between low/normal-light images but also manipulating detailed signals between the two domains, e.g., suppressing/synthesizing realistic noise in the cyclic enhancement/degradation process. In particular, the proposed low-light guided transformation feed-forwards the features of low-light images from the generator of the enhancement GAN (eGAN) into the generator of the degradation GAN (dGAN). With the learned information of real low-light images, dGAN can synthesize more realistic, diverse illumination and contrast in low-light images. Moreover, the feature randomized perturbation module in dGAN learns to increase feature randomness to produce diverse feature distributions, encouraging the synthesized low-light images to contain realistic noise. Extensive experiments demonstrate both the superiority of the proposed method and the effectiveness of each module in CIGAN.

Low-light image enhancement, generative adversarial network (GAN), quality attention module
*Corresponding authors: Hanli Wang and Sam Kwong
Journal year: 2022. Conference: Proceedings of the 30th ACM International Conference on Multimedia (MM '22), October 10–14, 2022, Lisbon, Portugal. CCS concepts: Computing methodologies — Image processing; Computer vision; Unpaired image enhancement.

1. Introduction

Recent years have witnessed accelerated growth in capturing devices, enabling ubiquitous image acquisition under various illuminance conditions. Images acquired under low-light conditions are inevitably degraded by various visual quality impairments, such as poor visibility, low contrast, and intensive noise. Low-light image enhancement aims to restore the latent normal-light image from the observed low-light one to simultaneously obtain desirable visibility, appropriate contrast, and suppressed noise (Wang et al., 2019; Yang et al., 2020). It greatly improves the quality of images to benefit human vision and can also assist high-level computer vision tasks, such as image classification (Loh and Chan, 2019), face recognition (Jiang et al., 2021), and object detection (Loh and Chan, 2019), etc. Pioneering low-light image enhancement methods stretch the dynamic range of low-light images, i.e., histogram equalization (HE) (Abdullah-Al-Wadud et al., 2007; Coltuc et al., 2006; Stark, 2000; Arici et al., 2009), or adaptively adjust the decomposed illumination and reflectance layers, i.e., Retinex-based approaches (Wang et al., 2013; Fu et al., 2016b; Jobson et al., 1997; Guo et al., 2016).

Refer to caption
(a) Input
Refer to caption
(b) SICE (Cai et al., 2018)
Refer to caption
(c) ZeroDCE (Guo et al., 2020)
Refer to caption
(d) CIGAN
Figure 1. Visual quality comparison of different methods on a real low-light image in LOL (Wei et al., 2018). SICE (Cai et al., 2018) and ZeroDCE (Guo et al., 2020) are the leading supervised and unsupervised methods, respectively. Our proposed CIGAN restores the normal-light image well, with appropriate illumination and contrast as well as suppressed noise.

Recently, learning-based approaches have achieved remarkable success (Cai et al., 2018; Chen et al., 2018a; Wang et al., 2019; Ni et al., 2020a). Most of these methods follow the paradigm of supervised learning and rely heavily on well-prepared paired normal/low-light images to train and evaluate models. However, the commonly used paired training datasets suffer from their respective limitations. First, data synthesized via a simplified simulated imaging pipeline (Lore et al., 2017) might fail to capture the intrinsic properties of real low-light images. Second, it is quite labor-intensive and time-consuming to create manually retouched data (Wang et al., 2019; Bychkovsky et al., 2011) with expert retouchers, and adopting such data for training also risks inheriting the retouchers' personal quality bias. Third, real captured data (Wei et al., 2018) might capture real degradation but fail to cover the diverse scenes and objects in the wild. Besides, the ground truths captured with a pre-defined setting, i.e., the exposure time and ISO, might not be optimal. Therefore, the reliance of supervised methods on paired data inevitably leads to a domain shift between the training data and real-world testing data, which further challenges generalization to real low-light images.

Recently, a series of unsupervised low-light enhancement methods have been proposed. These methods do not rely on paired training data and only require two unpaired collections of low/normal-light images. They are built upon a uni-directional generative adversarial network (GAN) (Jiang et al., 2021) or learnable curve adjustment (Guo et al., 2020). These methods achieve promising performance in illumination/contrast adjustment. However, due to the absence of supervision on detailed signals, the quality of results on challenging real low-light images with intensive noise is not satisfactory. On a closely related topic, image aesthetic quality enhancement benefits from CycleGANs (Zhu et al., 2017; Chen et al., 2018b; Ni et al., 2020b) to deliver state-of-the-art performance. We argue, however, that these CycleGANs do not handle the low-light image enhancement problem effectively. First, low-light degradation introduces information loss, which makes the enhancement problem ambiguous. In other words, the mapping between low/normal-light images is one-to-many. However, CycleGAN can only lead to a one-to-one discriminative mapping (Choi et al., 2018). Second, the intrinsic dimensions of the low/normal-light domains are imbalanced, as low-light images with intensive noise exhibit more complicated properties. The imbalance might disturb the training of CycleGANs, namely that the degradation generator fails to synthesize realistic noise and subsequently the enhancement generator cannot handle realistic degradation.

In this paper, we propose a novel Cycle-Interactive GAN (CIGAN) for unsupervised low-light image enhancement that simultaneously adjusts illumination, enhances contrast, and suppresses noise. A more comprehensive consideration of image degradation leads to more effective degradation and enhancement processes in cycle modeling. In other words, the more realistic and diverse the low-light images generated in image degradation, the better and more robust the results in image enhancement. To address the above-mentioned issues of CycleGANs, efforts have been made in three aspects. First, we make the degradation and enhancement generators in our CIGAN interact with each other. More specifically, we propose a novel low-light guided transformation to transfer the features of real low-light images from the enhancement generator to the degradation generator. With the information of different real low-light images as the reference during the whole training process, more diverse low-light images are synthesized, which is beneficial for modeling the one-to-many mapping between low/normal-light images. Second, to handle the domain imbalance issue, we incorporate a novel feature randomized perturbation into the degradation generator. The perturbation applies a learnable randomized affine transform to the intermediate features, which balances the intrinsic dimensions of the features in the two domains and is beneficial for synthesizing realistic noise. Last but not least, we design a series of advanced modules to improve the modeling capacity of our CIGAN, such as a dual attention module at the generator side, a multi-scale feature pyramid at the discriminator side, and a logarithmic image processing model as the fusion operation of the enhancement generator. Extensive experimental results show that our method is superior to existing unsupervised methods and even to state-of-the-art supervised methods on real low-light images. To summarize, the main contributions of our paper are three-fold:

  • We propose a novel CIGAN for unsupervised low-light image enhancement that simultaneously adjusts illumination, enhances contrast, and suppresses noise, which excels in image enhancement and significantly surpasses most previous works in image degradation modeling.

  • We propose a low-light guided transformation (LGT) that allows the degradation and enhancement generators to interact, which helps to generate low-light images with more diverse and realistic illumination and contrast.

  • We propose a learnable feature randomized perturbation (FRP) to produce diverse feature distributions, which endows the generated low-light images with more realistic noise and benefits the low-light image enhancement process.

Refer to caption
Figure 2. The proposed unsupervised CIGAN consists of complementary dGAN and eGAN. (i) The dGAN learns to synthesize realistic low-light images under the supervision of unpaired normal/low-light images. (ii) The eGAN restores normal-light images from synthesized low-light images under the paired supervision generated by dGAN. Our CIGAN differs from previous CycleGANs in: 1) Red dotted line: the generator of eGAN feed-forwards information of low-light features $\mathcal{E}^{i}(I_{l})$ to that of dGAN via LGT to help dGAN generate more diverse and realistic illumination and contrast; 2) Blue dotted line: random noise is injected into dGAN via Feature Randomized Perturbation (FRP) to learn to produce more diverse feature distributions for synthesizing more realistic noise. The better the synthesized low-light images, the better the eGAN learns to enhance low-light images towards better illumination, contrast, and suppressed noise.

2. Related Work

2.1. Traditional Image Enhancement

Histogram equalization (HE). HE focuses on fitting the illumination histogram to a specific distribution according to local or global statistical characteristics (Abdullah-Al-Wadud et al., 2007; Coltuc et al., 2006; Stark, 2000; Arici et al., 2009). For example, Arici et al. (Arici et al., 2009) cast HE as an optimization problem to improve image contrast while suppressing unnatural effects. Abdullah-Al-Wadud et al. (Abdullah-Al-Wadud et al., 2007) proposed a dynamic HE technique using a partition operation. Stark (Stark, 2000) presented adaptive contrast enhancement based on generalizations of HE. The main problem of HE is that it easily causes over-enhancement and noise amplification.

Retinex-based approaches. Retinex-based methods decompose low-light images into an illumination layer and a reflectance layer to adaptively perform joint illumination adjustment and noise suppression (Wang et al., 2013; Fu et al., 2016b; Jobson et al., 1997; Guo et al., 2016). Wang et al. (Wang et al., 2013) proposed a naturalness-preserving Retinex for non-uniform illumination image enhancement. Fu et al. (Fu et al., 2016b) introduced a weighted variational Retinex that simultaneously estimates the illumination and reflectance layers. These methods show satisfactory performance in illuminance adjustment; however, hand-crafted constraints can hardly decompose a low-light image accurately into illumination and reflectance layers, resulting in unnatural visual effects.

2.2. Learning-based Image Enhancement

Low-light image enhancement has achieved great success with the boom of deep learning (Lore et al., 2017; Wang et al., 2019; Jiang et al., 2021; Yang et al., 2020; Guo et al., 2020). Broadly speaking, learning-based image enhancement methods can be roughly divided into three categories according to training data: supervised (Cai et al., 2018; Lore et al., 2017; Ren et al., 2019; Wang et al., 2019), semi-supervised (Yang et al., 2020), and unsupervised (Jiang et al., 2021; Guo et al., 2020; Ni et al., 2020b; Xiong et al., 2020; Ni et al., 2020a). LLNet (Lore et al., 2017) was the first attempt to introduce deep learning into the problem of low-light image enhancement. For better performance, various supervised methods have been proposed with sophisticated network architectures and optimization objectives, such as MSR-net (Shen et al., 2017), DRD (Wei et al., 2018), SICE (Cai et al., 2018), DHN (Ren et al., 2019), and UPE (Wang et al., 2019). However, these supervised methods share a common restriction: they are highly dependent on paired data, which limits their performance on real testing data. Most recently, Yang et al. (Yang et al., 2020) proposed a semi-supervised low-light enhancement method. Jiang et al. (Jiang et al., 2021) proposed the first unsupervised model based on a GAN. Guo et al. (Guo et al., 2020) adopted no-reference optimization without paired or unpaired data. These unsupervised methods achieve promising performance in illumination adjustment; however, noise suppression has not been considered.

Refer to caption
Figure 3. The detailed structures of the proposed (a) LGT, (b) FRP, and (c) DAM. Conv and LReLU denote convolution and LeakyReLU operations, respectively. The FRP used in dGAN helps synthesize realistic noise. The DAM is used in both eGAN and dGAN to effectively model contextual information.

3. Method

As shown in Fig. 2, our proposed CIGAN aims to improve the perceptual quality of low-light images by simultaneously adjusting illumination, enhancing contrast, and suppressing noise under the supervision of unpaired data. It consists of a complementary degradation GAN (dGAN) and enhancement GAN (eGAN).

1) dGAN: It aims to synthesize a realistic low-light image $\tilde{I}_{l}\in\mathbb{L}$ (low-light image domain) from the input normal-light image $I_{n}\in\mathbb{N}$ (normal-light image domain) with the help of a reference low-light image $I_{l}\in\mathbb{L}$. We design two modules to synthesize more realistic low-light images with low-light illumination and contrast as well as intensive noise. As denoted by the red dotted line in Fig. 2, LGT (see Sec. 3.2-1) helps dGAN synthesize $\tilde{I}_{l}$ with the feature information of $I_{l}$, which preserves the content of $I_{n}$ while the introduced low-light attributes of $I_{l}$ give $\tilde{I}_{l}$ more realistic and diverse low-light illumination and contrast. As denoted by the blue dotted line in Fig. 2, FRP (see Sec. 3.2-2) learns to inject random noise into features to make the feature distributions more diverse and synthesize more realistic image noise. Furthermore, an exposure assessment loss $L_{\text{exp}}$ (see Sec. 3.3-1) is adopted to keep the local average illumination of synthesized low-light images close to a low value.

2) eGAN: Conversely, it focuses on learning to recover the latent normal-light image $\hat{I}_{n}\in\mathbb{N}$ from the synthesized low-light image $\tilde{I}_{l}$. To make the generator of eGAN yield high-quality normal-light images, we design a flexible logarithmic image processing (LIP) fusion model and a dual attention module (DAM) (see Sec. 3.2-3).

3.1. Model Architecture

1) Generator of dGAN. Given an input normal-light image $I_{n}$ and a reference low-light image $I_{l}$, we adopt the pre-trained VGG-19 network (Simonyan and Zisserman, 2014) $\mathcal{E}(\cdot)$ to extract their multi-scale feature representations $\mathcal{E}^{i}(I_{n})$ and $\mathcal{E}^{i}(I_{l})$, respectively. The LGT uses the features $\mathcal{E}^{i}(I_{l})$ extracted from $I_{l}$ to adaptively modulate the features $\mathcal{E}^{i}(I_{n})$ of $I_{n}$, which helps to synthesize more diverse illumination and contrast under the guidance of various unpaired reference low-light images. The DAM is designed to capture contextual information along the spatial and channel dimensions. The FRP learns to randomly perturb the features of the decoder $\mathcal{G}_{L}$ to help synthesize low-light images with realistic noise. Therefore, the synthesized low-light image $\tilde{I}_{l}$ can be expressed as:

(1) \tilde{I}_{l}=\mathcal{G}_{L}\big(\mathcal{E}^{i}(I_{n}),\mathcal{E}^{i}(I_{l}),\mathcal{T}_{i},\mathcal{A}_{i},\mathcal{P}_{i}\big),

where $\mathcal{T}_{i}$, $\mathcal{A}_{i}$, and $\mathcal{P}_{i}$ are the LGT, DAM, and FRP at the $i$-th scale, respectively. The multi-scale features $\mathcal{E}^{i}(\cdot)$ are taken from the relu$i$_1 layers (i.e., relu1_1, relu2_1, relu3_1, relu4_1, and relu5_1), and the parameters of $\mathcal{E}(\cdot)$ are fixed during the training phase.
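To make the frozen multi-scale encoder concrete, below is a minimal PyTorch sketch of a VGG-19 feature extractor returning the relu1_1 through relu5_1 activations; the specific torchvision layer indices and the use of ImageNet weights are our assumptions rather than details stated in the paper.

```python
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class MultiScaleVGG(nn.Module):
    """Frozen encoder E(.) that returns relu1_1, ..., relu5_1 features."""
    def __init__(self):
        super().__init__()
        self.features = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
        for p in self.features.parameters():
            p.requires_grad = False              # E(.) is fixed during training
        self.stops = [1, 6, 11, 20, 29]          # assumed indices of relu{1..5}_1

    def forward(self, x):
        feats = []
        for idx, layer in enumerate(self.features):
            x = layer(x)
            if idx in self.stops:
                feats.append(x)                  # E^i(x) at the i-th scale
            if idx == self.stops[-1]:
                break
        return feats
```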

2) Generator of eGAN. The generator of eGAN is dedicated to recovering the normal-light image $\hat{I}_{n}$ from the synthesized low-light image $\tilde{I}_{l}$:

(2) \hat{I}_{n}=\mathcal{F}\big(\mathcal{G}_{N}(\mathcal{E}^{i}(\tilde{I}_{l}),\mathcal{A}_{i}),\tilde{I}_{l}\big),

where $\mathcal{G}_{N}$ is the decoder of the generator of eGAN.

Different from most previous methods, which subtract the output of the network from the input low-light image to obtain the final enhanced image, we propose a flexible LIP model $\mathcal{F}(\cdot)$ to fuse the input $\tilde{I}_{l}$ and the output $\overline{I}_{n}$ into one image that combines information from the two sources, formulated as follows,

(3) \hat{I}_{n}=\frac{\tilde{I}_{l}+\overline{I}_{n}}{\lambda+\tilde{I}_{l}\,\overline{I}_{n}},

where $\lambda$ is a scalar controlling the enhancement process, which is set to 1 in our work. The proposed LIP model effectively improves the stability and performance of model training (see Sec. 4.4).
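As a quick illustration of Eq. (3), a minimal sketch of the LIP-style fusion is given below; the tensor names and the assumption that the inputs lie in [0, 1] are ours.

```python
import torch

def lip_fuse(i_tilde_l: torch.Tensor, i_bar_n: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    """LIP fusion of Eq. (3): I_hat = (I_tilde_l + I_bar_n) / (lam + I_tilde_l * I_bar_n).
    With lam = 1 and both inputs in [0, 1], the fused output also stays in [0, 1]."""
    return (i_tilde_l + i_bar_n) / (lam + i_tilde_l * i_bar_n)
```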

3) Multi-scale Feature Pyramid Discriminator. A critical issue associated with GAN is to design a discriminator that can distinguish real/fake images based on local details and global consistency. Our solution is to design a discriminator network that can simultaneously focus on low-level texture and high-level semantic information. Therefore, we propose a multi-scale feature pyramid discriminator (MFPD) as shown in Fig. 4. The intermediate layer of the discriminator has a smaller receptive field to make the generator pay more attention to texture and local details, while the last layer has a larger receptive field to encourage the generator to ensure global consistency (Ni et al., 2020a). In short, the proposed MFPD uses multi-scale intermediate features and a pyramid scheme to guide the generators to generate images with finer local details and appreciable global consistency.

3.2. Module Design

1) Low-light Guided Transformation. We propose a novel low-light guided transformation (LGT) module to transfer the low illumination and contrast attributes of low-light images from the enhancement generator to the degradation generator, which adaptively modulates the features of normal-light images to generate low-light images with more diverse and realistic illumination and contrast. As shown in Fig. 3 (a), our LGT has two inputs at the $i$-th scale: the intermediate features $\mathcal{E}^{i}(I_{n})\in\mathbb{R}^{b\times c\times h\times w}$ of the normal-light image and the intermediate features $\mathcal{E}^{i}(I_{l})\in\mathbb{R}^{b\times c\times h\times w}$ of the reference low-light image, where $b$ is the batch size, $c$ is the number of feature channels, and $h$ and $w$ are the height and width of the features, respectively. The transformation parameters $w\big(\mathcal{E}^{i}(I_{l})\big)$ and $b\big(\mathcal{E}^{i}(I_{l})\big)$ are learned from the reference features $\mathcal{E}^{i}(I_{l})$ by two convolution layers, where the first convolution is shared. The modulated intermediate features $\tilde{\mathcal{E}}^{i}(I_{n})$ are produced via an affine transformation as follows:

(4) \tilde{\mathcal{E}}^{i}(I_{n})=\mathcal{E}^{i}(I_{n})\odot w\big(\mathcal{E}^{i}(I_{l})\big)+b\big(\mathcal{E}^{i}(I_{l})\big),

where $\odot$ and $+$ denote the Hadamard (element-wise) product and element-wise addition, respectively.

Compared with AdaIN (Huang and Belongie, 2017), which uses channel-wise statistics to perform denormalization, our LGT operates at the element level and provides a flexible way to spatially modulate the normal-light image features $\mathcal{E}^{i}(I_{n})$. In this way, the proposed LGT incorporates the low illumination and contrast attributes of the reference low-light image into the synthesized low-light image through the element-wise affine parameters $w\big(\mathcal{E}^{i}(I_{l})\big)$ and $b\big(\mathcal{E}^{i}(I_{l})\big)$.
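A minimal PyTorch sketch of the LGT module described above follows; the kernel sizes, channel widths, and activation choice loosely follow Fig. 3 (a) and are assumptions rather than the authors' exact configuration.

```python
import torch.nn as nn

class LGT(nn.Module):
    """Low-light Guided Transformation (Eq. (4)): element-wise affine modulation of
    normal-light features by parameters predicted from the reference low-light
    features, with a shared first convolution."""
    def __init__(self, channels: int):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True))
        self.to_w = nn.Conv2d(channels, channels, 3, padding=1)  # scale w(E^i(I_l))
        self.to_b = nn.Conv2d(channels, channels, 3, padding=1)  # shift b(E^i(I_l))

    def forward(self, feat_normal, feat_low):
        h = self.shared(feat_low)
        return feat_normal * self.to_w(h) + self.to_b(h)         # Eq. (4)
```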

Refer to caption
Figure 4. The detailed structure of the proposed MFPD.

2) Feature Randomized Perturbation. The LGT effectively helps to synthesize low-light images with diverse low illumination and contrast but only light noise, and already achieves relatively promising performance (see Sec. 4.4). However, this light noise cannot provide enough information for eGAN to learn to suppress the intensive noise of real low-light images, so we propose the FRP to make the generator of dGAN synthesize more realistic noise. As shown in Fig. 3 (b), the scaling and shifting parameters $\alpha\in\mathbb{R}^{b\times c\times 1\times 1}$ and $\beta\in\mathbb{R}^{b\times 1\times h\times w}$ are sampled from standard Gaussian distributions and then fused as:

(5) \tilde{x}=(1+\theta_{1}\cdot\alpha)\,x+\theta_{2}\cdot\beta,

where $\{\theta_{1},\theta_{2}\}\in\mathbb{R}^{1\times c\times 1\times 1}$ are two learnable weights, which are learned together with all other parameters of the network by back-propagation. As shown in Fig. 2, we embed the proposed FRP module into the generator of dGAN at multiple scales to make the noise of the synthesized low-light images close to that of real low-light images.
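The following is a sketch of the FRP module of Eq. (5) in PyTorch; initializing θ1 and θ2 to zero, so that the perturbation starts as an identity mapping, is our assumption.

```python
import torch
import torch.nn as nn

class FRP(nn.Module):
    """Feature Randomized Perturbation: x_tilde = (1 + theta1 * alpha) * x + theta2 * beta,
    with channel-wise alpha ~ N(0, I) and pixel-wise beta ~ N(0, I)."""
    def __init__(self, channels: int):
        super().__init__()
        self.theta1 = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.theta2 = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        b, c, h, w = x.shape
        alpha = torch.randn(b, c, 1, 1, device=x.device)   # random channel-wise scaling
        beta = torch.randn(b, 1, h, w, device=x.device)    # random spatial shifting
        return (1 + self.theta1 * alpha) * x + self.theta2 * beta
```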

3) Attention Module. The DAM is proposed for contextual information modeling and feature recalibration. As shown in Fig. 3 (c), DAM consists of a spatial attention (SA) branch and a channel attention (CA) branch. Inspired by SENet (Hu et al., 2018), both SA and CA perform squeeze and excitation operations in sequence. Specifically, we use global average pooling and global max pooling along the channel dimension to compress the feature maps in the SA branch, and adopt the mean and standard deviation along the spatial dimensions to squeeze the feature maps in the CA branch.
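The PyTorch sketch below illustrates one plausible reading of the DAM: the CA branch squeezes with spatial mean/std and the SA branch squeezes with channel-wise average/max pooling, each followed by a small excitation; the reduction ratio, kernel sizes, and the CA-then-SA ordering are assumptions on our part.

```python
import torch
import torch.nn as nn

class DAM(nn.Module):
    """Dual Attention Module: channel attention followed by spatial attention."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.ca = nn.Sequential(                      # channel excitation
            nn.Conv2d(2 * channels, channels // reduction, 1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid())
        self.sa = nn.Sequential(                      # spatial excitation
            nn.Conv2d(2, 1, 7, padding=3),
            nn.Sigmoid())

    def forward(self, x):
        # CA: squeeze with mean and standard deviation over the spatial dimensions
        mean = x.mean(dim=(2, 3), keepdim=True)
        std = x.std(dim=(2, 3), keepdim=True)
        x = x * self.ca(torch.cat([mean, std], dim=1))
        # SA: squeeze with average and max pooling along the channel dimension
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        return x * self.sa(torch.cat([avg, mx], dim=1))
```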

3.3. Training Objectives

1) Exposure Assessment Loss. We propose the exposure assessment loss to control the exposure consistency between the synthesized low-light images and the real ones. Our insight is to keep the average intensity of local regions of the synthesized low-light images close to a low value. Inspired by (Mertens et al., 2009), we formulate $L_{\text{exp}}$ as:

(6) L_{\text{exp}}=1-\exp\Big(\frac{-(i-e)^{2}}{2\sigma^{2}}\Big),

where $i$ is the average intensity of a local region, $e$ is the desired intensity, which should be close to a low value, and $\sigma$ controls the smoothness of the Gaussian curve. In our work, $\sigma$, $e$, and the local region size are set to 0.1, 0.1, and $7\times 7$, respectively.
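A minimal sketch of the exposure assessment loss of Eq. (6) follows; computing the local average intensity with non-overlapping 7×7 average pooling on the grayscale image and averaging the per-region loss are our assumptions.

```python
import torch
import torch.nn.functional as F

def exposure_loss(img, e: float = 0.1, sigma: float = 0.1, patch: int = 7):
    """L_exp = 1 - exp(-(i - e)^2 / (2 * sigma^2)), averaged over local regions."""
    gray = img.mean(dim=1, keepdim=True)        # average the color channels
    i = F.avg_pool2d(gray, patch)               # local average intensity per 7x7 region
    return (1 - torch.exp(-(i - e) ** 2 / (2 * sigma ** 2))).mean()
```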

2) Adversarial Loss. We adopt the relativistic average hinge GAN (RaHingeGAN) loss (Jolicoeur-Martineau, 2018; Ni et al., 2020a) to guide dGAN to synthesize realistic low-light images. The RaHingeGAN loss of dGAN can be formulated as:

(7) L^{L}_{\text{G}}=\mathbb{E}_{\tilde{I}_{l}\sim\tilde{\mathbb{L}}}\Big[\max\Big(0,1-\big(D_{L}(\tilde{I}_{l})-\mathbb{E}_{I_{l}\sim\mathbb{L}}D_{L}(I_{l})\big)\Big)\Big]+\mathbb{E}_{I_{l}\sim\mathbb{L}}\Big[\max\Big(0,1+\big(D_{L}(I_{l})-\mathbb{E}_{\tilde{I}_{l}\sim\tilde{\mathbb{L}}}D_{L}(\tilde{I}_{l})\big)\Big)\Big],
L^{L}_{\text{D}}=\mathbb{E}_{\tilde{I}_{l}\sim\tilde{\mathbb{L}}}\Big[\max\Big(0,1+\big(D_{L}(\tilde{I}_{l})-\mathbb{E}_{I_{l}\sim\mathbb{L}}D_{L}(I_{l})\big)\Big)\Big]+\mathbb{E}_{I_{l}\sim\mathbb{L}}\Big[\max\Big(0,1-\big(D_{L}(I_{l})-\mathbb{E}_{\tilde{I}_{l}\sim\tilde{\mathbb{L}}}D_{L}(\tilde{I}_{l})\big)\Big)\Big],

where $I_{l}$ is a real low-light image from the domain of low-light images $\mathbb{L}$, and $\tilde{I}_{l}$ is a synthesized sample from the domain of synthesized low-light images $\tilde{\mathbb{L}}$. Similarly, the RaHingeGAN loss of eGAN is expressed as:

(8) L^{N}_{\text{G}}=\mathbb{E}_{\tilde{I}_{n}\sim\tilde{\mathbb{N}}}\Big[\max\Big(0,1-\big(D_{N}(\tilde{I}_{n})-\mathbb{E}_{I_{n}\sim\mathbb{N}}D_{N}(I_{n})\big)\Big)\Big]+\mathbb{E}_{I_{n}\sim\mathbb{N}}\Big[\max\Big(0,1+\big(D_{N}(I_{n})-\mathbb{E}_{\tilde{I}_{n}\sim\tilde{\mathbb{N}}}D_{N}(\tilde{I}_{n})\big)\Big)\Big],
L^{N}_{\text{D}}=\mathbb{E}_{\tilde{I}_{n}\sim\tilde{\mathbb{N}}}\Big[\max\Big(0,1+\big(D_{N}(\tilde{I}_{n})-\mathbb{E}_{I_{n}\sim\mathbb{N}}D_{N}(I_{n})\big)\Big)\Big]+\mathbb{E}_{I_{n}\sim\mathbb{N}}\Big[\max\Big(0,1-\big(D_{N}(I_{n})-\mathbb{E}_{\tilde{I}_{n}\sim\tilde{\mathbb{N}}}D_{N}(\tilde{I}_{n})\big)\Big)\Big],

where $\mathbb{N}$ and $\tilde{\mathbb{N}}$ are the real normal-light image domain and the synthesized normal-light image domain, respectively.
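The RaHingeGAN objectives of Eqs. (7)-(8) can be written compactly as in the sketch below; the helper names are ours, and the discriminator outputs are assumed to be unbounded logits. When the discriminator loss is computed, the synthesized images would typically be detached from the generator graph.

```python
import torch.nn.functional as F

def rahinge_g_loss(d_fake, d_real):
    """Generator term of Eqs. (7)/(8): push fake logits above the mean real logit."""
    return (F.relu(1 - (d_fake - d_real.mean())).mean()
            + F.relu(1 + (d_real - d_fake.mean())).mean())

def rahinge_d_loss(d_fake, d_real):
    """Discriminator term: push real logits above the mean fake logit."""
    return (F.relu(1 + (d_fake - d_real.mean())).mean()
            + F.relu(1 - (d_real - d_fake.mean())).mean())
```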

Table 1. Quantitative comparisons of different methods on real low-light test images in LOL-Real dataset (Wei et al., 2018). EG denotes EnlightenGAN.
Metric BIME BPDHE CRM DHECE Dong EFF CLAHE LIME MF CycleGAN QAGAN
(Ying et al., 2017a) (Ibrahim and Kong, 2007) (Ying et al., 2017c) (Nakai et al., 2013) (Dong et al., 2011) (Ying et al., 2017b) (Zuiderveld, 1994) (Guo et al., 2016) (Fu et al., 2016a) (Zhang et al., 2020) (Ni et al., 2020b)
PSNR 17.85 13.84 19.64 14.64 17.26 17.85 13.13 15.24 18.73 18.80 18.97
PSNR-GC 24.72 19.55 24.92 16.31 20.57 24.72 16.60 17.19 20.98 23.48 24.43
SSIM 0.6526 0.4254 0.6623 0.4450 0.5270 0.6526 0.3709 0.4702 0.5590 0.6316 0.6081
SSIM-GC 0.7231 0.5936 0.6968 0.4521 0.5715 0.7231 0.3947 0.4905 0.5765 0.6648 0.6513
Metric MR JED RRM SRIE DRD UPE SICE UEGAN EG ZeroDCE CIGAN
(Jobson et al., 1997) (Ren et al., 2018) (Li et al., 2018) (Fu et al., 2016b) (Wei et al., 2018) (Wang et al., 2019)   (Cai et al., 2018) (Ni et al., 2020a) (Jiang et al., 2021) (Guo et al., 2020)
PSNR 11.67 17.33 17.34 14.45 15.48 13.27 19.40 19.60 18.23 18.07 19.89
PSNR-GC 18.47 22.87 23.18 23.91 23.87 24.57 23.63 23.65 21.99 23.64 26.92
SSIM 0.4269 0.6654 0.6859 0.5421 0.5672 0.4521 0.6906 0.6575 0.6165 0.6030 0.7817
SSIM-GC 0.5158 0.7236 0.7459 0.7075 0.7476 0.7051 0.7250 0.6727 0.6452 0.6739 0.8189

3) Cycle-Consistency Loss. It consists of two terms: (1) $L_{\text{con}}$ computes the $L_{1}$ distance between the input images $I_{n}$ / $I_{l}$ and the cycled images $\hat{I}_{n}$ / $\hat{I}_{l}$; (2) $L_{\text{per}}$ is formulated as the $L_{2}$ norm between the feature maps of the input images and those of the cycled images, as follows:

(9) L_{\text{con}}=\|I_{n}-\hat{I}_{n}\|_{1}+\|I_{l}-\hat{I}_{l}\|_{1},\quad L_{\text{per}}=\|\phi_{j}(I_{n})-\phi_{j}(\hat{I}_{n})\|_{2}+\|\phi_{j}(I_{l})-\phi_{j}(\hat{I}_{l})\|_{2},

where $\phi_{j}(\cdot)$ is the feature map of the $j$-th layer of the VGG-19 network (Simonyan and Zisserman, 2014); relu4_1 is used in our work.
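A sketch of the two cycle-consistency terms of Eq. (9) follows; here `vgg_relu4_1` stands for the frozen VGG-19 feature extractor at relu4_1 (i.e., φ_j), and whether the L2 term is summed or averaged over feature elements is our assumption.

```python
import torch
import torch.nn.functional as F

def cycle_losses(i_n, i_l, i_n_cyc, i_l_cyc, vgg_relu4_1):
    """L_con: pixel-wise L1 between inputs and cycled images.
    L_per: L2 norm between their relu4_1 VGG-19 feature maps."""
    l_con = F.l1_loss(i_n_cyc, i_n) + F.l1_loss(i_l_cyc, i_l)
    l_per = (torch.norm(vgg_relu4_1(i_n) - vgg_relu4_1(i_n_cyc))
             + torch.norm(vgg_relu4_1(i_l) - vgg_relu4_1(i_l_cyc)))
    return l_con, l_per
```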

4) Total Loss. The proposed CIGAN is optimized with the following objectives,

(10) L_{\text{G}}=L^{L}_{\text{G}}+L^{N}_{\text{G}}+\lambda_{\text{exp}}L_{\text{exp}}+\lambda_{\text{con}}L_{\text{con}}+\lambda_{\text{per}}L_{\text{per}},
(11) L_{\text{D}}=L^{L}_{\text{D}}+L^{N}_{\text{D}},

where $\lambda_{\text{exp}}$, $\lambda_{\text{con}}$, and $\lambda_{\text{per}}$ are positive constants controlling the relative importance of $L_{\text{exp}}$, $L_{\text{con}}$, and $L_{\text{per}}$, respectively.
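Putting Eqs. (10)-(11) together with the weights reported in Sec. 4 (λ_exp = λ_con = 10, λ_per = 1), the total objectives can be assembled as in the sketch below.

```python
def total_losses(l_g_low, l_g_normal, l_d_low, l_d_normal,
                 l_exp, l_con, l_per,
                 lam_exp: float = 10.0, lam_con: float = 10.0, lam_per: float = 1.0):
    """Generator objective of Eq. (10) and discriminator objective of Eq. (11)."""
    l_g = l_g_low + l_g_normal + lam_exp * l_exp + lam_con * l_con + lam_per * l_per
    l_d = l_d_low + l_d_normal
    return l_g, l_d
```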

4. Experiments

In this section, the performance of the proposed method is validated through quantitative and qualitative comparisons as well as a user study.

Dataset. We follow (Yang et al., 2020) to comprehensively evaluate our proposed method on the LOL dataset (Wei et al., 2018), which covers diverse scenes with large variability. It consists of 689 training image pairs and 100 test image pairs, all captured in real-world scenarios. To meet the requirement of unpaired learning, the training set is divided into two disjoint partitions: 344 low-light images and 345 normal-light images. Furthermore, we collect more normal/low-light images from publicly accessible datasets to expand the training set to 1000 unpaired normal/low-light images.

Baselines. To carry out an overall comparison and evaluation, the proposed CIGAN is compared with twenty-one classical and state-of-the-art methods, including BIMEF (Ying et al., 2017a), BPDHE (Ibrahim and Kong, 2007), CRM (Ying et al., 2017c), DHECE (Nakai et al., 2013), Dong (Dong et al., 2011), EFF (Ying et al., 2017b), CLAHE (Zuiderveld, 1994), LIME (Guo et al., 2016), MF (Fu et al., 2016a), MR (Jobson et al., 1997), JED (Ren et al., 2018), RRM (Li et al., 2018), SRIE (Fu et al., 2016b), DRD (Wei et al., 2018), UPE (Wang et al., 2019), SICE (Cai et al., 2018), CycleGAN (Zhu et al., 2017), EnlightenGAN (Jiang et al., 2021), QAGAN (Ni et al., 2020b), UEGAN (Ni et al., 2020a), and ZeroDCE (Guo et al., 2020), where SICE and UPE are the leading supervised methods, and EnlightenGAN and ZeroDCE are the leading unsupervised methods for low-light image enhancement.

Evaluation Metrics. We follow (Cai et al., 2018; Wang et al., 2019; Yang et al., 2020) and adopt the most widely used full-reference image quality assessment (FR-IQA) metrics: PSNR and SSIM (Wang et al., 2004). We also calculate the PSNR and SSIM of the Gamma-corrected results (i.e., PSNR-GC and SSIM-GC). PSNR and SSIM quantitatively compare our proposed method with other methods at the pixel level and structure level, respectively. The higher the values of PSNR, SSIM, PSNR-GC, and SSIM-GC, the better the quality of the enhanced images.

Implementation Details. The network is trained for 100 epochs with a batch size of 10, and images are cropped into $224\times 224$ patches. The Adam (Kingma and Ba, 2014) optimizer with $\beta_{1}$ = 0 and $\beta_{2}$ = 0.999 is applied to optimize the network. The learning rates of the generator and the discriminator are initialized to 0.0001, kept fixed for the first 50 epochs, and then linearly decayed to zero over the next 50 epochs. The hyper-parameters $\lambda_{\text{exp}}$, $\lambda_{\text{con}}$, and $\lambda_{\text{per}}$ are set to 10, 10, and 1, respectively. Spectral normalization (Miyato et al., 2018) is applied to all layers in both the generator and the discriminator.
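A minimal sketch of the training schedule described above (Adam with β1 = 0, β2 = 0.999, learning rate 1e-4 kept fixed for 50 epochs and then linearly decayed to zero) is given below; stepping the scheduler once per epoch is our assumption.

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

def make_optimizer(params, lr: float = 1e-4, total_epochs: int = 100, decay_start: int = 50):
    """Adam optimizer with a linear learning-rate decay after `decay_start` epochs."""
    opt = torch.optim.Adam(params, lr=lr, betas=(0.0, 0.999))

    def lr_lambda(epoch):
        if epoch < decay_start:
            return 1.0                                            # fixed for the first 50 epochs
        return max(0.0, 1.0 - (epoch - decay_start) / float(total_epochs - decay_start))

    return opt, LambdaLR(opt, lr_lambda=lr_lambda)
```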

Refer to caption
(a) Input
Refer to caption
(b) BIMEF (Ying et al., 2017a)
Refer to caption
(c) DHECE (Nakai et al., 2013)
Refer to caption
(d) LIME (Guo et al., 2016)
Refer to caption
(e) MF (Fu et al., 2016a)
Refer to caption
(f) JED (Ren et al., 2018)
Refer to caption
(g) RRM (Li et al., 2018)
Refer to caption
(h) UPE (Wang et al., 2019)
Refer to caption
(i) EnlightenGAN (Jiang et al., 2021)
Refer to caption
(j) ZeroDCE (Guo et al., 2020)
Refer to caption
(k) CIGAN
Refer to caption
(l) GT
Figure 5. Visual quality comparisons of state-of-the-art enhancement methods. Upper left: original results. Lower right: the corresponding results after Gamma transformation correction for better comparison.

4.1. Quantitative Comparison

Table 1 compares the proposed CIGAN with classical and state-of-the-art methods on the LOL dataset (Wei et al., 2018). It can be observed that the proposed method outperforms all previous methods in the comparison, consistently achieving the highest scores in terms of PSNR, PSNR-GC, SSIM, and SSIM-GC. This reveals that the proposed CIGAN is much more effective in illumination enhancement, structure restoration, and noise suppression. From Table 1, we can see that the proposed method is significantly superior to other state-of-the-art unsupervised methods (i.e., CycleGAN, EnlightenGAN, and ZeroDCE). This is because, on one hand, dGAN makes the attributes of synthesized low-light images consistent with those of real ones, and on the other hand, eGAN is able to restore high-quality normal-light images. Another interesting observation is that the proposed CIGAN even achieves better performance than leading supervised methods (i.e., DRD, UPE, and SICE) trained on a large number of paired images. It is worth noting that the larger gap between PSNR with and without Gamma correction shows that our method can effectively remove intensive noise and restore vivid details.

4.2. Qualitative Comparison

Extensive qualitative comparisons are shown in Figs. 5 and 6. From the enhanced results, we have several insights. First, most of the existing methods (i.e., BIMEF, JED, RRM, UPE, and ZeroDCE) show poor performance in terms of illumination adjustment and detail restoration. DHECE, LIME, MF, and EnlightenGAN are able to achieve desirable contrast adjustment; however, they also amplify noise and severely degrade visual quality. Second, although several methods especially consider noise suppression (i.e., JED and RRM), they perform unsatisfactorily in global contrast enhancement and remove many textures and details. In general, our proposed CIGAN achieves favorable visual quality by simultaneously realizing pleasing contrast enhancement and effective noise suppression.

Table 2. The results of pairwise comparisons in user study. Each value indicates the number of times the method in the row outperforms the method in the column.
DHECE LIME UPE SICE EG ZeroDCE CIGAN Total
(Nakai et al., 2013) (Guo et al., 2016) (Wang et al., 2019) (Cai et al., 2018) (Jiang et al., 2021) (Guo et al., 2020)
DHECE - 326 421 511 267 248 16 1789
LIME 394 - 452 563 295 269 24 1997
UPE 299 268 - 391 227 203 19 1407
SICE 209 157 329 - 171 143 12 1021
EG 453 425 493 549 - 335 125 2380
ZeroDCE 472 451 517 577 385 - 141 2543
CIGAN 704 696 701 708 595 579 - 3983
Refer to caption
(a) Input
Refer to caption
(b) BIMEF (Ying et al., 2017a)
Refer to caption
(c) DHECE (Nakai et al., 2013)
Refer to caption
(d) LIME (Guo et al., 2016)
Refer to caption
(e) MF (Fu et al., 2016a)
Refer to caption
(f) JED (Ren et al., 2018)
Refer to caption
(g) RRM (Li et al., 2018)
Refer to caption
(h) UPE (Wang et al., 2019)
Refer to caption
(i) EnlightenGAN (Jiang et al., 2021)
Refer to caption
(j) ZeroDCE (Guo et al., 2020)
Refer to caption
(k) CIGAN
Refer to caption
(l) GT
Figure 6. Visual quality comparison of state-of-the-art enhancement methods on a close-up region.
Refer to caption
(a) Input
Refer to caption
(b) w/o FRP
Refer to caption
(c) w/o $L_{\text{exp}}$
Refer to caption
(d) w/o LGT
Refer to caption
(e) w/o all
Refer to caption
(f) CIGAN
Refer to caption
(g) GT
Figure 7. Ablation study of the effectiveness of three key components (i.e., FRP, LGT, and $L_{\text{exp}}$) in our proposed CIGAN. The variant (e) w/o all denotes CIGAN without FRP, LGT, and $L_{\text{exp}}$, which is very similar to a vanilla CycleGAN.

4.3. User Study

To study how users perceive the enhanced results of each method, we perform a user study with 24 participants and 30 images of seven methods using pairwise comparisons. Each time, participants are presented with the enhanced results of two randomly selected methods for the same test image and are asked to select the result they prefer. Table 2 tabulates the results of the pairwise comparisons, from which we can observe that the enhanced results of the proposed CIGAN are preferred by users, as CIGAN is selected more frequently than the comparison methods. This is consistent with the quantitative and qualitative results and further consolidates the conclusion that the proposed CIGAN is superior to the state-of-the-art methods.

Table 3. Comparison of average PSNR and SSIM performance of different variants of our method on LOL Dataset (Wei et al., 2018).
Method PSNR PSNR-GC SSIM SSIM-GC
CIGAN w/o LGT 18.85 25.21 0.7488 0.7882
CIGAN w/o FRP 18.24 23.71 0.7178 0.7573
CIGAN w/o DAM 19.25 26.28 0.7644 0.8060
CIGAN w/o MFPD 19.55 26.36 0.7711 0.8083
CIGAN w/o LIP 19.57 24.71 0.7641 0.7969
CIGAN w/o $L_{\text{exp}}$ 19.75 24.86 0.7266 0.7739
CIGAN 19.89 26.92 0.7817 0.8189

4.4. Ablation Study

We conduct extensive ablation studies to quantitatively evaluate the effectiveness of each component in our proposed CIGAN. The variant CIGAN w/o LIP replaces the proposed LIP-based fusion by subtracting the network output from the input low-light image. We perform an ablation analysis on a real low-light image in Fig. 7. It can be observed that the result produced by CIGAN is obviously better than those of its variants. Table 3 lists the performance of different variants of our proposed CIGAN on the 100 testing images of the LOL dataset in terms of average PSNR, PSNR-GC, SSIM, and SSIM-GC. From Table 3, we want to emphasize three key components. First, it is critical for LGT to adaptively modulate the normal-light image features with low-light image features. Without it, generator $\mathcal{G}_{L}$ fails to learn domain-specific properties directly from low-light images, which results in a significant performance degradation. Second, removing the FRP, which encourages the synthesis of low-light images with realistic noise, also leads to a striking performance gap. Last, the proposed exposure assessment loss $L_{\text{exp}}$ plays a key role in synthesizing realistic low-light images whose contrast is consistent with that of real low-light images. All the proposed components lead to better performance, and combining them enables CIGAN to further improve the quantitative performance towards the best.

5. Conclusions

This paper aims to improve the perceptual quality of real low-light images in an unsupervised manner using unpaired data only. To this end, we propose a novel unsupervised CIGAN, which contains three elaborately designed components: (1) the LGT module adaptively modulates normal-light image features with low-light image features to synthesize more diverse low-light images; (2) the FRP module encourages the synthesis of low-light images with realistic noise; (3) the MFPD improves image quality from coarse to fine. Finally, a novel exposure assessment loss is formulated to control the exposure of synthesized low-light images, and attention mechanisms are adopted to further improve image quality. Extensive experiments on real-world low-light images show that our method achieves superior performance in both quantitative and qualitative evaluations.

Acknowledgements.
The authors would like to thank the anonymous referees for their insightful comments and suggestions. This work was supported in part by the National Natural Science Foundation of China under Grant 61976159 and the Shanghai Innovation Action Project of Science and Technology under Grant 20511100700.

References

  • Abdullah-Al-Wadud et al. (2007) Mohammad Abdullah-Al-Wadud, Md Hasanul Kabir, M Ali Akber Dewan, and Oksam Chae. 2007. A dynamic histogram equalization for image contrast enhancement. IEEE Transactions on Consumer Electronics 53, 2 (2007), 593–600.
  • Arici et al. (2009) Tarik Arici, Salih Dikbas, and Yucel Altunbasak. 2009. A histogram modification framework and its application for image contrast enhancement. IEEE Transactions on image processing 18, 9 (2009), 1921–1935.
  • Bychkovsky et al. (2011) Vladimir Bychkovsky, Sylvain Paris, Eric Chan, and Frédo Durand. 2011. Learning photographic global tonal adjustment with a database of input/output image pairs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 97–104.
  • Cai et al. (2018) Jianrui Cai, Shuhang Gu, and Lei Zhang. 2018. Learning a deep single image contrast enhancer from multi-exposure images. IEEE Transactions on Image Processing 27, 4 (2018), 2049–2062.
  • Chen et al. (2018a) Chen Chen, Qifeng Chen, Jia Xu, and Vladlen Koltun. 2018a. Learning to see in the dark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3291–3300.
  • Chen et al. (2018b) Yu-Sheng Chen, Yu-Ching Wang, Man-Hsin Kao, and Yung-Yu Chuang. 2018b. Deep photo enhancer: Unpaired learning for image enhancement from photographs with gans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6306–6314.
  • Choi et al. (2018) Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. 2018. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE conference on computer vision and pattern recognition. 8789–8797.
  • Coltuc et al. (2006) Dinu Coltuc, Philippe Bolon, and J-M Chassery. 2006. Exact histogram specification. IEEE Transactions on Image Processing 15, 5 (2006), 1143–1152.
  • Dong et al. (2011) Xuan Dong, Guan Wang, Yi Pang, Weixin Li, Jiangtao Wen, Wei Meng, and Yao Lu. 2011. Fast efficient algorithm for enhancement of low lighting video. In Proceedings of the IEEE International Conference on Multimedia and Expo. 1–6.
  • Fu et al. (2016a) Xueyang Fu, Delu Zeng, Yue Huang, Yinghao Liao, Xinghao Ding, and John Paisley. 2016a. A fusion-based enhancing method for weakly illuminated images. Signal Processing 129 (2016), 82–96.
  • Fu et al. (2016b) Xueyang Fu, Delu Zeng, Yue Huang, Xiao-Ping Zhang, and Xinghao Ding. 2016b. A weighted variational model for simultaneous reflectance and illumination estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2782–2790.
  • Guo et al. (2020) Chunle Guo, Chongyi Li, Jichang Guo, Chen Change Loy, Junhui Hou, Sam Kwong, and Runmin Cong. 2020. Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1780–1789.
  • Guo et al. (2016) Xiaojie Guo, Yu Li, and Haibin Ling. 2016. LIME: Low-light image enhancement via illumination map estimation. IEEE Transactions on image processing 26, 2 (2016), 982–993.
  • Hu et al. (2018) Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7132–7141.
  • Huang and Belongie (2017) Xun Huang and Serge Belongie. 2017. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision. 1501–1510.
  • Ibrahim and Kong (2007) Haidi Ibrahim and Nicholas Sia Pik Kong. 2007. Brightness preserving dynamic histogram equalization for image contrast enhancement. IEEE Transactions on Consumer Electronics 53, 4 (2007), 1752–1758.
  • Jiang et al. (2021) Yifan Jiang, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou, and Zhangyang Wang. 2021. EnlightenGAN: Deep Light Enhancement Without Paired Supervision. IEEE transactions on image processing 30 (2021), 2340–2349.
  • Jobson et al. (1997) Daniel J Jobson, Zia-ur Rahman, and Glenn A Woodell. 1997. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Transactions on Image processing 6, 7 (1997), 965–976.
  • Jolicoeur-Martineau (2018) Alexia Jolicoeur-Martineau. 2018. The relativistic discriminator: a key element missing from standard GAN. arXiv preprint arXiv:1807.00734 (2018).
  • Kingma and Ba (2014) Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • Li et al. (2018) Mading Li, Jiaying Liu, Wenhan Yang, Xiaoyan Sun, and Zongming Guo. 2018. Structure-revealing low-light image enhancement via robust retinex model. IEEE Transactions on Image Processing 27, 6 (2018), 2828–2841.
  • Loh and Chan (2019) Yuen Peng Loh and Chee Seng Chan. 2019. Getting to know low-light images with the exclusively dark dataset. Computer Vision and Image Understanding 178 (2019), 30–42.
  • Lore et al. (2017) Kin Gwn Lore, Adedotun Akintayo, and Soumik Sarkar. 2017. LLNet: A deep autoencoder approach to natural low-light image enhancement. Pattern Recognition 61 (2017), 650–662.
  • Mertens et al. (2009) Tom Mertens, Jan Kautz, and Frank Van Reeth. 2009. Exposure fusion: A simple and practical alternative to high dynamic range photography. In Computer graphics forum, Vol. 28. Wiley Online Library, 161–171.
  • Miyato et al. (2018) Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018).
  • Nakai et al. (2013) Keita Nakai, Yoshikatsu Hoshi, and Akira Taguchi. 2013. Color image contrast enhacement method based on differential intensity/saturation gray-levels histograms. In Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems. 445–449.
  • Ni et al. (2020a) Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong. 2020a. Towards Unsupervised Deep Image Enhancement With Generative Adversarial Network. IEEE Transactions on Image Processing 29 (2020), 9140–9151.
  • Ni et al. (2020b) Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, and Sam Kwong. 2020b. Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network. In Proceedings of the 28th ACM International Conference on Multimedia. 1697–1705.
  • Ren et al. (2019) Wenqi Ren, Sifei Liu, Lin Ma, Qianqian Xu, Xiangyu Xu, Xiaochun Cao, Junping Du, and Ming-Hsuan Yang. 2019. Low-light image enhancement via a deep hybrid network. IEEE Transactions on Image Processing 28, 9 (2019), 4364–4375.
  • Ren et al. (2018) Xutong Ren, Mading Li, Wen-Huang Cheng, and Jiaying Liu. 2018. Joint enhancement and denoising method via sequential decomposition. In Proceedings of the IEEE International Symposium on Circuits and Systems. 1–5.
  • Shen et al. (2017) Liang Shen, Zihan Yue, Fan Feng, Quan Chen, Shihao Liu, and Jie Ma. 2017. MSR-net: Low-light image enhancement using deep convolutional network. arXiv preprint arXiv:1711.02488 (2017).
  • Simonyan and Zisserman (2014) Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
  • Stark (2000) J Alex Stark. 2000. Adaptive image contrast enhancement using generalizations of histogram equalization. IEEE Transactions on image processing 9, 5 (2000), 889–896.
  • Wang et al. (2019) Ruixing Wang, Qing Zhang, Chi-Wing Fu, Xiaoyong Shen, Wei-Shi Zheng, and Jiaya Jia. 2019. Underexposed photo enhancement using deep illumination estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6849–6857.
  • Wang et al. (2013) Shuhang Wang, Jin Zheng, Hai-Miao Hu, and Bo Li. 2013. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Transactions on Image Processing 22, 9 (2013), 3538–3548.
  • Wang et al. (2004) Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. 2004. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13, 4 (2004), 600–612.
  • Wei et al. (2018) Chen Wei, Wenjing Wang, Wenhan Yang, and Jiaying Liu. 2018. Deep Retinex Decomposition for Low-Light Enhancement. In Proceedings of the British Machine Vision Conference. 1–11.
  • Xiong et al. (2020) Wei Xiong, Ding Liu, Xiaohui Shen, Chen Fang, and Jiebo Luo. 2020. Unsupervised Real-world Low-light Image Enhancement with Decoupled Networks. arXiv preprint arXiv:2005.02818 (2020).
  • Yang et al. (2020) Wenhan Yang, Shiqi Wang, Yuming Fang, Yue Wang, and Jiaying Liu. 2020. From fidelity to perceptual quality: A semi-supervised approach for low-light image enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3063–3072.
  • Ying et al. (2017a) Zhenqiang Ying, Ge Li, and Wen Gao. 2017a. A bio-inspired multi-exposure fusion framework for low-light image enhancement. arXiv preprint arXiv:1711.00591 (2017).
  • Ying et al. (2017b) Zhenqiang Ying, Ge Li, Yurui Ren, Ronggang Wang, and Wenmin Wang. 2017b. A new image contrast enhancement algorithm using exposure fusion framework. In International Conference on Computer Analysis of Images and Patterns. Springer, 36–46.
  • Ying et al. (2017c) Zhenqiang Ying, Ge Li, Yurui Ren, Ronggang Wang, and Wenmin Wang. 2017c. A new low-light image enhancement algorithm using camera response model. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 3015–3022.
  • Zhang et al. (2020) Yu Zhang, Xiaoguang Di, Bin Zhang, Ruihang Ji, and Chunhui Wang. 2020. Better Than Reference In Low Light Image Enhancement: Conditional Re-Enhancement Networks. arXiv preprint arXiv:2008.11434 (2020).
  • Zhu et al. (2017) Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision. 2223–2232.
  • Zuiderveld (1994) Karel Zuiderveld. 1994. Contrast limited adaptive histogram equalization. Graphics gems (1994), 474–485.