
Pyramid Texture Filtering

Qing Zhang (ORCID 0000-0001-5312-2800), Sun Yat-sen University, Guangzhou, China, [email protected]; Hao Jiang (ORCID 0009-0002-6398-2323), Sun Yat-sen University, Guangzhou, China, [email protected]; Yongwei Nie (ORCID 0000-0002-8922-3205), South China University of Technology, Guangzhou, China, [email protected]; and Wei-Shi Zheng (ORCID 0000-0001-8327-0003), Sun Yat-sen University, Guangzhou, China, [email protected]
Abstract.

We present a simple but effective technique to smooth out textures while preserving the prominent structures. Our method is built upon a key observation: the coarsest level in a Gaussian pyramid often naturally eliminates textures and summarizes the main image structures. This inspires our central idea for texture filtering, which is to progressively upsample the very low-resolution coarsest Gaussian pyramid level to a full-resolution texture smoothing result with well-preserved structures, under the guidance of each fine-scale Gaussian pyramid level and its associated Laplacian pyramid level. We show that our approach is effective in separating structure from texture of different scales, local contrasts, and forms, without degrading structures or introducing visual artifacts. We also demonstrate its applicability to various tasks including detail enhancement, image abstraction, HDR tone mapping, inverse halftoning, and LDR image enhancement. Code is available at https://rewindl.github.io/pyramid_texture_filtering/.

image smoothing, structure extraction, image decomposition, image pyramid, upsampling
journal: TOG; volume: 42; number: 4; article: 1; publication month: 8; price: 15.00; CCS: Computing methodologies / Image manipulation
Figure 1. We demonstrate texture filtering (also referred to as structure-preserving filtering) based on Gaussian and Laplacian pyramids, which, unlike previous work, does not rely on any explicit measures to distinguish texture from structure, but can effectively deal with diverse types of textures. Top: input images. Bottom: our results. The 2nd-5th images are from TrishBurr, Simona Proto Art, (Xu et al., 2012), and Flickr user Kyle Eroche.

1. Introduction

Texture smoothing is a fundamental problem in computational photography and image analysis. It aims to remove the distracting fine-scale textures while maintaining the large-scale image structures. This smoothing operation has received considerable research attention, since it not only benefits scene understanding but also enables a wide variety of image manipulation applications such as image abstraction, detail enhancement, and HDR tone mapping.

Despite being widely studied, texture smoothing remains challenging, as it requires distinguishing texture from structure, which is a non-trivial problem. The reason is that texture and structure are often very similar in terms of basic visual elements such as intensity, gradient, and local contrast, making the two indistinguishable. Notably, although closely related and conceptually similar, edge-preserving smoothing approaches (Tomasi and Manduchi, 1998; Farbman et al., 2008; Fattal, 2009; Kass and Solomon, 2010; Paris et al., 2011; Xu et al., 2011; Gastal and Oliveira, 2011; Bi et al., 2015; Liu et al., 2021) are not optimal solutions for texture smoothing, because their goal is to retain salient edges regardless of whether they originate from texture or structure components.

Previous texture smoothing methods primarily formulate filtering or optimization frameworks based on hand-crafted texture-structure separation measures (Subr et al., 2009; Xu et al., 2012; Karacan et al., 2013; Bao et al., 2013; Cho et al., 2014; Zhang et al., 2015). However, they may struggle when presented with textures that cannot be effectively distinguished by their measures, or tend to remove textures at the cost of degrading image structures. More recently, some learning-based approaches that enable texture removal have also been proposed (Lu et al., 2018; Fan et al., 2018), but they may not generalize well to textures not present in their training datasets.

In this paper, we present a novel texture smoothing method that can effectively remove different types of textures while preserving the main structures. Unlike previous methods that focus on designing texture-structure separation measures based on certain local image statistics, we argue that the most discriminative difference between texture and structure is scale, since we observe that the coarsest Gaussian pyramid level, representing large-scale image information, often naturally eliminates textures without destroying structures (see Figure 2). This observation inspires us to perform texture smoothing by upsampling the coarsest Gaussian pyramid level, which, however, cannot be accomplished by existing upsampling methods, as they are unable to produce results with sharp structures from a very low-resolution input (see Figure 4(b) and (c)). The key to the success of our method is the proposed pyramid-guided structure-aware upsampling, which is iteratively performed under the guidance of each fine-scale Gaussian pyramid level and its associated Laplacian pyramid level, until the coarsest level is upsampled to a full-resolution texture smoothing result.

As shown in Figure 1, our method, while built on a very simple key idea, is surprisingly effective in texture removal and faithfully preserves image structures without introducing visual artifacts. It is also easy to implement and fast to run. We further demonstrate that it enables various image manipulation applications. Our main contributions are summarized as follows:

  • We find that the coarsest level in a Gaussian pyramid usually eliminates textures while preserving the main image structures, providing new cues for texture smoothing.

  • We present a novel pyramid-based texture smoothing approach based on the above finding, which works by progressively upsampling the coarsest Gaussian pyramid level of a given image to the original full-resolution, without relying on any texture-structure separation measures.

  • We develop pyramid-guided structure-aware upsampling to produce texture smoothing results with sharp structures from the very low-resolution coarsest Gaussian pyramid level.

2. Related Work

2.1. Edge-Preserving Image Smoothing

The past decades have witnessed a wealth of work on edge-preserving smoothing. Among them, a large number of methods belong to the category of local filtering, the core idea of which is to perform a weighted average over a local spatial neighborhood. Popular representatives in this category include anisotropic diffusion (Perona and Malik, 1990), the bilateral filter (Tomasi and Manduchi, 1998) and its fast approximations (Durand and Dorsey, 2002; Paris and Durand, 2006; Weiss, 2006; Chen et al., 2007), edge-avoiding wavelets (Fattal, 2009), the local histogram filter (Kass and Solomon, 2010), geodesic filters (Criminisi et al., 2010; Gastal and Oliveira, 2011, 2012), and the local Laplacian filter (LLF) (Paris et al., 2011). Note that although both our method and LLF utilize image pyramids, we aim to achieve texture smoothing by upsampling the coarsest level in the Gaussian pyramid, while the goal of LLF is to enable edge-aware filtering through manipulation of Laplacian pyramid coefficients.

(a) $G_0$ (original image); (b) $G_1$ (1/2 resolution); (c) $G_2$ (1/4 resolution); (d) $G_3$ (1/8 resolution); (e) $G_4$ (1/16 resolution); (f) ground truth.
Figure 2. Our main observation. Given a Gaussian pyramid $\{G_0,\dots,G_4\}$ showing all levels at the original full resolution ($600\times 600$), we find that the coarsest level $G_4$ naturally eliminates textures while maintaining the main image structures. Note that, to obtain the texture smoothing ground truth, the original image here is synthesized by adding textures to an existing structure-only image (i.e., the ground truth). Image courtesy of Kieran Gabriel Prints.

Optimization-based methods, such as weighted least squares (WLS) (Farbman et al., 2008) and its more efficient variants (Min et al., 2014; Liu et al., 2017, 2020), $L_0$ gradient minimization (Xu et al., 2011), and the $L_1$ image transform (Bi et al., 2015), are also commonly adopted solutions for edge-preserving smoothing. However, most of them involve relatively high computational complexity because they require solving large-scale linear systems, which limits their scalability and complicates implementation.

Figure 3. Overview of our approach. Given an input image $I$, we first build its Gaussian and Laplacian pyramids $\{G_\ell\}$ and $\{L_\ell\}$. Next, we upsample the coarsest Gaussian pyramid level $G_N$ ($G_N=R_N$) to an intermediate texture smoothing image $R_{N-1}$ at the previous finer scale. This is achieved by pyramid-guided structure-aware upsampling (PSU) taking $G_{N-1}$ and $L_{N-1}$ as guidance. The resulting $R_{N-1}$ is then subjected to the same upsampling process guided by $G_{N-2}$ and $L_{N-2}$. The upsampling cycle is repeated until a full-resolution texture smoothing image $R_0$ is finally obtained. JBF refers to joint bilateral filtering, and the symbol $\uparrow$ in $\textrm{JBF}^{\uparrow}$ indicates an increase in the spatial resolution of the output. Image courtesy of Flickr user Mighty Tieton.

Recently, deep learning has also been introduced to the field of edge-aware smoothing (Xu et al., 2015; Liu et al., 2016; Fan et al., 2018). As ground-truth smoothing results are very difficult to obtain, current deep-learning-based methods either use results produced by existing smoothing algorithms as supervision, or apply a deep network as an optimization solver.

2.2. Texture Smoothing

Despite the great success of edge-preserving smoothing research, texture smoothing has also been advocated and widely studied, since the overall image structures, rather than fine-scale textures, are assumed to be crucial to visual perception. As the major difficulty of the problem lies in how to effectively distinguish texture from structure, most existing methods are built upon specially designed texture-structure separation measures, e.g., local extrema (Subr et al., 2009), relative total variation (Xu et al., 2012; Cho et al., 2014), region covariance based patch similarity (Karacan et al., 2013; Zhu et al., 2016), minimum spanning tree (Bao et al., 2013), and segment graph (Zhang et al., 2015), or rely on auxiliary contour information as guidance (Wei et al., 2018). Besides, there are some scale-aware texture smoothing methods (Zhang et al., 2014; Du et al., 2016; Jeon et al., 2016) based on Gaussian filtering, while more recent works are mostly deep learning based (Lu et al., 2018; Kim et al., 2018).

Our work is complementary to existing texture smoothing methods in three aspects. First, we reveal that the multi-scale representations provided by standard image pyramids can be effective cues for texture smoothing. Second, we present a novel pyramid-based texture smoothing approach that neither relies on any texture-structure separation measures, nor requires additional contour information as input. Third, we show that our approach outperforms previous methods in terms of structure preservation and texture removal, and can effectively handle previously challenging images with large-scale or high-contrast textures.

2.3. Image Pyramids

Image pyramids are useful multi-resolution representations for analyzing and manipulating images over a range of spatial scales (Burt and Adelson, 1983). Here we briefly describe Gaussian and Laplacian pyramids, as we build our approach on top of them. Given an image $I$, its Gaussian pyramid is a set of low-resolution versions $\{G_\ell\}$ (called levels) of the image in which small-scale image details gradually disappear. To construct $\{G_\ell\}$, the original image $I$, i.e., the finest level $G_0$, is repeatedly smoothed by a Gaussian filter and then subsampled to generate the sequence of levels, until a minimum resolution is reached at the coarsest level $G_N$. The Laplacian pyramid is created from the Gaussian pyramid, with the goal of capturing the image details that are present in one Gaussian pyramid level but absent from the following coarser level. Specifically, levels of the Laplacian pyramid are defined as the differences between successive levels of the Gaussian pyramid, $L_\ell = G_\ell - \textrm{upsample}(G_{\ell+1})$, where $\textrm{upsample}(\cdot)$ upscales a Gaussian pyramid level to the resolution of its previous finer level. The last level $L_N$ of the Laplacian pyramid is not a difference image but exactly $G_N$, from which the original image $I$ can be reconstructed by recursively applying $G_\ell = L_\ell + \textrm{upsample}(G_{\ell+1})$ to collapse the Laplacian pyramid.
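The construction and exact collapse described above can be sketched in a few lines of NumPy/SciPy. This is our own illustrative code, not the paper's implementation; the bilinear `zoom`-based `upsample` is one possible choice for the $\textrm{upsample}(\cdot)$ operator.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def downsample(img):
    # Gaussian-smooth, then subsample by taking every other pixel (rate 1/2).
    return gaussian_filter(img, sigma=1)[::2, ::2]

def upsample(img, shape):
    # upsample(.): bilinear upscaling to the resolution of the finer level.
    return zoom(img, (shape[0] / img.shape[0], shape[1] / img.shape[1]), order=1)

def build_pyramids(img, depth):
    # Gaussian pyramid: G_0 = img, G_{l+1} = downsample(G_l).
    G = [img]
    for _ in range(depth):
        G.append(downsample(G[-1]))
    # Laplacian pyramid: L_l = G_l - upsample(G_{l+1}); the last level is G_N itself.
    L = [G[l] - upsample(G[l + 1], G[l].shape) for l in range(depth)]
    L.append(G[-1])
    return G, L

def collapse(L):
    # Recursively apply G_l = L_l + upsample(G_{l+1}) to reconstruct the image.
    img = L[-1]
    for l in range(len(L) - 2, -1, -1):
        img = L[l] + upsample(img, L[l].shape)
    return img
```

Because each Laplacian level stores exactly what its upsampling step loses, collapsing reproduces the original image exactly, whatever `upsample` is, as long as the same operator is used in both directions.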

Figure 4. Results using different upsampling strategies for the coarsest level in the Gaussian pyramid. (a) Input image and its coarsest Gaussian pyramid level (shown in the top-left corner) with only 1/16 of the original full resolution. (b) Bilinear upsampling. (c) Joint bilateral upsampling (Kopf et al., 2007) using the original image (a) as guidance. (d) Iterative joint bilateral upsampling using each fine-scale Gaussian pyramid level as guidance, i.e., our upsampling without the Laplacian pyramid based operation in Eq. (3). (e) Our upsampling with Eq. (3) replaced by a different Laplacian pyramid usage, $R_{k-1}=\hat{R}_{k-1}+\textrm{JBF}(L_{k-1},\hat{R}_{k-1})$. (f) Our upsampling method. Image courtesy of Mosaics In Mind.

3. Main Observation

Figure 2 illustrates the observation that motivates our method. As the scale of the Gaussian pyramid level becomes coarser, small-scale textures are gradually eliminated, finally yielding a structure-only coarsest Gaussian pyramid level. This inherent scale-separation property of the Gaussian pyramid suggests that texture smoothing can be performed by upsampling the coarsest Gaussian pyramid level to the original full resolution. However, since the coarsest level in a Gaussian pyramid usually has a very low resolution, existing image upsampling strategies, e.g., bilinear upsampling and joint bilateral upsampling (Kopf et al., 2007), are insufficient to produce high-quality texture smoothing results with sharp structures, as shown in Figure 4(b) and (c). To this end, we develop a novel pyramid-based upsampling approach, which is introduced in the next section.

4. Approach

Given an input image $I$ and its Gaussian and Laplacian pyramids $\{G_\ell\}$ and $\{L_\ell\}$, where $\ell=N$ indicates the coarsest scale and $\ell=0$ the finest ($G_0=I$), our method aims to produce a texture-filtered image $R$ by upsampling the coarsest Gaussian pyramid level $G_N$ to the original full resolution. Figure 3 presents an overview of our approach. At the core of our method is the proposed pyramid-guided structure-aware upsampling, which iteratively upsamples $G_N$ to a series of intermediate texture smoothing images $R_k$ at different scales $k$ ($0\leq k\leq N-1$), until a full-resolution texture smoothing output $R_0$ ($R_0=R$) with the finest structure is obtained. In the following, we first introduce the pyramid-guided structure-aware upsampling, and then elaborate on the implementation details of our approach.

4.1. Pyramid-Guided Structure-Aware Upsampling

Background

We first summarize the joint bilateral filter (JBF) (Petschnigg et al., 2004; Eisemann and Durand, 2004), as we build our pyramid-guided structure-aware upsampling on top of it. Given an input image $I$, the joint bilateral filter computes an output image by replacing $I$ with a guidance image $\tilde{I}$ in the range filter kernel of the bilateral filter (Tomasi and Manduchi, 1998), which is expressed as:

(1) \begin{split}\textrm{JBF}(I,\tilde{I})_{p}&=\frac{1}{K_{p}}\sum_{q\in\Omega^{d}_{p}}g_{\sigma_{s}}(\|p-q\|)\,g_{\sigma_{r}}(\|\tilde{I}_{p}-\tilde{I}_{q}\|)\,I_{q},\\ \textrm{with}\enspace K_{p}&=\sum_{q\in\Omega^{d}_{p}}g_{\sigma_{s}}(\|p-q\|)\,g_{\sigma_{r}}(\|\tilde{I}_{p}-\tilde{I}_{q}\|),\end{split}

where $p$ and $q$ are pixel coordinates, $\Omega^d_p$ is the set of pixels in the $d\times d$ square neighborhood centered at $p$, and $g_\sigma(x)=\exp(-x^2/2\sigma^2)$ is a Gaussian kernel with standard deviation $\sigma$. $\sigma_s$ and $\sigma_r$ control the spatial support and the sensitivity to edges, respectively. A well-known variant with a similar formulation to Eq. (1) is joint bilateral upsampling (denoted as $\textrm{JBF}^{\uparrow}$ in this paper) (Kopf et al., 2007), which is designed to upsample a low-resolution image under the guidance of a high-resolution image in an edge-aware manner.
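For concreteness, Eq. (1) can be implemented directly with a brute-force loop. The sketch below is a minimal single-channel version written for clarity, not efficiency; practical implementations vectorize or approximate it, and all names here are our own.

```python
import numpy as np

def jbf(I, guide, sigma_s, sigma_r, d):
    # Brute-force joint bilateral filter (Eq. 1) for a single-channel image.
    # Range weights are computed on the guidance image `guide`.
    r = d // 2
    H, W = I.shape
    # Spatial Gaussian g_{sigma_s} over the d x d window, precomputed once.
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    spatial = np.exp(-(x**2 + y**2) / (2 * sigma_s**2))
    Ip = np.pad(I, r, mode='edge')
    Gp = np.pad(guide, r, mode='edge')
    out = np.empty_like(I)
    for i in range(H):
        for j in range(W):
            Iw = Ip[i:i + d, j:j + d]
            Gw = Gp[i:i + d, j:j + d]
            # Range Gaussian g_{sigma_r} on guidance differences.
            rng = np.exp(-(Gw - guide[i, j])**2 / (2 * sigma_r**2))
            w = spatial * rng
            # Normalized weighted average (the 1/K_p factor in Eq. 1).
            out[i, j] = (w * Iw).sum() / w.sum()
    return out
```

Setting `guide = I` recovers the standard bilateral filter; since each output pixel is a normalized weighted average, the output range never exceeds that of the input.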

Our upsampling

The dashed box in Figure 3 shows the workflow of our pyramid-guided structure-aware upsampling. Its goal is to upsample an intermediate texture smoothing image $R_k$ at scale $k$ ($0<k\leq N$) to a structure-refined fine-scale output $R_{k-1}$, with the aid of the Gaussian and Laplacian pyramid levels $G_{k-1}$ and $L_{k-1}$. Below we describe our upsampling algorithm in detail.

To produce $R_{k-1}$ from $R_k$, we start by upsampling $R_k$ to an initial output $\hat{R}_{k-1}$ at scale $k-1$ by performing joint bilateral upsampling with the Gaussian pyramid level $G_{k-1}$ as guidance:

(2) \hat{R}_{k-1}=\textrm{JBF}^{\uparrow}(R_{k},G_{k-1}).

This process aims to make the upsampled output $\hat{R}_{k-1}$ share similar structure edges with $G_{k-1}$, which contains finer-scale structures. However, as shown in Figure 3, the structure sharpness of $\hat{R}_{k-1}$ is still inferior to that of $G_{k-1}$, indicating that $\hat{R}_{k-1}$ loses a certain amount of structure detail compared to $G_{k-1}$; iteratively performing the process in Eq. (2) alone thus fails to generate texture smoothing outputs with sharp structures (see Figure 4(d)). Note that the above process will not introduce textures into the output (i.e., $\hat{R}_{k-1}$ is texture-free), as shown in Figures 3 and 4(d). The reason is twofold. First, the starting input of the upsampling process is the coarsest level $G_N$, which barely contains textures. Second, as will be introduced later, we use a small neighborhood $d$ in this process to avoid mistaking texture details in the guidance image $G_{k-1}$ for structures.

(a) Input; (b) $R_4$ (coarsest level); (c) $R_3$; (d) $R_2$; (e) $R_1$; (f) $R_0$ (final result).
Figure 5. Our texture smoothing images at different scales, where all intermediate results ($R_4$, $R_3$, $R_2$, $R_1$) are shown at the original full resolution. Image courtesy of Pinterest user bianca.
(a) Input; (b) Gaussian filtering ($\sigma=7$); (c) (Zhang et al., 2014); (d) (Du et al., 2016); (e) (Jeon et al., 2016); (f) Ours.
Figure 6. Comparison with previous scale-aware texture smoothing methods. As shown, previous scale-aware methods built upon Gaussian smoothing either tend to distort structures or leave noticeable texture residuals. Parameters: (Zhang et al., 2014) ($\sigma_s=5$, $\sigma_r=0.03$, $N^{iter}=8$), (Du et al., 2016) ($\sigma_s=5$, $\sigma_r=0.05$, $N^{iter}=5$), (Jeon et al., 2016) ($\sigma=5$, $\sigma_r=0.1$, $N^{iter}=5$), and ours ($\sigma_s=5$, $\sigma_r=0.03$). Image courtesy of Pixelhobby.

To further refine structures and obtain the resulting $R_{k-1}$ with sharper structures than $\hat{R}_{k-1}$, we introduce the Laplacian pyramid into our method, because it records exactly the image details that each Gaussian pyramid level loses relative to its next coarser level. Specifically, we compute $R_{k-1}$ by smoothing out the unwanted texture details from the sum of $\hat{R}_{k-1}$ and $L_{k-1}$ via joint bilateral filtering with $\hat{R}_{k-1}$ as guidance:

(3) R_{k-1}=\textrm{JBF}(\hat{R}_{k-1}+L_{k-1},\hat{R}_{k-1}).

We use $\hat{R}_{k-1}$ as guidance because it is texture-free and has similar structure edges to the sum of $\hat{R}_{k-1}$ and $L_{k-1}$, allowing effective texture removal while faithfully maintaining the desired structure residuals carried by $L_{k-1}$ (see Figure 3).

Input: image $I$, parameters $\sigma_s$ and $\sigma_r$
Output: texture smoothing image $R$
build Gaussian and Laplacian pyramids $\{G_\ell\}$ and $\{L_\ell\}$ for image $I$
$R_N \leftarrow$ the coarsest level $G_N$
for $k = N-1:0$ do
       $\hat{R}_k = \textrm{JBF}^{\uparrow}(R_{k+1}, G_k)$
       $R_k = \textrm{JBF}(\hat{R}_k + L_k, \hat{R}_k)$
end for
$R \leftarrow R_0$
ALGORITHM 1 Pyramid Texture Filtering
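Algorithm 1 translates almost line-for-line into NumPy/SciPy. The sketch below is our own single-channel reading of it, with two stated simplifications: joint bilateral upsampling $\textrm{JBF}^{\uparrow}$ is approximated by bilinear resizing followed by a joint bilateral filter, and the odd window sizes follow the schedule of Section 4.1 only approximately.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def _resize(img, shape):
    # Bilinear resampling to a target spatial shape.
    return zoom(img, (shape[0] / img.shape[0], shape[1] / img.shape[1]), order=1)

def _jbf(I, guide, sigma_s, sigma_r, d):
    # Brute-force joint bilateral filter, as in Eq. (1).
    r = d // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    spatial = np.exp(-(x**2 + y**2) / (2 * sigma_s**2))
    Ip = np.pad(I, r, mode='edge')
    Gp = np.pad(guide, r, mode='edge')
    out = np.empty_like(I)
    H, W = I.shape
    for i in range(H):
        for j in range(W):
            rng = np.exp(-(Gp[i:i + d, j:j + d] - guide[i, j])**2 / (2 * sigma_r**2))
            w = spatial * rng
            out[i, j] = (w * Ip[i:i + d, j:j + d]).sum() / w.sum()
    return out

def pyramid_texture_filter(I, sigma_s=5.0, sigma_r=0.07, depth=None):
    if depth is None:
        # Keep the long axis of the coarsest level in [32, 64), per Sec. 4.2.
        depth = max(1, int(np.log2(max(I.shape) / 32)))
    # Gaussian pyramid {G_k} and Laplacian pyramid {L_k}.
    G = [I]
    for _ in range(depth):
        G.append(gaussian_filter(G[-1], sigma=1)[::2, ::2])
    L = [G[k] - _resize(G[k + 1], G[k].shape) for k in range(depth)]
    R = G[depth]  # R_N <- coarsest Gaussian level
    for k in range(depth - 1, -1, -1):
        s = sigma_s / 2**k                # sigma_{s,k} = sigma_s / 2^k
        d_small = int(max(s, 3)) | 1      # small odd window for Eq. (2), approx.
        d_large = int(max(4 * s, 3)) | 1  # large odd window for Eq. (3), approx.
        # Eq. (2): JBF-upsampling, approximated as resize + JBF with G_k as guide.
        R_hat = _jbf(_resize(R, G[k].shape), G[k], s, sigma_r, d_small)
        # Eq. (3): add back L_k, then remove its textures with R_hat as guide.
        R = _jbf(R_hat + L[k], R_hat, s, sigma_r, d_large)
    return R
```

Note that the guidance roles differ between the two filtering steps exactly as in the algorithm: the fine Gaussian level guides the upsampling, while the texture-free intermediate result guides the Laplacian detail restoration.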
(a) Input; (b) Without pyramids; (c) With pyramids.
Figure 7. Comparison of results with and without image pyramids. It is clear that image pyramids are beneficial to structure preservation. Image courtesy of Flickr user tikitikitembo.
(a) Input; (b) $\sigma_s=3$, $\sigma_r=0.05$; (c) $\sigma_s=3$, $\sigma_r=0.07$; (d) $\sigma_s=3$, $\sigma_r=0.09$; (e) $\sigma_s=5$, $\sigma_r=0.05$; (f) $\sigma_s=5$, $\sigma_r=0.07$; (g) $\sigma_s=5$, $\sigma_r=0.09$; (h) $\sigma_s=7$, $\sigma_r=0.05$; (i) $\sigma_s=7$, $\sigma_r=0.07$; (j) $\sigma_s=7$, $\sigma_r=0.09$.
Figure 8. Results with varying $\sigma_s$ and $\sigma_r$. As shown, larger $\sigma_s$ and $\sigma_r$ yield stronger texture removal. Image courtesy of Flickr user Roger Marks.
(a) Input (image size: $640\times 640$); (b) depth=2 (coarsest level: $320\times 320$); (c) depth=3 (coarsest level: $160\times 160$); (d) depth=4 (coarsest level: $80\times 80$); (e) depth=5 (coarsest level: $40\times 40$); (f) depth=6 (coarsest level: $20\times 20$).
Figure 9. Results with varying pyramid depths. A deeper pyramid with a smaller coarsest level helps remove larger-scale textures. Image courtesy of Classy Chick.

Figure 5 shows how the texture smoothing output evolves over the iterative upsampling: the image structures are gradually refined along with the upsampling, without reintroducing textures. Although structures in the input image are not properly aligned with the severely blurred structures in the coarsest level $R_4$, especially the white wooden fence window, our method still produces a high-quality texture smoothing output $R_0$ with structures almost as sharp as those in the input image. The reason is that our method does not directly tackle the severe structure misalignment between the original image and the coarsest level, but instead iteratively performs pyramid-guided structure-aware upsampling to gradually refine structures, which is easier and more reliable because the structure misalignment between two adjacent pyramid levels is much weaker. This also explains why naively upsampling the coarsest level through joint bilateral upsampling with the original image as guidance fails to generate results with sharp structures, as shown in Figure 4(c).

Parameters

As suggested by Eq. (1), the success of our method also depends on the joint bilateral filtering parameters in Eqs. (2) and (3) at each scale, i.e., the spatial and range smoothing parameters $\sigma_s$ and $\sigma_r$ as well as the neighborhood size $d$. Although our upsampling at each scale involves six parameters, only two, $\sigma_s$ and $\sigma_r$, need to be controlled throughout the entire iterative upsampling, as all other parameters are adaptively adjusted according to these two.

(a) Input; (b) Single smoothing; (c) Smoothing twice.
Figure 10. Comparison of single and multiple smoothing. Note that the result in (c) is generated with the result in (b) as input. We find that a single smoothing pass is usually enough to remove textures, and smoothing multiple times neither gains further improvement nor further degrades image structures. Source image ©Letters.
(a) Clean input and our result; (b) Noisy input and our result.
Figure 11. Effect of noise. Our method produces visually indistinguishable results for a clean image and its noisy counterpart. Image courtesy of Flickr user Eclectic Jack.
(a) Input; (b) (Xu et al., 2012); (c) (Karacan et al., 2013); (d) (Cho et al., 2014); (e) (Fan et al., 2018); (f) Ours.
Figure 12. Comparison with previous methods on texture smoothing. Parameters: (Xu et al., 2012) {1st: ($\lambda=0.03$, $\sigma=6$), 2nd: ($\lambda=0.04$, $\sigma=7$), 3rd: ($\lambda=0.04$, $\sigma=9$)}, (Karacan et al., 2013) {1st: ($k=13$, $\sigma=0.5$, Model 1), 2nd: ($k=11$, $\sigma=0.7$, Model 1), 3rd: ($k=11$, $\sigma=0.5$, Model 1)}, (Cho et al., 2014) {1st: ($k=9$, $n_{itr}=5$), 2nd: ($k=11$, $n_{itr}=7$), 3rd: ($k=13$, $n_{itr}=7$)}, and our method {1st: ($\sigma_s=7$, $\sigma_r=0.07$), 2nd: ($\sigma_s=5$, $\sigma_r=0.09$), 3rd: ($\sigma_s=9$, $\sigma_r=0.07$)}. Results of (Fan et al., 2018) are produced by a trained model released by the authors. Images courtesy of Ed Chapman, Flickr user Allison Blacker, and Snapdeal.

Specifically, we fix all range parameters to $\sigma_r$ to ensure that image edges across different scales are always treated the same. Let $\sigma_{s,k}$ denote the spatial parameter used in both Eqs. (2) and (3) at scale $k$ ($0\leq k<N$) with $R_k$ as output; at the finest scale $k=0$ we set $\sigma_{s,0}=\sigma_s$. The spatial parameters of the subsequent coarse-scale ($k\geq 1$) upsampling are then set as $\sigma_{s,k}=\sigma_{s,0}/2^k$ to adapt to scale changes. To strengthen structure refinement while maintaining good texture removal, we empirically set the neighborhood sizes (no less than $3\times 3$) in Eqs. (2) and (3) at scale $k$ to the odd values closest to $\max(\sigma_{s,k},3)$ and $\max(4\sigma_{s,k},3)$, respectively. The small neighborhood in Eq. (2) reduces the number of pixels involved in the weighted average and avoids bringing back textures from the finer-scale guidance Gaussian pyramid level, while the large neighborhood in Eq. (3) ensures that textures introduced by the Laplacian pyramid level are effectively removed. Note that, as the guidance image $\hat{R}_k$ in Eq. (3) is texture-free, a large neighborhood here will not recover unwanted textures as it would in Eq. (2).
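To make the schedule concrete, the per-scale parameters can be tabulated as below. This is our own illustration: `nearest_odd` is a hypothetical helper, and its tie-breaking for even values (rounding up) is our choice, since the paper only specifies "the odd value closest to" each quantity.

```python
def upsampling_params(sigma_s, depth):
    # Per-scale spatial sigmas and window sizes used in Eqs. (2) and (3).
    def nearest_odd(x):
        # Odd integer near x, floored at 3 (windows are no less than 3x3).
        v = max(int(round(x)), 3)
        return v if v % 2 == 1 else v + 1  # ties between two odds resolved upward

    params = []
    for k in range(depth):
        s_k = sigma_s / 2**k  # sigma_{s,k} = sigma_{s,0} / 2^k
        params.append({
            'sigma_s': s_k,
            'd_small': nearest_odd(max(s_k, 3)),      # Eq. (2): avoid reintroducing texture
            'd_large': nearest_odd(max(4 * s_k, 3)),  # Eq. (3): remove Laplacian-level texture
        })
    return params
```

For example, with $\sigma_s=5$ the finest scale uses a $5\times 5$ window in Eq. (2) and a $21\times 21$ window in Eq. (3), while coarser scales shrink both toward the $3\times 3$ floor.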

4.2. Implementation

The implementation of our approach is summarized in Algorithm 1. We build standard image pyramids with a downsampling rate of 1/2 based on Gaussian smoothing with a $5\times 5$ kernel and a standard deviation of 1, and empirically set the pyramid depth so that the long axis of the coarsest Gaussian pyramid level falls in $[32, 64)$. The down- and up-sampling operations in pyramid construction are achieved by bilinear interpolation. We normalize all pixel values to $[0,1]$, and use $\sigma_s\in[3,15]$ and $\sigma_r\in[0.02,0.09]$ to produce all results in the paper. We find that $\sigma_s=5$ and $\sigma_r=0.07$ are good starting parameters for most images. As shown in Figure 8, larger $\sigma_s$ and $\sigma_r$ increase the effectiveness in removing large-scale and high-contrast textures, respectively. Note that, unlike standard joint bilateral filtering in which the range smoothing weight is computed independently for each RGB color channel, our method adopts a range weight shared across the three channels, based on the Euclidean distance between two RGB color vectors, to enhance structure preservation (see the supplementary material for validation). Parameters for all images in the paper are given in the supplementary material.

5. More Analysis

(a) Input; (b) WLS ($\lambda=1$, $\alpha=1.2$); (c) LLF ($\sigma_r=0.4$, $\alpha=4$, $\beta=1$); (d) RTV ($\lambda=0.015$, $\sigma=3$); (e) BTF ($k=7$, $n_{itr}=5$); (f) Ours ($\sigma_s=5$, $\sigma_r=0.07$).
Figure 13. Detail enhancement of the input image using our approach and four previous methods: WLS (Farbman et al., 2008), LLF (Paris et al., 2011), RTV (Xu et al., 2012), and BTF (Cho et al., 2014). Apart from the first column, the top row shows the smoothing results produced by different methods, while the bottom gives the corresponding $2.5\times$ detail enhancement outputs. Image courtesy of Flickr user Charlene Watt.

Effect of different Laplacian pyramid usages

In contrast to the operation in Eq. (3), another possible Laplacian pyramid usage is to first smooth out the unwanted texture details from $L_{k-1}$ by joint bilateral filtering with the initial upsampled output $\hat{R}_{k-1}$ from Eq. (2) as guidance, and then add the filtered result back to $\hat{R}_{k-1}$ to produce $R_{k-1}$, i.e., $R_{k-1}=\hat{R}_{k-1}+\textrm{JBF}(L_{k-1},\hat{R}_{k-1})$. As shown in Figure 4(e), this variant works well for structure preservation, but is not robust enough for texture removal and may leave a small amount of texture residuals.

Difference from prior scale-aware methods

Unlike previous scale-aware texture smoothing methods (Zhang et al., 2014; Du et al., 2016; Jeon et al., 2016) which basically iterate between Gaussian smoothing and joint bilateral filtering in an alternating order at a fixed scale, our method is built upon multi-scale representations in the form of image pyramids and achieves texture smoothing by iteratively upsampling the coarsest level in Gaussian pyramid under the guidance of other levels in both Gaussian and Laplacian pyramids. As shown in Figure 6, our method clearly outperforms these methods in texture removal and structure preservation. Please see the supplementary material for more comparison results.

Necessity of image pyramids

To verify the necessity of image pyramids, we compare our approach with a variant that replaces the Gaussian and Laplacian pyramids with a sequence of iteratively Gaussian-blurred versions of the original image (one per coarse Gaussian pyramid level) and the difference images between successive blurred versions, and accordingly reduces Eq. (2) to joint bilateral filtering without upsampling. As shown in Figure 7(b), this variant produces blurred and distorted structures because of the structure degradation caused by Gaussian smoothing. In contrast, our method yields results with sharp structures, since the same degradation arising from Gaussian smoothing during pyramid construction is effectively alleviated by image downsampling.

Effect of different pyramid depths

Figure 9 shows how the pyramid depth affects the results. As can be seen, a deeper pyramid with a smaller coarsest level produces a stronger texture removal effect, especially for large-scale textures. The reason is that a smaller coarsest level summarizes image information at a larger scale, where textures are more likely to be eliminated due to scale differences. However, this trend becomes less obvious once the coarsest level is smaller than 40×4040\times 40, since our method already removes all the textures with a 40×4040\times 40 coarsest level. This experiment indicates that a deeper pyramid should be used for large-scale textures, while a relatively shallow pyramid suffices for small-scale textures. As introduced in Section 4.2, our experience is that a coarsest level with at least 32 but fewer than 64 pixels on its long axis is generally sufficient for most images.
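The depth guideline above can be expressed as a small helper. The ceil-halving rule below assumes each downsampling step roughly halves the long side (a simplification of actual Gaussian pyramid sizing); for an input long side of at least 64 pixels, it stops with the coarsest long side in [32, 64):

```python
def pyramid_depth(height, width, max_size=64):
    """Number of downsampling steps so the coarsest level's long side is
    below max_size (and, for inputs at least max_size wide, no smaller
    than max_size // 2)."""
    long_side = max(height, width)
    depth = 0
    while long_side >= max_size:
        long_side = (long_side + 1) // 2  # ceil-halving per pyramid step
        depth += 1
    return depth
```

For a 1280×7201280\times 720 image this gives a depth of 5, leaving a 40-pixel long side at the coarsest level, consistent with the experiment above.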

Refer to caption
Refer to caption
(a) Input
Refer to caption
Refer to caption
(b) WLS (λ=1\lambda\!=\!1, α=1.2\alpha\!=\!1.2)
Refer to caption
Refer to caption
(c) Ours (σs=3\sigma_{s}\!=\!3, σr=0.03\sigma_{r}\!=\!0.03)
Figure 14. Image abstraction by WLS (Farbman et al., 2008) and our method. Images courtesy of Vector Diary and Fotospot.

Effect of multiple smoothing

We show in Figure 10 that, once a single smoothing pass has removed all the textures, repeated smoothing with the same parameters barely changes the result, which further verifies the effectiveness of our method in structure preservation.

Refer to caption
(a) Input
Refer to caption
(b) WLS (λ=1\lambda=1, α=1.2\alpha=1.2)
Refer to caption
(c) LLF (σr=0.4\sigma_{r}=0.4, α=4\alpha=4, β=1\beta=1)
Refer to caption
(d) Ours (σs=3\sigma_{s}=3, σr=0.03\sigma_{r}=0.03)
Figure 15. HDR tone mapping results compared with WLS (Farbman et al., 2008) and LLF (Paris et al., 2011).
Refer to caption
Refer to caption
(a)
Refer to caption
Refer to caption
(b)
Refer to caption
Refer to caption
(c)
Refer to caption
Refer to caption
(d)
Figure 16. Inverse halftoning. (a) Input. (b) Result of (Karacan et al., 2013). (c) Result of (Cho et al., 2014). (d) Our result. Parameters: (Karacan et al., 2013) (k=6k=6, σ=0.1\sigma=0.1, Model 1), (Cho et al., 2014) (k=7k=7, nitr=5n_{itr}=5), and our method (σs=4\sigma_{s}=4, σr=0.03\sigma_{r}=0.03). Source image ©Marvel Comics.
Refer to caption
(a) Input
Refer to caption
(b) HDRNet
Refer to caption
(c) Ours (σs=7\sigma_{s}\!=\!7, σr=0.07\sigma_{r}\!=\!0.07)
Figure 17. LDR image enhancement results produced by HDRNet (Gharbi et al., 2017) and our method. Note that the HDRNet result is generated by a pretrained model released by the authors.

Effect of noise

As observed in (Zontak et al., 2013), the noise level of an image gradually decreases as the image scale becomes coarser. Our method performs texture smoothing by upsampling the coarsest Gaussian pyramid level, which is almost noise-free, and is thus highly robust to noise, as demonstrated in Figure 11. Please see the supplementary material for more results on noisy images.

Time performance

As the major computation of our algorithm lies in iteratively performing joint bilateral filtering on several low-resolution pyramid levels instead of always operating at original full-resolution, it is relatively efficient compared to existing texture smoothing methods. On a 2.66GHz Intel Core i7 CPU, our unoptimized Matlab implementation that utilizes kernel separation takes about 1 second to process a 1280×7201280\times 720 image. Thanks to the parallelism of bilateral filtering, our GPU implementation on a NVIDIA GeForce RTX 3090Ti GPU achieves about 200×200\times speed-up relative to the Matlab implementation, reducing the time cost to 5 milliseconds for the same 1280×7201280\times 720 image.

6. Results and Applications

Comparison with state-of-the-art methods

We compare our method with the state-of-the-art texture smoothing methods (Xu et al., 2012; Karacan et al., 2013; Cho et al., 2014; Fan et al., 2018) in Figure 12. Note that although (Fan et al., 2018) is not specially designed for texture smoothing, we include it because of its good performance on texture removal. For a fair comparison, we produce the results of the compared methods using the publicly available implementations or trained models provided by the authors, with careful parameter tuning. From the results, we observe two key improvements of our method over the others. First, our method can effectively remove complex large-scale and high-contrast textures, such as the blocky texture of varying shapes and sizes in the first image and the dense scatter texture of random colors in the second image. Second, it avoids blurring or distorting image structures while removing textures. Please see the supplementary material for more comparison results and comparisons with additional methods (Subr et al., 2009; Zhang et al., 2014, 2015; Ham et al., 2015).

Applications

Akin to previous work on texture smoothing (Xu et al., 2012; Karacan et al., 2013; Cho et al., 2014), our approach enables a wide variety of image manipulation applications, including detail enhancement (Figure 13), image abstraction (Figure 14), HDR tone mapping (Figure 15), inverse halftoning (Figure 16), and LDR image enhancement (Figure 17). As the first four are common applications, we describe only the implementation of LDR image enhancement here. To enhance a poorly lit image II, we first follow (Guo et al., 2016) to obtain an initial illumination map LL that records the maximum RGB color channel at each pixel. Next, we apply our method to LL to produce a smoothed illumination map LsL_{s}, and then recover an enhanced image II^{\prime} by I=I/LsγI^{\prime}=I/L_{s}^{\gamma} based on the Retinex theory, where γ(0,1)\gamma\in(0,1) is an enhancement control parameter set to 0.7. For all these applications, our method produces comparable or better results than the compared alternatives, demonstrating its practicability.
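The enhancement step just described can be sketched as follows. The smoothing operator is left as a pluggable argument (standing in for our pyramid texture filter), and the small eps clamp is an assumption added here to guard against division by zero:

```python
import numpy as np

def enhance_ldr(I, smooth, gamma=0.7, eps=1e-3):
    """Retinex-based enhancement of a poorly lit RGB image I in [0, 1].
    Following (Guo et al., 2016), the initial illumination map is the
    per-pixel maximum over the RGB channels."""
    L = I.max(axis=2)                    # initial illumination map
    Ls = np.clip(smooth(L), eps, 1.0)    # smoothed illumination, clamped
    return np.clip(I / Ls[..., None] ** gamma, 0.0, 1.0)
```

Since γ(0,1)\gamma\in(0,1), dividing by LsγL_{s}^{\gamma} brightens dark regions (where Ls<1L_{s}<1) more strongly than bright ones, while a smoothed LsL_{s} keeps the per-pixel gain free of texture-scale variation.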

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 18. More results produced by our method. From top to bottom are input images and our texture smoothing results. Images courtesy of Flickr user balavenise, Alamy Stock Photo, MosaicArtSupply, (Xu et al., 2012), and Flickr user Ashley Arend.
Refer to captionRefer to caption
Refer to captionRefer to caption
Figure 19. Failure case. Our method fails to preserve the small-scale structure surrounded by the rhino horns. Image courtesy of San Diego Zoo.

Additional results

Figure 18 shows more results produced by our method, where the input images are diverse and contain a broad range of texture types, including: (i) an image with high-contrast line texture (1st column), (ii) an image with large-scale jigsaw texture (2nd column), (iii) an image with irregular tiled texture (3rd column), (iv) an image with complex knitted texture (4th column), and (v) an image with random texture composed of tiny portraits of varying tones and brightness (5th column). As can be seen, our method produces good results for these images, manifesting its effectiveness and robustness in texture smoothing.

7. Conclusion

We have presented a new technique for texture smoothing based on standard image pyramids. It is simple, fast, and easy to implement, while allowing for surprisingly effective and robust texture removal. In contrast to previous methods, our work opens up a new perspective on separating texture from structure in the scale space characterized by image pyramids, without relying on any explicit texture-structure separation measure. We believe our work will have a broad impact on image filtering, and will also shed light on the use of pyramids in image editing and its related applications.

Limitations and future work

Although our method can effectively remove large-scale textures, it may lose small-scale structures that are untraceable in the coarsest Gaussian pyramid level. Figure 19 shows an example where our method fails to retain the small-scale structures surrounded by the rhino horns. Reducing the pyramid depth and the value of σs\sigma_{s} can help avoid this issue, but may lead to incomplete texture removal. Note that, for small-scale structures that do not exist in the coarsest level but are classified as part of large-scale structures due to image downsampling, our method can recover them with the aid of the other levels in the Gaussian and Laplacian pyramids, such as the eyelash in Figure 16 (see the close-ups). Besides, our method inherits a limitation of bilateral filtering and may cause gradient reversal artifacts when an image is over-smoothed (see Figure 13). In the future, we will focus on addressing these limitations. Another promising direction is to extend our method to video texture filtering.

Acknowledgements.
We would like to thank the anonymous reviewers for their insightful comments and constructive suggestions. This work was supported by the National Natural Science Foundation of China (U21A20471, 62072191) and the Guangdong Basic and Applied Basic Research Foundation (2023A1515030002, 2023B1515040025).

References

  • Bao et al. (2013) Linchao Bao, Yibing Song, Qingxiong Yang, Hao Yuan, and Gang Wang. 2013. Tree filtering: Efficient structure-preserving smoothing with a minimum spanning tree. IEEE Transactions on Image Processing 23, 2 (2013), 555–569.
  • Bi et al. (2015) Sai Bi, Xiaoguang Han, and Yizhou Yu. 2015. An L1 image transform for edge-preserving smoothing and scene-level intrinsic decomposition. ACM Transactions on Graphics 34, 4 (2015), 1–12.
  • Burt and Adelson (1983) Peter J Burt and Edward H Adelson. 1983. The Laplacian Pyramid as a Compact Image Code. IEEE Transactions on Communications 31, 4 (1983), 532–540.
  • Chen et al. (2007) Jiawen Chen, Sylvain Paris, and Frédo Durand. 2007. Real-time edge-aware image processing with the bilateral grid. ACM Transactions on Graphics 26, 3 (2007), 103.
  • Cho et al. (2014) Hojin Cho, Hyunjoon Lee, Henry Kang, and Seungyong Lee. 2014. Bilateral texture filtering. ACM Transactions on Graphics 33, 4 (2014), 1–8.
  • Criminisi et al. (2010) Antonio Criminisi, Toby Sharp, Carsten Rother, and Patrick Pérez. 2010. Geodesic image and video editing. ACM Transactions on Graphics 29, 5 (2010), 134–1.
  • Du et al. (2016) Hui Du, Xiaogang Jin, and Philip J Willis. 2016. Two-level joint local laplacian texture filtering. The Visual Computer 32 (2016), 1537–1548.
  • Durand and Dorsey (2002) Frédo Durand and Julie Dorsey. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21, 3 (2002), 257–266.
  • Eisemann and Durand (2004) Elmar Eisemann and Frédo Durand. 2004. Flash photography enhancement via intrinsic relighting. ACM Transactions on Graphics 23, 3 (2004), 673–678.
  • Fan et al. (2018) Qingnan Fan, Jiaolong Yang, David Wipf, Baoquan Chen, and Xin Tong. 2018. Image smoothing via unsupervised learning. ACM Transactions on Graphics 37, 6 (2018), 1–14.
  • Farbman et al. (2008) Zeev Farbman, Raanan Fattal, Dani Lischinski, and Richard Szeliski. 2008. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Transactions on Graphics 27, 3 (2008), 1–10.
  • Fattal (2009) Raanan Fattal. 2009. Edge-avoiding wavelets and their applications. ACM Transactions on Graphics 28, 3 (2009), 1–10.
  • Gastal and Oliveira (2011) Eduardo SL Gastal and Manuel M Oliveira. 2011. Domain transform for edge-aware image and video processing. ACM Transactions on Graphics 30, 4 (2011), 1–12.
  • Gastal and Oliveira (2012) Eduardo SL Gastal and Manuel M Oliveira. 2012. Adaptive manifolds for real-time high-dimensional filtering. ACM Transactions on Graphics 31, 4 (2012), 1–13.
  • Gharbi et al. (2017) Michaël Gharbi, Jiawen Chen, Jonathan T Barron, Samuel W Hasinoff, and Frédo Durand. 2017. Deep bilateral learning for real-time image enhancement. ACM Transactions on Graphics 36, 4 (2017), 1–12.
  • Guo et al. (2016) Xiaojie Guo, Yu Li, and Haibin Ling. 2016. LIME: Low-light image enhancement via illumination map estimation. IEEE Transactions on Image Processing 26, 2 (2016), 982–993.
  • Ham et al. (2015) Bumsub Ham, Minsu Cho, and Jean Ponce. 2015. Robust image filtering using joint static and dynamic guidance. In CVPR. 4823–4831.
  • Jeon et al. (2016) Junho Jeon, Hyunjoon Lee, Henry Kang, and Seungyong Lee. 2016. Scale-aware structure-preserving texture filtering. Computer Graphics Forum 35, 7 (2016), 77–86.
  • Karacan et al. (2013) Levent Karacan, Erkut Erdem, and Aykut Erdem. 2013. Structure-preserving image smoothing via region covariances. ACM Transactions on Graphics 32, 6 (2013), 1–11.
  • Kass and Solomon (2010) Michael Kass and Justin Solomon. 2010. Smoothed local histogram filters. ACM Transactions on Graphics 29, 4 (2010), 1–10.
  • Kim et al. (2018) Youngjung Kim, Bumsub Ham, Minh N Do, and Kwanghoon Sohn. 2018. Structure-texture image decomposition using deep variational priors. IEEE Transactions on Image Processing 28, 6 (2018), 2692–2704.
  • Kopf et al. (2007) Johannes Kopf, Michael F Cohen, Dani Lischinski, and Matt Uyttendaele. 2007. Joint bilateral upsampling. ACM Transactions on Graphics 26, 3 (2007), 96.
  • Liu et al. (2016) Sifei Liu, Jinshan Pan, and Ming-Hsuan Yang. 2016. Learning recursive filters for low-level vision via a hybrid neural network. In ECCV. 560–576.
  • Liu et al. (2017) Wei Liu, Xiaogang Chen, Chunhua Shen, Zhi Liu, and Jie Yang. 2017. Semi-global weighted least squares in image filtering. In ICCV. 5861–5869.
  • Liu et al. (2020) Wei Liu, Pingping Zhang, Xiaolin Huang, Jie Yang, Chunhua Shen, and Ian Reid. 2020. Real-time image smoothing via iterative least squares. ACM Transactions on Graphics 39, 3 (2020), 1–24.
  • Liu et al. (2021) Wei Liu, Pingping Zhang, Yinjie Lei, Xiaolin Huang, Jie Yang, and Michael Kwok-Po Ng. 2021. A generalized framework for edge-preserving and structure-preserving image smoothing. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021).
  • Lu et al. (2018) Kaiyue Lu, Shaodi You, and Nick Barnes. 2018. Deep texture and structure aware filtering network for image smoothing. In ECCV. 217–233.
  • Min et al. (2014) Dongbo Min, Sunghwan Choi, Jiangbo Lu, Bumsub Ham, Kwanghoon Sohn, and Minh N Do. 2014. Fast global image smoothing based on weighted least squares. IEEE Transactions on Image Processing 23, 12 (2014), 5638–5653.
  • Paris and Durand (2006) Sylvain Paris and Frédo Durand. 2006. A fast approximation of the bilateral filter using a signal processing approach. In ECCV. 568–580.
  • Paris et al. (2011) Sylvain Paris, Samuel W Hasinoff, and Jan Kautz. 2011. Local laplacian filters: edge-aware image processing with a laplacian pyramid. ACM Transactions on Graphics 30, 4 (2011), 68.
  • Perona and Malik (1990) Pietro Perona and Jitendra Malik. 1990. Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 12, 7 (1990), 629–639.
  • Petschnigg et al. (2004) Georg Petschnigg, Richard Szeliski, Maneesh Agrawala, Michael Cohen, Hugues Hoppe, and Kentaro Toyama. 2004. Digital photography with flash and no-flash image pairs. ACM Transactions on Graphics 23, 3 (2004), 664–672.
  • Subr et al. (2009) Kartic Subr, Cyril Soler, and Frédo Durand. 2009. Edge-preserving multiscale image decomposition based on local extrema. ACM Transactions on Graphics 28, 5 (2009), 1–9.
  • Tomasi and Manduchi (1998) Carlo Tomasi and Roberto Manduchi. 1998. Bilateral Filtering for Gray and Color Images. In ICCV. 839–846.
  • Wei et al. (2018) Xing Wei, Qingxiong Yang, and Yihong Gong. 2018. Joint contour filtering. International Journal of Computer Vision 126, 11 (2018), 1245–1265.
  • Weiss (2006) Ben Weiss. 2006. Fast median and bilateral filtering. ACM Transactions on Graphics 25, 3 (2006), 519–526.
  • Xu et al. (2011) Li Xu, Cewu Lu, Yi Xu, and Jiaya Jia. 2011. Image smoothing via L0 gradient minimization. ACM Transactions on Graphics 30, 6 (2011), 1–12.
  • Xu et al. (2015) Li Xu, Jimmy Ren, Qiong Yan, Renjie Liao, and Jiaya Jia. 2015. Deep edge-aware filters. In ICML. 1669–1678.
  • Xu et al. (2012) Li Xu, Qiong Yan, Yang Xia, and Jiaya Jia. 2012. Structure extraction from texture via relative total variation. ACM Transactions on Graphics 31, 6 (2012), 1–10.
  • Zhang et al. (2015) Feihu Zhang, Longquan Dai, Shiming Xiang, and Xiaopeng Zhang. 2015. Segment graph based image filtering: fast structure-preserving smoothing. In ICCV. 361–369.
  • Zhang et al. (2014) Qi Zhang, Xiaoyong Shen, Li Xu, and Jiaya Jia. 2014. Rolling guidance filter. In ECCV. 815–830.
  • Zhu et al. (2016) Lei Zhu, Chi-Wing Fu, Yueming Jin, Mingqiang Wei, Jing Qin, and Pheng-Ann Heng. 2016. Non-Local Sparse and Low-Rank Regularization for Structure-Preserving Image Smoothing. Computer Graphics Forum 35, 7 (2016), 217–226.
  • Zontak et al. (2013) Maria Zontak, Inbar Mosseri, and Michal Irani. 2013. Separating signal from noise using patch recurrence across scales. In CVPR. 1195–1202.