RED – Robust Environmental Design
Abstract
The classification of road signs by autonomous systems, especially those reliant on visual inputs, is highly susceptible to adversarial attacks. Traditional approaches to mitigating such vulnerabilities have focused on enhancing the robustness of classification models. In contrast, this paper adopts a fundamentally different strategy aimed at increasing robustness through the redesign of road signs themselves. We propose an attacker-agnostic learning scheme to automatically design road signs that are robust to a wide array of patch-based attacks. Empirical tests conducted in both digital and physical environments demonstrate that our approach significantly reduces vulnerability to patch attacks, outperforming existing techniques.
1 Introduction
As autonomous driving systems become progressively more embedded in the real world, their safety becomes paramount. These systems and their sub-components, such as classification and segmentation modules, have been shown to be vulnerable to adversarial attacks (Goodfellow et al. [2014], Madry et al. [2017], Kurakin et al. [2016]). In this work, we focus on enhancing the safety of such systems by modifying the appearance of objects (specifically road signs) such that adversarial attacks applied to those objects are less effective (see Figure 1).

Salman et al. [2021] were the first to propose modifying the appearance of physical objects by designing patterns that make them easier to recognize under naturally challenging conditions, e.g., foggy weather. These conditions occur naturally rather than being the result of an adversarial attack. Adversarially crafted perturbations pose a more significant challenge from a defender’s perspective for two key reasons: firstly, adversarial examples are explicitly designed to decrease model performance, and secondly, they are out of distribution with respect to training data (naturally challenging conditions are typically seen in training data, albeit scarcely for some domains).
In the context of autonomous driving, Eykholt et al. [2018], Yang et al. [2020] demonstrated the practical dangers of misclassification, such as small errors leading to severe accidents. Of greater concern is the observation that these adversarial attacks can be physically realized Eykholt et al. [2018], Kurakin et al. [2016], Athalye et al. [2018]. Physically realizable attacks often take the form of adversarial patches, small regions of an image designed to deceive classifiers, detectors, or segmentation models. Brown et al. [2017], Eykholt et al. [2018], Liu et al. [2018], Karmon et al. [2018], Zhang et al. [2019] introduced physical adversarial patches for real-world objects.
Defenses can be categorized into attack-aware and attack-agnostic defenses. Attack-aware defenses, such as adversarial training (Goodfellow et al. [2014], Madry et al. [2017], Shafahi et al. [2019]), rely on knowledge of specific attacks. In contrast, attack-agnostic defenses, such as randomized smoothing (Cohen et al. [2019], Lecuyer et al. [2019], Salman et al. [2019]) and image sanitization techniques (Xiang et al. [2020, 2022], Xu et al. [2023]), do not rely on specific attack information.
To counter adversarial attacks, we propose an environment-centric approach, Robust Environmental Design (RED), in which we design the backgrounds of road signs such that the signs are both robust and still easy to print (as shown in Figure 1). Our defense procedure and objective function learn object patterns so that, once these road signs are deployed in the physical world, they achieve robustness against patch attacks without requiring adversarial training. The model is trained only on clean data, and at test time even a naive image-sanitizing defense enables accurate predictions on maliciously modified inputs, eliminating the need for adversarial training after deployment. To demonstrate the efficacy of our method, we conduct experiments using two common benchmark datasets for road sign classification, LISA and GTSRB (Eykholt et al. [2018]), and test against several types of patch-based attack paradigms. Our approach achieves high levels of robustness compared to baseline models. Additionally, we conduct physical experiments by printing various common road signs (e.g., stop signs, speed limit signs) with patterns optimized via RED. We collect photos at different times of the day, under various lighting and weather conditions. We find that RED significantly improves robustness against attacks in both digital and physical settings.
2 Preliminaries
Road Sign Classification Let $\mathcal{X}$ be a domain of possible road sign images of $h$ by $w$ pixels, and let $\mathcal{Y}$ be the set of possible road sign classes. Each road sign image $X \in \mathcal{X}$ has a corresponding true label $y \in \mathcal{Y}$ (e.g., stop sign). To predict the class of a road sign, a classifier $f: \mathcal{X} \rightarrow \mathcal{Y}$ is used, where $f(X)$ represents the predicted class of image $X$.
Adversarial Patch Attack We focus on patch-based attacks against image classification models. For a given image $X$ with true label $y$, the attacker’s goal is to find a maliciously modified version of $X$, say $X'$, such that $f(X') \neq y$. Patch-based attacks constitute the attacker modifying a region of at most $p$ pixels in $X$. The region can have various shapes but is constrained to be a contiguous region of the image, and is defined by a binary mask $M \in \{0,1\}^{h \times w}$ where $M_{ij} = 1$ if the attacker is modifying pixel $(i,j)$ and $M_{ij} = 0$ otherwise. The attacker then applies a perturbation $\delta$ (with magnitude at most $\epsilon$) to the pixels defined by $M$. The attacker finds their desired mask and perturbation via the following:

$$\max_{M,\, \delta} \; \mathcal{L}\big(f\big((1 - M) \odot X + M \odot \delta\big),\, y\big) \quad \text{s.t.} \quad \|M\|_0 \le p, \;\; \|\delta\|_\infty \le \epsilon, \tag{1}$$

where $\|M\|_0$ counts the number of 1’s in the mask and $\odot$ is elementwise multiplication.
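As a concrete illustration, the following is a minimal PyTorch sketch of how an attacker might approximately solve Equation (1) for a fixed mask using projected gradient ascent; the function name, step count, and the assumption of a fixed rectangular mask are our illustrative choices, not the exact procedure of the attacks cited above.

```python
import torch
import torch.nn.functional as F

def patch_attack(model, x, y, mask, steps=200, lr=0.05):
    """Approximately solve Eq. (1) for a fixed binary mask: maximize the
    classification loss over the patch contents delta placed inside the mask.

    x    : (1, 3, H, W) clean image with pixel values in [0, 1]
    y    : (1,) true label
    mask : (1, 1, H, W) binary mask, 1 on the patch region, 0 elsewhere
    """
    delta = torch.rand_like(x, requires_grad=True)       # initial patch contents
    for _ in range(steps):
        x_adv = (1 - mask) * x + mask * delta            # paste the patch into the image
        loss = F.cross_entropy(model(x_adv), y)          # attacker maximizes this loss
        loss.backward()
        with torch.no_grad():
            delta += lr * delta.grad.sign()              # signed gradient ascent step
            delta.clamp_(0.0, 1.0)                       # keep the patch a valid image
            delta.grad.zero_()
    return (mask * delta).detach()                       # return the optimized patch
```

With an unconstrained patch ($\epsilon = 1$), clamping to $[0, 1]$ already enforces the perturbation bound; a smaller $\epsilon$ would additionally require projecting $\delta$ onto an $\ell_\infty$ ball around the original pixel values.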
3 RED – Robust Environmental Design
Next, we outline our proposed method for robust environmental design (RED). RED works by crafting the background of a road sign such that any patch placed on that road sign is not effective at fooling the classifier (see Figure 1 for an example). RED has two key phases: a pattern selection phase in which we select the pattern for a given road sign class and a training phase in which we train our classifier on these newly selected patterns. The result of RED is a classifier that is robust to a wide array of patch-based attacks (without the need to simulate those attacks directly).
The key insight to our pattern selection is that road signs are manufactured objects, and their true label is known at manufacture time. Thus, we will seek to modify the road signs at manufacture time to contain a high level of class-specific information, making them easier to detect and, more importantly, harder to attack. More formally, for each class $y \in \mathcal{Y}$, let $\theta_y$ represent the pattern on road signs of class $y$ (e.g., when $y$ is the stop sign class, the current design of $\theta_y$ is a red background).
We employ an image ablation defense at inference time, with the key advantage that no adversarial training is required to counter the adversarial patch. An ablation algorithm $g$ masks an image $X$, leaving only a subregion of unmasked pixels (see Figure 4). Several works (Xiang et al. [2020, 2021], Levine and Feizi [2020]) have explored this approach, differing mainly in ablation size and in strategies for removing adversarial patches when specific attack information is available. Most of these defenses, however, are effective only for small patches, as they remove only a small portion of the image; when the patch covers more than 10% of the image, as shown in Xiang et al. [2020, 2021], they tend to fail, limiting their effectiveness as a universal defense. In contrast, Levine and Feizi [2020] propose a technique that defends against patches of various sizes by applying a simple ablation that removes most of the image, preserving only a small portion for inference in order to avoid the patch. In this paper, we empirically demonstrate the robustness of redesigned road signs using this generic method (Levine and Feizi [2020]) to defend against both small and large adversarial patches. However, our road signs are not tied to any specific defense. In the extended version of this paper, we will present the effectiveness of these redesigned signs using various image-sanitizing techniques such as Xiang et al. [2020] and Xiang et al. [2022].
At inference time, we apply several different ablation algorithms $g_1, \dots, g_m$ to image $X$, producing masked images $g_1(X), \dots, g_m(X)$. The classifier then makes a prediction on each ablated image, and the final prediction is obtained via majority vote:

$$\hat{y} = \operatorname*{argmax}_{y \in \mathcal{Y}} \; \big|\{\, i : f(g_i(X)) = y \,\}\big|.$$

RED selects $\theta_y$ such that each ablated image $g_i(X)$ contains enough class-specific information, via Algorithm 1.
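A minimal sketch of this inference procedure, in the spirit of the column ablations of Levine and Feizi [2020], is shown below; the band width, stride, and function names are illustrative choices rather than the exact configuration used in our experiments.

```python
import torch
from collections import Counter

def column_ablations(x, width=16, stride=8):
    """Yield ablated copies of x: each copy keeps one vertical band of
    `width` columns (wrapping around the image) and zeros out the rest."""
    W = x.shape[-1]
    for start in range(0, W, stride):
        cols = [(start + i) % W for i in range(width)]
        xa = torch.zeros_like(x)
        xa[..., cols] = x[..., cols]
        yield xa

@torch.no_grad()
def predict_majority(model, x):
    """Classify every ablated view of x and return the majority-vote label."""
    votes = [model(xa).argmax(dim=1).item() for xa in column_ablations(x)]
    return Counter(votes).most_common(1)[0][0]
```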
Training Given any image ablation algorithm $g$, a dataset $\{(X_i, y_i)\}_{i=1}^{n}$ with $n$ examples, and $K$ classes of road signs, our goal is to find a robust background $\theta_y$ for each class $y \in \{1, \dots, K\}$. We use gradient methods to find $\theta = (\theta_1, \dots, \theta_K)$; we parameterize the background of each sign to be a colored grid (see Figure 1) where $\theta_y$ gives the color of each element of the grid. We then minimize the classification loss with respect to the color parameters $\theta$, i.e.,

$$\min_{\theta,\, f} \; \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}\big(f\big(g(X_i(\theta_{y_i}))\big),\, y_i\big), \tag{2}$$

where $X_i(\theta_{y_i})$ denotes image $X_i$ rendered with background pattern $\theta_{y_i}$.
Figure 4 of the Appendix illustrates the training process.
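A condensed PyTorch sketch of this training loop (Equation 2) is given below. The compositing helper `render_with_pattern`, the ablation function `ablate`, the data loader, and the grid size are stand-ins for illustration; the loader is assumed to yield clean images together with per-image sign masks.

```python
import torch
import torch.nn.functional as F

K, S = 16, 5                                          # e.g., 16 LISA classes, 5x5 colour grid
theta = torch.rand(K, 3, S, S, requires_grad=True)    # one background colour grid per class

def render_with_pattern(x, sign_mask, pattern):
    """Hypothetical compositor: upsample the SxS colour grid to image size and
    place it behind the sign's text/shape (sign_mask is 1 on text/shape pixels)."""
    bg = F.interpolate(pattern.unsqueeze(0), size=x.shape[-2:], mode="nearest")
    return sign_mask * x + (1 - sign_mask) * bg

opt = torch.optim.Adam(list(model.parameters()) + [theta], lr=1e-3)
for x, sign_mask, y in loader:                        # clean data only, no simulated attacks
    x_red = torch.cat([render_with_pattern(x[i:i+1], sign_mask[i:i+1], theta[y[i]])
                       for i in range(x.shape[0])])
    loss = F.cross_entropy(model(ablate(x_red)), y)   # g(.) = `ablate`, as in Eq. (2)
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        theta.clamp_(0.0, 1.0)                        # keep the colours printable
```

At test time the learned $\theta$ is what gets printed onto the physical signs, so only the classifier and the ablation-plus-majority-vote procedure are needed after deployment.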
4 Experiments
Datasets and Road Sign Classification Models
We conduct experiments on the GTSRB and LISA datasets used in Eykholt et al. [2018]. GTSRB includes thousands of traffic signs across 43 categories of German road signs, while LISA contains 16 types of US road signs. Following Eykholt et al. [2018], we use simple CNNs with ReLU activations as the classification models for each road sign dataset. For patch attacks, we utilize the Sticker-Attack defined by Eykholt et al. [2018] and the Patch-Attack introduced by Brown et al. [2017].
Digital Redesigned Road Signs: to simulate the deployment of redesigned road signs, we applied spatial and color transformations to mimic real-world conditions, such as varying lighting and camera capture capabilities. For color changes, we sampled 22 contrast colors, printed them as patches, and photographed them under different lighting, distances, and times of day. For spatial transformations, we used bounding box annotations to learn homography matrices, simulating spatial changes in camera capture. Given a top-down view of a redesigned road sign, we then simulate its deployment by applying the color and spatial transformations to the selected pattern, ensuring it fits onto the road sign in the scene. These digitally collected road signs will be referred to as digital-RED-LISA and digital-RED-GTSRB.
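The digital deployment step can be sketched as follows, assuming OpenCV, a per-channel colour lookup table estimated from the photographed colour patches, and corner annotations derived from the bounding boxes; the helper name and interfaces are illustrative.

```python
import cv2
import numpy as np

def deploy_pattern(scene, pattern, sign_corners, color_lut):
    """Warp a top-down redesigned sign into a scene photo.

    scene        : (H, W, 3) uint8 photo containing the original road sign
    pattern      : (h, w, 3) uint8 top-down redesigned sign
    sign_corners : (4, 2) float32 corners of the sign in the scene (from the annotation)
    color_lut    : (3, 256) uint8 per-channel map from printed to captured colours
    """
    h, w = pattern.shape[:2]
    # Colour transformation: apply the measured print-to-camera colour curves.
    pattern_cam = np.stack([color_lut[c][pattern[..., c]] for c in range(3)],
                           axis=-1).astype(np.uint8)
    # Spatial transformation: homography from the top-down view to the scene.
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    H = cv2.getPerspectiveTransform(src, sign_corners.astype(np.float32))
    warped = cv2.warpPerspective(pattern_cam, H, (scene.shape[1], scene.shape[0]))
    mask = cv2.warpPerspective(np.full((h, w), 255, np.uint8), H,
                               (scene.shape[1], scene.shape[0]))
    out = scene.copy()
    out[mask > 0] = warped[mask > 0]                  # paste the warped sign into the photo
    return out
```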
Physical Redesigned Road Signs: we printed the redesigned road signs and captured photos at various times of day, from different distances between the sign and the camera, and under varying lighting conditions. Further details on the physical experiment setup are deferred to the corresponding section. We will refer to these physically collected datasets as physical-RED-LISA and physical-RED-GTSRB.
Robustness
We are interested in defenses where the defender does not have access to the specifics of the attack, and the only prior knowledge is that the attack takes the form of a patch. In such settings, the standard approach is to sanitize the image before feeding it into the model (Levine and Feizi [2020]). In this short version of the paper, we use the (De)Randomized smoothing defense to show the robustness of RED-designed signs. In Table 1, we show the performance under different attacks. (De)Randomized smoothing improves the model’s performance on adversarial examples compared to the vulnerability seen when the entire image is used as input. However, even with a patch attack covering only 10% of the image, the accuracy of the road sign classification model drops from 82% to 75%. Moreover, (De)Randomized smoothing exhibits decreased accuracy on clean data (compared to using the full image). We observe that our method obtains significantly greater accuracy on both clean and attacked data compared to (De)Randomized smoothing.
Table 1: Classification accuracy on clean images and under sticker and patch attacks for LISA (top) and GTSRB (bottom).

| Method | Clean | Sticker Attack | Patch Attack (10%) | Patch Attack (30%) |
|---|---|---|---|---|
| LISA | | | | |
| No Defense | 99% | 10% | 5% | 10% |
| (De)Randomized | 82% | 88% | 75% | 63% |
| RED (ours) | 99% | 99% | 99% | 95% |
| GTSRB | | | | |
| No Defense | 99% | 20% | 15% | 2% |
| (De)Randomized | 84% | 75% | 91% | 87% |
| RED (ours) | 99% | 99% | 98% | 99% |
A key question we address is how to ensure that a small patch contains enough information to accurately infer the class of an image. We propose using a colorful grid as the background for road signs and conduct an ablation analysis on grid size. In Table 2, we show classification accuracy under different grid sizes for the road sign background; S3, S5, and S10 represent 3x3, 5x5, and 10x10 grid sizes (see Figure 5). Note that the 1x1 grid is equivalent to the current road sign design. These results indicate that current road sign designs do not always allow small patches to reliably classify the signs, but as we increase the grid size, even small defense mask sizes result in high accuracy (e.g., 90% accuracy with a mask size of only 13%).

This approach relies on ensuring that each sampled sign patch contains sufficient class-specific information to support independent inference of the road sign’s class.
Table 2: Classification accuracy for different defense mask sizes and background grid sizes (S3 = 3x3, S5 = 5x5, S10 = 10x10 grid).

| Defense Mask Size | GTSRB | GTSRB-S3 | GTSRB-S5 | GTSRB-S10 | LISA | LISA-S3 | LISA-S5 | LISA-S10 |
|---|---|---|---|---|---|---|---|---|
| 13% | 48% | 61% | 55% | 50% | 50% | 96% | 94% | 90% |
| 20% | 71% | 98% | 99% | 88% | 65% | 99% | 99% | 92% |
| 26% | 84% | 99% | 99% | 99% | 78% | 99% | 99% | 99% |
| 40% | 95% | 99% | 99% | 99% | 91% | 99% | 99% | 99% |
Evaluation Across Different Attacks
We evaluate our redesigned road signs against various attacks, including sticker attacks and PGD-based ($\ell_\infty$) patch attacks with different shapes: rectangles, triangles, and multi-patches. The multi-patch attack is specifically designed to target ablation-based defenses, where the attacker uses multiple small patches to bypass the defense. As shown in Table 3, RED demonstrates strong performance against each variant of patch attack.
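For concreteness, one possible construction of the multi-patch mask is sketched below; the number of patches and their square shape are our illustrative choices. The resulting mask can be plugged into the `patch_attack` sketch from Section 2 in place of a single rectangle.

```python
import torch

def multi_patch_mask(height, width, budget, num_patches=4, seed=0):
    """Return a (1, 1, H, W) binary mask made of `num_patches` square patches
    whose combined area stays within `budget` pixels."""
    gen = torch.Generator().manual_seed(seed)
    side = int((budget / num_patches) ** 0.5)               # side length of each square
    mask = torch.zeros(1, 1, height, width)
    for _ in range(num_patches):
        r = torch.randint(0, height - side + 1, (1,), generator=gen).item()
        c = torch.randint(0, width - side + 1, (1,), generator=gen).item()
        mask[..., r:r + side, c:c + side] = 1.0             # patches may overlap
    return mask
```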
Physical Experiment
We physically recreated representative road signs for evaluation: two speed limit signs, a stop sign, and an arrow from LISA, and a stop sign, two speed limit signs, and a truck warning sign from GTSRB.
We show some examples in Figure 3. After finding the redesigned road signs for LISA and GTSRB via simulation, we printed them on 16x18-inch paper boards using a Sony Picture Station printer. The signs were then photographed in various real-world settings using a Nikon D7000, either handheld or mounted on a wooden stick (see Figure 3). About 50 images of each sign were captured under diverse conditions, including different locations, weather, and times of day.

Table 3 shows that our design is significantly more robust than current road signs when deployed in the physical world. Moreover, even with a simplified sanitizing algorithm Levine and Feizi [2020], which uses only a small portion of the image for inference, the robust design—where each small area contains discriminative class information—achieves strong performance. The pattern produced by RED is resilient to both adversarial attacks and natural noise in the physical world.
Table 3: Accuracy of original signs and RED-redesigned signs (digital and physical) under patch attacks of different shapes and sizes.

| Datasets | LISA | RED-LISA-Digital | RED-LISA-Physical | GTSRB | RED-GTSRB-Digital | RED-GTSRB-Physical |
|---|---|---|---|---|---|---|
| Patch Size: 10% | | | | | | |
| Rectangle | 75% | 99% | 99% | 91% | 99% | 98% |
| Triangle | 75% | 99% | 99% | 91% | 98% | 99% |
| Multi-Patches | 63% | 99% | 98% | 83% | 99% | 97% |
| Patch Size: 30% | | | | | | |
| Rectangle | 63% | 94% | 95% | 87% | 99% | 99% |
| Triangle | 63% | 94% | 95% | 87% | 94% | 93% |
| Multi-Patches | 59% | 93% | 95% | 85% | 93% | 94% |
| Sticker | 88% | 99% | 99% | 75% | 99% | 99% |
5 Social Impact Statement
Our work on Robust Environmental Design (RED) enhances the robustness of visual recognition systems, particularly for self-driving cars, by improving the resilience of road signs against adversarial attacks. This contributes to the safety and reliability of autonomous navigation technologies as they integrate into society. The societal impact is significant, as our research reduces the risk of adversarial misclassifications, which could lead to traffic accidents or system failures. This directly improves public safety by strengthening the reliability of critical systems in urban and rural environments.
Our work addresses the challenge of ensuring AI systems are both performant and resistant to manipulation. By developing defenses against adversarial attacks, we contribute to more secure and fair AI technology deployment. With this in mind, we note that although RED significantly improves robustness, it requires the ability to edit objects (e.g., selecting the pattern applied to road signs at manufacture time). This may not be feasible for all objects (e.g., pedestrians, wild animals, plants, etc.).
References
- Goodfellow et al. [2014] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2014.
- Madry et al. [2017] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
- Kurakin et al. [2016] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. In Workshop at the International Conference on Learning Representations, 2016.
- Salman et al. [2021] Hadi Salman, Andrew Ilyas, Logan Engstrom, Sai Vemprala, Aleksander Madry, and Ashish Kapoor. Unadversarial examples: Designing objects for robust vision. Advances in Neural Information Processing Systems, 34:15270–15284, 2021.
- Eykholt et al. [2018] Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
- Yang et al. [2020] Chenglin Yang, Adam Kortylewski, Cihang Xie, Yinzhi Cao, and Alan Yuille. Patchattack: A black-box texture-based attack with reinforcement learning. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXVI, pages 681–698. Springer, 2020.
- Athalye et al. [2018] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning (ICML), pages 274–283. PMLR, 2018.
- Brown et al. [2017] Tom B Brown, Dandelion Mané, Aurko Roy, Martín Abadi, and Justin Gilmer. Adversarial patch. arXiv preprint arXiv:1712.09665, 2017.
- Liu et al. [2018] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770, 2018.
- Karmon et al. [2018] Daniel Karmon, Daniel Zoran, and Yoav Goldberg. Lavan: Localized and visible adversarial noise. arXiv preprint arXiv:1801.02608, 2018.
- Zhang et al. [2019] Hang Zhang, Ingrid Daubechies, Tom Goldstein, and Christoph Studer. Robust patch attacks. arXiv preprint arXiv:1904.13053, 2019.
- Shafahi et al. [2019] Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Larry Davis, Gavin Taylor, and Tom Goldstein. Adversarial training for free! Advances in Neural Information Processing Systems (NeurIPS), 32, 2019.
- Cohen et al. [2019] Jeremy M Cohen, Elan Rosenfeld, and J Zico Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning (ICML), pages 1310–1320, 2019.
- Lecuyer et al. [2019] Mathias Lecuyer, Vasileios Atlidakis, Roxana Geambasu, Daniel Hsu, and Suman Jana. Certified robustness to adversarial examples with differential privacy. IEEE Symposium on Security and Privacy (SP), pages 656–672, 2019.
- Salman et al. [2019] Hadi Salman, Jerry Li, Ilya Razenshteyn, Pengchuan Zhang, Huan Zhang, Sebastien Bubeck, and Greg Yang. Provably robust deep learning via adversarially trained smoothed classifiers. In Advances in Neural Information Processing Systems (NeurIPS), volume 32, 2019.
- Xiang et al. [2020] Chong Xiang, Arjun Nitin Bhagoji, Vikash Sehwag, and Prateek Mittal. PatchGuard: A provably robust defense against adversarial patches via small receptive fields and masking. arXiv preprint arXiv:2005.10884, 2020.
- Xiang et al. [2022] Chong Xiang, Saeed Mahloujifar, and Prateek Mittal. PatchCleanser: Certifiably robust defense against adversarial patches for any image classifier. In 31st USENIX Security Symposium (USENIX Security 22), 2022.
- Xu et al. [2023] Ke Xu, Yao Xiao, Zhaoheng Zheng, Kaijie Cai, and Ram Nevatia. Patchzero: Defending against adversarial patch attacks by detecting and zeroing the patch. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 4632–4641, 2023.
- Xiang et al. [2021] Chunyang Xiang, Austin R. Benson, Aleksander Mądry, Elan Rosenfeld, and Zico Kolter. Patchcleanser: Certifiably robust defense against adversarial patches for any image classifier. In Proceedings of the 38th International Conference on Machine Learning (ICML), pages 11260–11270, 2021. URL https://proceedings.mlr.press/v139/xiang21a.html.
- Levine and Feizi [2020] Alexander Levine and Soheil Feizi. (de) randomized smoothing for certifiable defense against patch attacks. Advances in Neural Information Processing Systems, 33:6465–6475, 2020.
Appendix
Appendix A Methodology
Enhance Class Information within a Road Sign
In practice, we observe that smaller retained patch regions are more effective (see Section 4 for a more thorough study of region size). Our findings across both the LISA and GTSRB datasets reveal that current sign designs typically require a relatively large visible area for effective inference. To address this issue, we propose redesigning road signs to enhance the informational content within small local areas, i.e., small patches.
Without loss of generality, we consider an ablation function $g$, which obscures most of the image while retaining only a small patch. Consequently, an ablated sample $g(X)$ will contain just this small patch of the original image $X$. This approach serves as a showcase for the robust road sign design.
We employ Algorithm 1 to optimize the design of robust road sign backgrounds. These backgrounds are engineered to enhance the class information within localized small areas. Consequently, as illustrated in Figure 2, every local area of the newly designed road signs contains essential class information. This redesign strategy aims to ensure that even minimal patches can independently verify the sign’s class, i.e., $f(g(X)) = y$.
When selecting the set of ablation functions $\{g_1, \dots, g_m\}$, both the region and the ablation size are consequential. Other works that use ablation functions (e.g., Xu et al. [2023]) suggest using a random size and location; in addition to a single randomized ablation, we propose a majority vote-based algorithm for inference. We show empirical results for both methods in the next section.
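A randomized ablation of this kind might look as follows; the retained-area range and the square shape are illustrative assumptions. For the majority-vote variant, several such views (or the fixed band ablations sketched in Section 3) are classified and the predictions aggregated as in `predict_majority`.

```python
import torch

def random_ablation(x, min_keep=0.10, max_keep=0.25, gen=None):
    """Keep one randomly sized, randomly placed square region of x and zero
    out everything else (a single randomized ablation per call)."""
    gen = gen or torch.Generator().manual_seed(0)
    H, W = x.shape[-2:]
    frac = min_keep + (max_keep - min_keep) * torch.rand(1, generator=gen).item()
    side = max(1, int((frac * H * W) ** 0.5))               # side of the retained square
    r = torch.randint(0, max(1, H - side + 1), (1,), generator=gen).item()
    c = torch.randint(0, max(1, W - side + 1), (1,), generator=gen).item()
    xa = torch.zeros_like(x)
    xa[..., r:r + side, c:c + side] = x[..., r:r + side, c:c + side]
    return xa
```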
Training
Each class has a pattern. For an image $X$ with label $y$, the corresponding pattern is denoted $\theta_y$. This pattern is combined with the road sign mask, which includes the sign’s text and shape, using precomputed color and homography mappings. The resulting image is processed by the classifier, the loss is calculated using Equation 2, and the gradients are backpropagated to update the parameters.

Next, we demonstrate how an ablation algorithm is combined with our method and discuss a special case in more detail.
A.1 Special Case: Attacker-Aware Robust Environmental Design (AA-RED)
Next, we look at how RED can be improved when the defender has knowledge of the attacks and designs robust signs specifically for a given set of attacks $\mathcal{A}$. Let $\theta_y$ be the robust pattern for class $y$ (patterns are label-specific, and each class has its own), let $f$ be the classification model, and let $\mathcal{L}$ be the cross-entropy loss:

$$\min_{\theta,\, f} \; \frac{1}{n} \sum_{i=1}^{n} \max_{a \in \mathcal{A}} \; \mathcal{L}\big(f\big(a(X_i(\theta_{y_i}))\big),\, y_i\big).$$
That is, when the attacker is known, the defender can simulate the attacker’s best response to the defender’s current choice of pattern $\theta$ and classifier $f$. This is effectively a combination of adversarial training and RED. The full procedure for AA-RED is outlined in Algorithm 2.
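A possible sketch of one AA-RED update follows, reusing the hypothetical `render_with_pattern` and `patch_attack` helpers from the earlier sketches; each known attack is assumed to return only the optimized patch contents so that the pattern parameters remain in the computation graph.

```python
import torch
import torch.nn.functional as F

def aa_red_step(model, theta, opt, x, sign_mask, y, attacks, mask):
    """One AA-RED update: render the current patterns, simulate each known
    attack's best response, then descend on the worst-case loss, updating
    both the classifier parameters and the background patterns theta."""
    x_red = torch.cat([render_with_pattern(x[i:i+1], sign_mask[i:i+1], theta[y[i]])
                       for i in range(x.shape[0])])
    worst = None
    for attack in attacks:                                # attacker best responses
        patch = attack(model, x_red.detach(), y, mask)    # e.g., the patch_attack sketch
        x_adv = (1 - mask) * x_red + patch                # keep theta in the graph
        loss = F.cross_entropy(model(x_adv), y)
        worst = loss if worst is None else torch.maximum(worst, loss)
    opt.zero_grad()
    worst.backward()
    opt.step()
    with torch.no_grad():
        theta.clamp_(0.0, 1.0)                            # keep the colours printable
    return worst.item()
```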
| Method | Clean | Sticker Attack | Patch Attack (5%) | Patch Attack (10%) | Patch Attack (25%) | Patch Attack (30%) |
|---|---|---|---|---|---|---|
| LISA | | | | | | |
| No Defense | 99% | 10% | 15% | 5% | 10% | 10% |
| (De)Randomized | 82% | 88% | 81% | 75% | 67% | 63% |
| RED | 99% | 99% | 99% | 99% | 96% | 95% |
| GTSRB | | | | | | |
| No Defense | 99% | 20% | 22% | 15% | 8% | 2% |
| (De)Randomized | 96% | 75% | 92% | 91% | 90% | 87% |
| RED | 99% | 99% | 99% | 98% | 98% | 99% |
Appendix B Experiments
In this section, we present results for RED-designed LISA and GTSRB signs against a wider range of attacks, which could not be included in the main body due to space constraints.
| Datasets | LISA | RED-LISA-Digital | RED-LISA-Physical | GTSRB | RED-GTSRB-Digital | RED-GTSRB-Physical |
|---|---|---|---|---|---|---|
| Patch Size: 5% | | | | | | |
| Rectangle | 81% | 99% | 99% | 93% | 99% | 99% |
| Triangle | 80% | 99% | 99% | 92% | 99% | 98% |
| Multi-Patches | 82% | 99% | 98% | 88% | 99% | 99% |
| Patch Size: 10% | | | | | | |
| Rectangle | 75% | 99% | 99% | 91% | 99% | 98% |
| Triangle | 75% | 99% | 99% | 91% | 98% | 99% |
| Multi-Patches | 63% | 99% | 98% | 83% | 99% | 97% |
| Patch Size: 25% | | | | | | |
| Rectangle | 67% | 97% | 96% | 90% | 99% | 98% |
| Triangle | 63% | 94% | 95% | 87% | 98% | 99% |
| Multi-Patches | 59% | 95% | 96% | 71% | 98% | 98% |
| Patch Size: 30% | | | | | | |
| Rectangle | 63% | 94% | 95% | 87% | 99% | 99% |
| Triangle | 63% | 94% | 95% | 87% | 94% | 93% |
| Multi-Patches | 59% | 93% | 95% | 85% | 93% | 94% |
| Sticker | 88% | 99% | 99% | 75% | 99% | 99% |
