
Examining Interplay of Compression and Encryption and Applicability to 5G Teleoperations

Duncan Joly (ORCID 0000-0003-3058-0758), University of Minnesota, Minneapolis, MN; Jason Carpenter, University of Minnesota, Minneapolis, MN; and Zhi-Li Zhang, University of Minnesota, Minneapolis, MN
Abstract.

Modern IoT and networked systems rely on fast and secure delivery of time-critical information. Use cases such as teleoperations require fast data delivery over mobile networks, which despite improvements in 5G are still quite constrained. Algorithms for encryption and compression provide security and data size efficiency, but come with time and data size trade-offs. The impact of these trade-offs depends on the order in which these operations are applied, and as such necessitates a robust exploration from a performance perspective. In this paper, we assess several compression and encryption algorithms, combinations of their execution order, the timings and size changes that result from each order, and the implications of such changes for 5G teleoperations. From our assessments we have three major takeaways: (1) Compression-First is generally faster and yields smaller output, except in certain circumstances. (2) In these specific circumstances, compressing a raw file takes longer than compressing the same file after it has been encrypted. (3) Applying both encryption and compression to data samples larger than 10MB is impractical for real-time transmission due to the incurred delay.

compression, encryption, encrypt-first, compress-first, encryption-first, compression-first, teleoperations

1. Introduction

Large scale networking and device connectivity support the modern world. From phones to pacemakers, devices must communicate critical information over networks quickly, safely, and privately. With the increasing integration of digital components into the economy comes a rise in the amount of cybercrime committed. Malicious actors aim to take control of devices, eavesdrop on data, and in some cases, cause physical harm. With many new devices coming online(Alam, 2018), this concern is quite substantial. Further, these devices often use mobile networks, such as is the case with many IoT devices and teleoperation suites. These networks often have limited resources that must be allocated between the connected devices, leading to each device receiving restrained service quality.

Protecting the data of these devices while also staying within the restrictions of constrained networks involves the use of compression and encryption algorithms. Generally speaking, encryption algorithms secure data with the possibility of increasing its size, and compression algorithms reduce the size of data. These algorithms are often paired together to achieve some security and compression capacity. These algorithms take time to operate, and thus may complicate real-time use cases such as teleoperations. Further, the timing and size impacts of these algorithms change based on the order in which they are applied to a given piece of digital information. The current understanding is that unless there are specific attacks to guard against such as CRIME(att, [n. d.]), one should compress first (CF) rather than encrypt first (EF) to achieve the best compression without compromising security(IBM, 2023; Kumar and Gandhi, 2012; Singh, [n. d.]; Rossevelt, 2023; Fleischhacker et al., 2022). To further resolve this, newer algorithms aim to combine compression and encryption to achieve more harmonious results(Carpentieri, 2018; Kumar and Gandhi, 2012; Chen et al., 2021). While there is substantial support for CF, there is not a robust exploration of the specific performance impacts of the relative order.

1.1. Contributions

In this paper, we assess the performance impacts of the order of application for various common compression and encryption algorithms with respect to time and data size changes. We then apply these findings to a time- and size-sensitive use case: 5G teleoperations. Broadly our findings can be summarized as follows:

• We confirm that Compression-First (CF) is overall more performant than Encryption-First (EF), with the distribution of tests showing a broadly faster operation time and a compression ratio improvement generally between 25-50%.

• However, we find that for file sizes larger than 10MB and certain algorithms (bzip and Fernet), the faster-performing combination is actually Encryption-First, with a 40-50% time improvement.

• Encryption-First can also outperform Compression-First regardless of file size under certain algorithm combinations involving bzip, Fernet, NaCl, and gzip.

• When considering a teleoperations use case, we find that 73% of the operation pairs applied to file sizes smaller than 1MB had a total operation time of less than 100ms and were thus suitable for real-time operation. Further, we find that for file sizes larger than 10MB the operation time always exceeded the 100ms delay, making these cases unsuitable for real-time operation.

The code and data from this project are available at:

https://github.com/duncanjoly13/encoding-compression-investigation

2. Background And Related Work

In this section we cover background concepts and related work needed to understand the interactions between compression and encryption. The essential metrics this paper covers are data size (expressed as file sizes), compression ratio (relative size of data after operations compared to original), and operation time (time taken to compress, encrypt, decrypt, and decompress). In addition, a core security metric, entropy, is important for measuring the “disorder or randomness in a closed system”(NIST, [n. d.]). In other words, entropy indicates how well encrypted the data is and/or how compressible it is(Veytsman, 2016). Broadly, compression exploits patterns in data, and encryption attempts to break patterns(Veytsman, 2016). We note entropy as an important aspect of compression and encryption, but we save an exploration of entropy and operation order for future work.
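The link between entropy and compressibility can be illustrated with a minimal sketch. It assumes a first-order (per-byte) Shannon entropy estimate, which is only one of several possible entropy measures and is not taken from the paper's code. The sample data and the use of `os.urandom` as a stand-in for well-encrypted output are likewise illustrative assumptions.

```python
import bz2
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the empirical Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical inputs: a repetitive CSV-like sample and random bytes of equal length.
patterned = b"lat,lon,speed\n" * 4096
random_like = os.urandom(len(patterned))  # stand-in for well-encrypted output

for label, blob in (("patterned", patterned), ("random-like", random_like)):
    ratio = len(bz2.compress(blob)) / len(blob)
    print(f"{label}: {shannon_entropy(blob):.2f} bits/byte, bz2 ratio {ratio:.3f}")
```

Under these assumptions the patterned sample should report low entropy and compress well, while the random-like sample should sit close to 8 bits per byte and not shrink at all.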

Broadly, there are two orders to apply compression and encryption: Compression-First (CF) and Encryption-First (EF). CF is the most common approach and is understood to be high performing while also retaining high entropy(Carpentieri, 2018; enc, [n. d.]; Singh, [n. d.]; Rossevelt, 2023). EF is not common, but may be employed to avoid certain attacks that take advantage of the size of encrypted messages such as CRIME(att, [n. d.]; Veytsman, 2016; enc, [n. d.]). Generally speaking, encrypted data that is easily compressed is not well encrypted(IBM, 2023), and alongside the other issues it may present, is also vulnerable to differential cryptanalysis(Chen et al., 2021). Due to the pressure of requiring security while maintaining reasonable data sizes, newer algorithms have emerged to integrate the two operations into a cohesive algorithm(Carpentieri, 2018; Kumar and Gandhi, 2012; Chen et al., 2021). Our work focuses on a set of common individual algorithms and saves the cutting edge algorithms for future work.

3. Problem Formulation

Figure 1. Compression and Encryption Pipeline.

To formally evaluate the timing and file size impacts of a given file’s compression and encryption life cycle, we outline a process timeline for operations (encryption or compression) as illustrated in Figure 1. From this process, we highlight three high-level metrics: (1) The times of Operations 1 and 2 and of their inverses, which measure each discrete compress/decompress and encrypt/decrypt step. (2) Total time, which is the sum of all operations and inverses for a particular algorithm grouping. (3) Intermediate and final file sizes, which provide insight into the operation-specific impacts and the overall impact on the targeted data file.
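The three metrics can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' harness. The function `run_pipeline`, the key, and the sample data are hypothetical, the disk write step is omitted, and `toy_cipher` is an insecure stand-in used only so the example runs on the standard library (it is not one of the evaluated algorithms).

```python
import bz2
import hashlib
import time


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Stand-in cipher (NOT secure): XOR with a SHA-256 counter keystream.

    XOR is its own inverse, so the same call encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def run_pipeline(data, op1, op1_inv, op2, op2_inv):
    """Apply Operation 1, Operation 2, then both inverses; record times and sizes."""
    times_ms = {}

    def timed(name, fn, blob):
        start = time.perf_counter()
        out = fn(blob)
        times_ms[name] = (time.perf_counter() - start) * 1000.0
        return out

    intermediate = timed("op1", op1, data)
    final = timed("op2", op2, intermediate)
    undone = timed("op2_inv", op2_inv, final)
    restored = timed("op1_inv", op1_inv, undone)
    assert restored == data  # the round trip must reproduce the input
    return {
        "times_ms": times_ms,                   # metric (1)
        "total_ms": sum(times_ms.values()),     # metric (2)
        "intermediate_size": len(intermediate), # metric (3)
        "final_size": len(final),
    }


key = b"hypothetical-key"
sample = b"timestamp,lat,lon\n" * 2000  # hypothetical CSV-like input


def encrypt(blob):
    return toy_cipher(blob, key)


cf = run_pipeline(sample, bz2.compress, bz2.decompress, encrypt, encrypt)
ef = run_pipeline(sample, encrypt, encrypt, bz2.compress, bz2.decompress)
print("CF final size:", cf["final_size"], "EF final size:", ef["final_size"])
```

Swapping the argument order is all that distinguishes CF from EF here, so the same function can report metrics for either ordering.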

Algorithm Name | Type
bzip2 (bz2, bzip)(bzi, [n. d.]) | compress
gzip(gzi, [n. d.]) | compress
lzma(lzm, [n. d.]) | compress
zipfile (zip)(zip, [n. d.]) | compress
AES(AES, [n. d.]) | encrypt
Fernet(Fer, [n. d.]) | encrypt
NaCl (XSalsa20 (NaC, [n. d.]; XSa, [n. d.]; Bernstein, 2012) with 192-bit nonce) | encrypt
Table 1. Evaluated Algorithms.

File Size (B) | File Type | Data Origin
85            | CSV       | Vehicle GPS Sample
174           | CSV       | Novatel GPS Unit
362           | CSV       | Novatel IMU
451           | CSV       | Novatel enhanced GPS
520           | bytes     | Ouster LiDAR Telemetry Packet
564           | CSV       | Novatel Odometry
1206          | bytes     | Ouster LiDAR Data Packet
5052          | CSV       | MobileEye Lane Marker Sample
1086844       | text      | 1MB of enwik8
10239975      | text      | 10MB of enwik8
11081517      | PDF       | testing PDF
101128023     | text      | 95MB of enwik8
Table 2. Evaluated files: A selection of files across several domains, data formats, and compositions.

4. Methodology and Testing

We examine the effect that changing the order of compression and encryption has on the file size and operation time. Using the process outlined in Fig. 1, we measure the time elapsed at each step of a file being encrypted and compressed, written to disk, and subsequently decrypted and decompressed. We vary the order of compression and encryption. We select 4 common compression algorithms and 3 encryption algorithms (outlined in Table 1) along with 12 different file sizes (Table 2). The files are taken from a selection of vehicle sensors such as LiDAR and GPS unit samples, public large and small text samples such as enwik8(enw, [n. d.]), and a free testing PDF. Tests on files larger than or equal to 10MB are repeated 32 times, whereas tests on files smaller than 10MB are repeated 100 times. These repetitions, totaling 39840 tests, alleviate natural variation in operation times and provide a clean average for each combination.
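The repetition rule and averaging step might look like the following sketch. The threshold constant is an assumption, since the text does not say whether 10MB means 10^7 bytes or 10 MiB; 10^7 is used here so that the 10239975-byte file from Table 2 counts as large. The helper names and the sample data are hypothetical.

```python
import gzip
import statistics
import time

# Assumed meaning of "10MB"; the paper does not define the exact byte count.
LARGE_FILE_BYTES = 10_000_000


def repetitions_for(size_bytes: int) -> int:
    """Repeat counts from the text: 32 runs at or above the threshold, 100 below."""
    return 32 if size_bytes >= LARGE_FILE_BYTES else 100


def mean_time_ms(fn, data: bytes, repeats: int) -> float:
    """Average wall-clock time of fn(data) over `repeats` runs, in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(data)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.fmean(samples)


sample = b"x,y,z\n" * 200  # hypothetical small input
runs = repetitions_for(len(sample))
print(runs, "runs, mean gzip time (ms):", mean_time_ms(gzip.compress, sample, runs))
```

Because a mean is sensitive to outliers, a harness like this could also report the median alongside it.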

Figure 2. Operation timings: note the overall expected timings for CF and EF.
Figure 3. Operation timings by algorithm. We note the cases where EF is faster overall. We also note that the left side (EF) has competitive performance with the right side group (CF).

5. Results Examination

In this section we review the results of our battery of compression and encryption operations against our test files. We first consider the timing impacts of our operations and then consider the impact order has on file size.

5.1. Timing Impacts

Figure 4. Timing breakdowns of the bzip/Fernet and gzip/NaCl pairs, showing a case of Encryption-First operating faster (approx. 50% reduction for Fernet-then-bzip and a smaller approx. 5% change in NaCl-then-gzip).

Overall, CF is on average only slightly faster than EF, but the overall distribution of CF algorithm pairs performs better than the distribution of EF algorithm pairs. Interestingly, this trend is complicated when considering file sizes larger than 10MB, where the average CF operation time is roughly 2000ms longer. On the whole, these results reinforce the broad consensus that CF is better in terms of performance.

Digging deeper, we examine specific algorithm pairs split by file size in Fig. 3, with the left side algorithms as CF and the right side ones as EF. First, we see that for file sizes smaller than 10MB the overall distribution is tight around 5-6ms, with the averages dominated by outliers (excluded from the figures for clarity). For file sizes larger than or equal to 10MB, we see the distributions are dominated by a handful (2-3) of slower-performing algorithm pairs. If these algorithm pairs were omitted, the distributions would roughly center around 4000-5000ms.

Further, we see that the specific pairs bzip/Fernet and gzip/NaCl actually perform faster when encryption is applied first as opposed to Compression-First. In Fig. 4, we see that CF bzip-then-Fernet incurs an almost 45-50% increase in compression time compared to EF. This result may imply that encryption with Fernet decreases the entropy of the file, thus making the compression easier; however, this is pure speculation, as the result is otherwise unintuitive. For gzip-then-NaCl we see an increase of roughly 30-40ms for files of size 1MB (about double the time), and for a 10MB file this doubling becomes an increase of around 300ms. This, along with a few other algorithm pairs, leads us to conclude that in relatively rare cases the operation order with the fastest time can be EF.

5.2. File Size Impacts

Figure 5. Compression ratios across several file sizes. We find that the smaller files have an overall worse compression ratio for both EF and CF approaches.

When considering the impact on data sizes, our results in Fig. 5 also confirm CF’s dominance. CF is able to achieve final sizes that are below the original file size (indicated by the red line), with a significant reduction of at least 25-50% when compared to EF. There is no context in which EF produced a final data size below the original size. This is intuitive, as encryption increases the size of data in most cases, and if it is done well, compression may very well struggle, as indicated in the related work section.
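The size asymmetry between the two orders can be reproduced with a small sketch. It assumes a one-time-pad XOR as a stand-in for encryption, which is not one of the evaluated algorithms and adds no size overhead of its own, so real schemes would likely make the EF ratio somewhat worse than shown here. The function names and the sample data are hypothetical.

```python
import bz2
import secrets


def compression_ratio(final_size: int, original_size: int) -> float:
    """Final size relative to the original; below 1.0 means the data shrank."""
    return final_size / original_size


def xor_pad(data: bytes, pad: bytes) -> bytes:
    """Stand-in encryption (one-time-pad XOR); output length equals input length."""
    return bytes(a ^ b for a, b in zip(data, pad))


original = b"frame_id,lane_left,lane_right\n" * 3000  # hypothetical CSV-like sample
pad = secrets.token_bytes(len(original))              # random pad, as long as the input

cf_final = xor_pad(bz2.compress(original), pad)  # compress, then encrypt
ef_final = bz2.compress(xor_pad(original, pad))  # encrypt, then compress

print("CF ratio:", round(compression_ratio(len(cf_final), len(original)), 3))
print("EF ratio:", round(compression_ratio(len(ef_final), len(original)), 3))
```

In this setup the CF ratio falls well below 1.0, while the EF ratio lands slightly above 1.0 because the compressor finds no patterns in the padded bytes and still adds its own framing.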

6. Impact on 5G Teleoperations

From our assessments of algorithm time and size performance, we now apply these observations to a practical use case: teleoperations over 5G.

For real-time transmission, the sum of delay from file read to file reception must be less than 100ms (0.1 seconds)(Nielsen, 1993). For an example with a 10MB PDF file, we add a network latency variable between Operation 2 and Operation 2 Inverse (visualized in Figure 4) to approximate network transmission time. When considering the fastest algorithm pair for file sizes <1MB, we observed that files are encrypted, compressed, transmitted with a theoretical and static network latency of 50ms, decrypted, and decompressed in less than 100ms 100% of the time. For files of size 1MB, real-time transmission is achieved in 73% of samples with the fastest algorithm pair. Lastly, for files ≥10MB, the fastest algorithm pair achieves real-time transmission 0% of the time.
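The real-time check described above reduces to simple arithmetic, sketched below. The 100ms budget and the 50ms static latency come from the text. The function names and the per-operation timings are hypothetical placeholders, not measured results.

```python
REAL_TIME_BUDGET_MS = 100.0  # real-time threshold cited in the text
NETWORK_LATENCY_MS = 50.0    # static network latency assumed in the text


def end_to_end_ms(op_times_ms, network_ms=NETWORK_LATENCY_MS):
    """Sum of the four operation times plus the network latency between them."""
    return sum(op_times_ms) + network_ms


def is_real_time(op_times_ms, network_ms=NETWORK_LATENCY_MS,
                 budget_ms=REAL_TIME_BUDGET_MS):
    """True when the end-to-end delay stays strictly below the budget."""
    return end_to_end_ms(op_times_ms, network_ms) < budget_ms


# Hypothetical timings in ms: op1, op2, op2 inverse, op1 inverse.
small_file = [1.5, 0.4, 0.3, 1.2]
large_file = [900.0, 40.0, 35.0, 400.0]

print("small file real-time:", is_real_time(small_file))
print("large file real-time:", is_real_time(large_file))
```

With a 50ms latency, the four operations together have at most 50ms to spare, which is why even moderately slow algorithm pairs miss the budget on larger files.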

Broadly, this means that for larger size data transmissions, the algorithms may not be quite performant enough to apply encryption and compression together, requiring increases in capacity or intelligent decision systems.

7. Limitations and Future Work

This work can be considered a light but robust exploration of the interplay of compression and encryption. Further work can expand on the existing dimensions, including more algorithms and data types. Additionally, the impact of this interplay on entropy and other security considerations could be examined in detail.

8. Conclusions

In this paper, we conducted an assessment of the interplay of compression and encryption and its implications on 5G teleoperations. From these examinations, we confirmed that CF is generally the better performing operation order for compression and encryption, but that in certain algorithmic pairs the inverse may be true. Additionally, the impact of using both encryption and compression for files that are sufficiently large will make them unsuitable for real-time transmission, dampening the possibilities for 5G-teleoperations using secure and efficient transmission.

References