
Reimagining Application User Interface (UI) Design using Deep Learning Methods: Challenges and Opportunities

Subtain Malik, Muhammad Tariq Saeed, Marya Jabeen Zia, Shahzad Rasool, Liaquat Ali Khan, Mian Ilyas Ahmed
School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan; Air University, Islamabad, Pakistan
Abstract

In this paper, we present a review of recent work on deep learning methods for user interface design. The survey encompasses well-known deep learning techniques (deep neural networks, convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks) and the datasets widely used to design user interface applications. We highlight important problems and emerging research frontiers in this field. We believe that the use of deep learning for user interface design automation tasks could be one of the high-potential fields for the advancement of the software development industry.

keywords:
Deep Learning, Human-Computer Interaction, User Interface Designs, Data-Driven Models

1 Introduction

Interactive systems have transitioned from being used in specialized environments to being ingrained in everyday life. All sorts of devices, including ticketing machines, home appliances, kiosks, and wearable devices, have computing devices embedded in them. Thus, the design of interactive systems has evolved from creating 'standalone' systems to designing whole environments in which the context of system usage is an essential part of the design consideration. A number of factors contribute to the success of an interactive system, which is often evaluated in terms of usability, accessibility, learnability, and memorability, among others. The user interface (UI) design remains a critical part of the user experience (UX) design. The conventional UI design process involves understanding the aspirations of users and studying their activities in the context of the application. It requires exploration and design of technological solutions that fit the users, the activities, and the context of their usage. The process also involves the evaluation of design alternatives that are refined in an iterative manner. A range of skills from multiple disciplines is involved, and a single designer seldom possesses all the skills needed to design such a solution.

The user-centered approach to interface design allows for engineering the design alternatives most likely to succeed. Nonetheless, much of the UI design process is still guided by best practices and principles that originate from various disciplines such as psychology, human factors, and ergonomics. As a result, a number of established design guidelines have come into existence that significantly reduce the time, cost, and effort needed to arrive at a feasible technological solution and a successful UI design.

The idea of generative UI modelling can be traced back to pattern-based models for UI design and code generation approaches. In the past, several contributions have been made using various statistical and machine learning (ML) techniques for UI design. However, a major limitation of many ML algorithms is the requirement for hard-coded rules. Consider, by contrast, how a human infant learns to recognize a cat or a dog: the infant learns the classes of objects by visual inspection and processing of many instances/images. The learning process of a human neural model focuses on the overall shape and structure rather than on individual features such as a cat's eyes, mouth, ears, and fur. The human brain's core, made up of billions of interconnected neurons, works by abstracting only the data necessary to achieve a task. Rumelhart, Hinton, and Williams [1] proposed the learning procedure for artificial neural networks that constitutes the foundation of Deep Learning (DL) methods.

Multiple artificial neurons are combined to form hidden layers. These layers are connected to achieve a particular task, as shown in Figure 1. This intertwining of layers constitutes a dense, multi-tiered neural network and is central to the concept of deep learning. The approach has proved very useful in bringing artificial intelligence closer to human learning than previous ML techniques. DL has many benefits, but it also has limitations: it requires large amounts of data and substantial computational power to process that data. Nevertheless, deep learning-based models can be more accurate in numerous fields and are now deployed in self-driving cars, weather forecasting systems, and stock market trend detection. However, there is limited use of DL-based models in the software design industry. There are open problems within this industry where deep learning algorithms can be applied to automate tasks such as functional software testing, design testing, code generation, and design generation. Here, we note the fundamental distinction between supervised and self-supervised machine learning strategies. In supervised learning, labelled data is provided to the training algorithm. In self-supervised learning, on the other hand, the data is provided to the machine without any labels, and the model has to discover the hidden patterns inside the data.

2 Methods and data to review DL in UI Design synthesis

2.1 Study selection using meta-analysis

A title/abstract/keyword-based search was carried out in the Google Scholar database to find articles related to UI design automation through deep learning techniques. The search query combined the keywords “interface design” and “artificial intelligence” (search date: February 23, 2020) and returned 4,200 results. However, keeping in view the focus of our study on deep learning-based UI automation systems, we replaced “artificial intelligence” with “deep learning” in conjunction with “user interface” and “user interface design”. This refined query yielded 374 articles relevant to our study.

2.2 Conferences and Journal Articles

Tables 1 and 2 list the journals and conferences related to UI design automation and deep learning, together with the number of relevant papers from each.

Table 1: Journals identified as relevant, and number of reviewed papers.
Name of Journal #
ACM Transactions on Graphics 42
International Journal of Human-Computer Studies 31
IEEE Transactions on Pattern Analysis and Machine Intelligence 26
IEEE Transactions on Software Engineering 11
International Journal of Human–Computer Interaction 7
ARXIV 4
ACM Transactions on Software Engineering and Methodology 2
Computer Graphics Forum 3
Frontiers of Information Technology & Electronic Engineering 3
Table 2: Conferences/workshops identified as relevant, and number of relevant papers.
Title of conference/workshop #
Conference on Human Factors in Computing Systems 43
International Conference on Software Engineering 31
Symposium on User Interface Software and Technology 22
Conference on Pervasive and Ubiquitous Computing 12
Conference on Computer Vision and Pattern Recognition Workshops 10
International Conference on Automated Software Engineering 8
International Conference on Document Analysis and Recognition 8

We further reduced the database to 100 articles on the basis of three attributes, i.e., “deep learning framework”, “precision”, and “dataset used”. The resultant database of 100 articles forms the basis of our review presented in Section 5.

3 Background of Deep Learning Methods

In order to facilitate the reader, we provide a background of deep learning architectures used in UI design automation. For a more detailed overview, we refer to [2].

3.1 Building blocks of Deep Networks

The atomic unit of a deep learning model is an artificial neuron, mathematically represented as a numerical value calculated by a function. The parameters of this function are sets of numerical values or matrices, called weights and biases, denoted by W and B respectively. These parameters are learned by the neural network during training. Activations are non-linear functions that map the input into a non-linear space; they are responsible for advancing a neural network from a linear function to a universal function approximator. Sigmoid, Softmax, Tanh, and ReLU [3] are the most commonly used activation functions in neural networks. The optimization process is the backbone of any DL model, as it enables the learning of parameters. Gradient descent [4] is commonly used as the optimization algorithm: it computes the gradients of a loss function with respect to the parameters and adjusts their values iteratively so that the given cost function is minimized [5]. The cost function, also known as the loss function, quantifies the error during the training of the model. Loss functions come in many types, depending on the nature of the task; e.g., the Mean Squared Error (MSE) [6] is used for regression tasks because it measures the difference between the observed values (from the dataset) and the predicted values (from the neural network).
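To make these building blocks concrete, the sketch below implements a single artificial neuron with a sigmoid activation and performs one gradient-descent step on an MSE loss. It is a minimal illustration in Python/NumPy; the input values, target, and learning rate are arbitrary assumptions, not taken from any of the reviewed systems.

    import numpy as np

    def sigmoid(z):
        # Non-linear activation mapping the linear output to (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, -1.2, 0.3])   # one input example
    y = 1.0                          # observed (target) value
    w = np.zeros(3)                  # weights W, learned during training
    b = 0.0                          # bias B
    lr = 0.1                         # learning rate for gradient descent

    z = np.dot(w, x) + b             # linear part of the neuron
    y_hat = sigmoid(z)               # activation: the neuron's prediction
    loss = (y - y_hat) ** 2          # Mean Squared Error for a single sample

    # Gradients of the loss with respect to w and b (chain rule)
    dloss = -2.0 * (y - y_hat) * y_hat * (1.0 - y_hat)
    grad_w = dloss * x
    grad_b = dloss

    # One gradient-descent update of the parameters
    w -= lr * grad_w
    b -= lr * grad_b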

We will now explore some popular architectures of neural networks commonly used in UI design automation.

3.2 Deep Neural Networks

Deep neural network architectures, inspired by the design of the human nervous system, consist of many layers responsible for information retrieval, concise representation, and semantic evaluation. The UI design data provided as input to these dense layers contains different properties of UI design elements, code instructions, and images (UI design screenshots). In each layer, a set of linear operations on the inputs X and the parameters θ, wrapped by an activation function act(·), is computed. The parameters θ include the weights W and the biases B. Equation 1 gives the formula for an individual neuron n_i.

n_i = act(w_i * x_i + b_i)    (1)

When multiple neurons are combined in a layer, their values are calculated using the weight matrix W and the biases B. After the values are computed through the hidden layers, they are connected to the output neurons, which hold the predicted outputs for the given input data. These values are then compared against the observed/target outputs using an appropriate cost function c, such as the mean squared error E(Y − Ŷ), where Y represents the labels and Ŷ the predictions. The parameters are then optimized using the chosen optimization process, most commonly backpropagation, δ_L = c′ ⊙ act′(W * X + B) [5]. This process is known as training the neural network. The aforementioned network is also called a fully connected network (FCN) because every neuron in a layer is connected to all the neurons in the previous layer. A simple structure of such a neural network is given in Figure 1. These neural networks can outperform other ML models when there is enough data for training.

Figure 1: The structure of a deep neural network with one input layer, two hidden layers, and a single output layer. Every layer in this network except the final output layer contains n nodes. Each node's value is multiplied by weights and passed through an activation function, and the results become the values of the next layer's nodes.
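A fully connected network of this shape can be expressed in a few lines of code. The following sketch, written with PyTorch purely for illustration (the reviewed papers do not prescribe a particular framework), builds a network with two hidden layers and performs one training step; the layer sizes and the four-class output are assumptions.

    import torch
    from torch import nn

    # Two hidden layers with ReLU activations, as in Figure 1
    model = nn.Sequential(
        nn.Linear(64, 32), nn.ReLU(),   # input layer -> hidden layer 1
        nn.Linear(32, 16), nn.ReLU(),   # hidden layer 1 -> hidden layer 2
        nn.Linear(16, 4),               # hidden layer 2 -> output (e.g., 4 classes)
    )

    x = torch.randn(8, 64)              # a batch of 8 input vectors
    y = torch.randint(0, 4, (8,))       # observed class labels

    criterion = nn.CrossEntropyLoss()   # cost function c
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    y_hat = model(x)                    # forward pass through all layers
    loss = criterion(y_hat, y)          # compare predictions with targets
    loss.backward()                     # backpropagation of gradients
    optimizer.step()                    # update the weights W and biases B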

3.3 Convolutional Neural Networks

Imagery design datasets can be used to classify applications into various categories, such as gaming, health-care, and shopping applications. Detection of on-screen elements and classification of application content can be performed using CNNs. Various HCI interfaces rely on user eye movement and gesture detection, and convolutional networks can handle such dense datasets with a lot of variance.

The structure and functioning of deep convolutional neural networks (CNNs) are inspired by the human visual cortex, which recognizes objects by using feature models. In a CNN, feature extraction depends heavily on the filters that are convolved with images to produce feature maps; consequently, only relevant features are retained in the feature maps obtained after the convolutional operations. Image data is often stored in the form of multidimensional arrays with a numerical value (0 to 255) at every pixel. The convolution procedure begins by selecting parameters such as the size of the filters and the receptive field. One may think of the receptive field as a rectangular window of pixels in the image. The data in the receptive field and the kernel are convolved together to produce a single output value of a resultant matrix. The receptive field then moves by some amount known as the stride. This operation continues until the receptive field has covered the whole image for all filters. If needed, padding is added at the edges of the image so that the information present at the image boundary is also included. This whole operation constitutes one convolutional layer. After that, an optional pooling layer is added to the network to summarize the features in the image, provide tolerance to small spatial shifts of objects, and reduce the matrix size by choosing either the maximum or the mean of the values within a receptive field (in this case called a grid). Subsequently, the data is passed through multiple convolutional and pooling layers. The resultant matrix is then flattened by the network into a vector that is further connected to the aforementioned DNN. The training process is similar to that of a deep neural network, noting that in a CNN the weights are the values in the filters, which are learned through the optimization process. A simple CNN architecture is given in Figure 2. Moreover, there are various CNN architectures used for different tasks; some popular ones are LeNet [7], AlexNet [8], VGG-16 [9], ResNet [10], and GoogLeNet [11].

Figure 2: A convolutional neural network with convolved feature maps and pooled feature maps obtained after applying convolutional and pooling layers. The second-to-last layer of the given CNN is a fully connected layer, the same as in a deep neural network. The CNN in this diagram is performing inference on the picture of a cat. The last output layer contains two nodes that store the probabilities of the input image containing a cat or a dog.
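The sketch below mirrors the structure in Figure 2: two convolution/pooling stages followed by a flattened, fully connected layer with two output nodes (cat vs. dog). It is an illustrative PyTorch example; the kernel sizes, channel counts, and the 64x64 RGB input resolution are assumptions rather than values from the reviewed work.

    import torch
    from torch import nn

    cnn = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),  # filters convolved over the image
        nn.ReLU(),
        nn.MaxPool2d(2),                             # pooling summarizes local features
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),                                # flatten feature maps into a vector
        nn.Linear(32 * 16 * 16, 2),                  # fully connected layer -> 2 classes
    )

    image = torch.randn(1, 3, 64, 64)                # one 64x64 RGB image (batch of 1)
    logits = cnn(image)
    probs = torch.softmax(logits, dim=1)             # probabilities for cat / dog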

3.4 Recurrent Neural Network

Programming language scripts are made up of sequential instructions, and descriptions of UIs contain sequential text data. Such sequential data is readily handled by RNNs, which can learn to query UI designs from natural language and are useful for code completion tasks; this can assist software developers and designers while creating new applications. The Recurrent Neural Network (RNN) is a type of sequence model in DL. Sequence models deal with sequential or temporal data. For instance, reciting the English alphabet is an example of sequential data: every speaker can recite the letters in order, but performing the same task in reverse order is much harder. Human memory stores the characters in a conditional manner in which each item depends on the previous one. RNNs follow the same intuition: they learn by storing previous information in a special memory cell and using it together with the next state's input for learning and inference. Figure 3 shows the basic architecture of an RNN. In each state of the RNN, every neuron is connected with the activation unit (hidden layer) and with the activation originating from the previous state; together, these inputs form the output of the current state. In this way, the output of an RNN is not a pure function of its inputs but also depends on previous activations. When the training process of an RNN is initialized, the activation unit takes a zero vector a⟨0⟩ as the previous unit. Through this structure, RNNs are capable of preserving the information of their internal states. There are two types of weights in an RNN, W_aa and W_ax: W_aa is responsible for transferring information from the previous activation unit to the current activation unit, and W_ax transfers information from the inputs to the activation units. RNNs have a low computational cost because the weights are shared among all the states. The loss function of a recurrent network is the sum of the losses at every time step. Backpropagation in an RNN is known as Backpropagation Through Time [12]. This backpropagation can cause vanishing and exploding gradient problems. The exploding gradient problem arises when gradients become very large and destabilize the training of the whole neural network; one way to deal with this problem is the gradient clipping strategy [13]. The vanishing gradient problem occurs when the gradients become very small (close to zero) and training stalls; it can be mitigated by using various variants of the RNN. The most popular and advanced variants of RNNs are Long Short-Term Memory (LSTM) [14] and Gated Recurrent Units (GRU) [15].

Figure 3: The architecture of a recurrent neural network (RNN) also consists of n input nodes, but the difference between an RNN and a DNN is that the hidden nodes (known as hidden cells) are connected across time steps. Each output is not a pure function of the inputs and activations; it also takes information from the previous “hidden cell” [16].
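The recurrence described above can be written directly as a loop that reuses the same two weight matrices, W_ax and W_aa, at every step. The sketch below is a minimal PyTorch illustration with assumed input, hidden, and sequence sizes; the final lines show the equivalent built-in LSTM variant that is typically preferred to avoid vanishing gradients.

    import torch
    from torch import nn

    input_size, hidden_size, seq_len = 10, 20, 5

    W_ax = nn.Linear(input_size, hidden_size)               # input -> hidden
    W_aa = nn.Linear(hidden_size, hidden_size, bias=False)  # previous hidden -> hidden

    x = torch.randn(seq_len, input_size)   # a sequence of 5 input vectors
    a = torch.zeros(hidden_size)           # a<0>, the initial zero activation

    for t in range(seq_len):
        # The same weights are shared across all time steps
        a = torch.tanh(W_ax(x[t]) + W_aa(a))

    # The LSTM variant, via the built-in module (input shape: seq, batch, feature)
    lstm = nn.LSTM(input_size, hidden_size)
    outputs, (h_n, c_n) = lstm(x.unsqueeze(1))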

3.5 Autoencoders

Autoencoders belong to the family of self-supervised learning algorithms. They encode a compressed representation of the data and are therefore used for dimensionality reduction of complex data. In an autoencoder, two neural networks are joined via a bottleneck layer (Figure 4), which is similar to the hidden layer of any neural network. The bottleneck layer compresses the input data and plays an important role in preventing the network from simply memorizing the input values by mapping them directly to the output neurons; accordingly, the number of neurons in the bottleneck layer is much smaller than in any other layer of the autoencoder. The left part of an autoencoder (Figure 4) is known as the encoder, which compresses the data into a compact representation using weights, biases, and activation functions. After the data has passed through the encoder and the bottleneck layer, it is treated as input to the decoder, which decompresses it. An autoencoder's loss function is the reconstruction error, measured by the difference between the original inputs and the generated outputs (the recreated inputs). As in other neural networks, backpropagation is used for training. Autoencoders are composed of two networks, which can be of any type, e.g., a convolutional network as the encoder and a transposed-convolution network as the decoder. There are multiple variants of autoencoders for various tasks, such as Denoising Autoencoders [17], Sparse Autoencoders [18], Stacked Autoencoders [19], and Variational Autoencoders [20].

Figure 4: A simple autoencoder model with an encoder and a decoder part, each with two hidden layers. The encoder receives the input and downscales it towards the bottleneck, which contains the compressed representation of the data. This compressed form is then passed through the decoder, which upscales it to reconstruct the input.
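The following sketch puts the encoder-bottleneck-decoder structure of Figure 4 into code and computes the reconstruction error used as the training loss. It is written in PyTorch for illustration only; the 784-dimensional input and the layer widths are assumptions, not values taken from the reviewed studies.

    import torch
    from torch import nn

    encoder = nn.Sequential(
        nn.Linear(784, 128), nn.ReLU(),
        nn.Linear(128, 32), nn.ReLU(),
        nn.Linear(32, 8),                 # bottleneck: compressed representation
    )
    decoder = nn.Sequential(
        nn.Linear(8, 32), nn.ReLU(),
        nn.Linear(32, 128), nn.ReLU(),
        nn.Linear(128, 784),              # upscale back to the input size
    )

    x = torch.rand(16, 784)               # a batch of flattened inputs
    x_hat = decoder(encoder(x))           # encode, then decode (reconstruct)

    # Reconstruction error: the autoencoder's loss function
    loss = nn.functional.mse_loss(x_hat, x)
    loss.backward()                       # gradients for both encoder and decoder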

3.6 Generative Adversarial Networks

Generative Adversarial Networks (GANs), developed by Goodfellow et al. [21], belong to the family of self-supervised learning algorithms. In a GAN, two neural networks compete against each other in order to improve. As displayed in Figure 5, one is the generator and the other is the discriminator. The task of the generator is to generate data given some random noise, and the task of the discriminator is to discriminate between real and generated data. A common analogy for understanding a GAN is a competition between a forger and an investigator: the forger tries to create fake money, and the investigator has to classify the money as real or fake. As training continues, both parties keep getting better at their work; when the training finishes, the forger (generator) can produce money (data) that is indistinguishable from real money (data). The two networks compete in an adversarial manner, and this type of learning is called adversarial learning. The cost function of a GAN has two parts: the discriminator's loss reflects its ability to distinguish real data from generated data, and the generator's loss reflects the discriminator's inability to recognize generated data. The procedure for training a GAN is as follows. Let v be a random vector drawn from a Gaussian distribution. The generator takes v as input and tries to map it to the distribution of the real data R; in this process, the fake data F is generated by G(v). Both datasets are then classified by the discriminator D, such that D(R) = 1 and D(F) = 0. However, the generator's weights are optimized so that the fake data matches the real data, so the generator tries to achieve D(F) = 1. The loss functions for G and D are given in Equation 2.

max L(D; θ_D) = E_{r∼R}[log D(r)] + E_{f∼F}[log(1 − D(f))]    (2)
max L(G; θ_G) = E_{f∼F}[log D(f)]

The loss functions describe the maximization carried out by G and D: each network optimizes its own parameters, θ_G and θ_D, respectively. The discriminator's loss depends on the real data R and the fake data F, whereas the generator only needs the discriminator's predictions on the fake data, D(F), where F = G(v). Using backpropagation, these objectives are optimized for both networks. The generator has no access to the real data, yet it still manages to move towards the actual data distribution; this means the generator is learning from the feedback of the discriminator. Stability is the biggest challenge in training GANs. If one of the networks outsmarts the other, training fails; for example, if the discriminator becomes too good at discriminating data points, it will always classify correctly and there is no useful signal left for optimizing the generator. Similarly, if the generator becomes too strong, it may never achieve diversity in the generated data. The ideal point of GAN training is to reach an equilibrium known as the Nash equilibrium [22].

Figure 5: The simple architecture of a generative adversarial network (GAN), in which two neural networks compete with each other in an adversarial minimax game. The generator tries to generate data from a random vector space, and the discriminator tries to detect whether the data comes from the real dataset or from the generator. Through this competition, the generator learns the distribution of the real data and can generate realistic data.
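One adversarial training step corresponding to Equation 2 is sketched below: the discriminator D is pushed towards D(R) = 1 and D(F) = 0, and the generator G is then updated so that D labels its samples as real. This is an illustrative PyTorch example; the network shapes, data and noise dimensions, and the use of random noise in place of real samples are placeholder assumptions.

    import torch
    from torch import nn

    data_dim, noise_dim, batch = 64, 16, 32

    G = nn.Sequential(nn.Linear(noise_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
    D = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    bce = nn.BCEWithLogitsLoss()
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

    real = torch.randn(batch, data_dim)   # placeholder for samples from R
    v = torch.randn(batch, noise_dim)     # random vector v from a Gaussian
    fake = G(v)                           # generated (fake) data F = G(v)

    # Discriminator step: D(real) -> 1, D(fake) -> 0
    loss_d = bce(D(real), torch.ones(batch, 1)) + bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: push D(G(v)) towards 1, i.e. fool the discriminator
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()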

4 Datasets for UI design Automation

In this section, we discuss some notable datasets for developing data-driven models for UI automation tasks. In many cases, the datasets used by researchers are not publicly available; we have therefore considered open-source datasets that are already being used for various applications. Figure 6 shows the contributions of UI design datasets on a timeline.

ICDAR 2015

This dataset was collected by Pattern Recognition and Image Analysis (PRImA) [23] and used in a competition for the segmentation and recognition of textual information in documents. The document images contain well-structured layouts of elements such as pictures and text. The ICDAR 2015 dataset contains 478 labeled documents [24]. It has been used in research for understanding the design of single-page document layouts using DL models [25].

ERICA

This dataset was introduced in 2016 and contains screenshots of mobile application UIs along with traces of user interactions with these applications. ERICA has almost 18.6k unique UIs and 50k user interactions collected from 2.4k apps on the Google Play Store. ERICA was collected using a dynamic approach in which users interact with applications through a web client while the system captures and stores both the snapshots and the user interactions [26].

RICO

Rico is currently the largest repository of mobile application UI designs, roughly four times the size of ERICA. Rico was presented in 2017 with almost 72.2k snapshots, 10.8k traces of user interactions, and approximately 3 million labeled GUI components from 9.7k applications on the Google Play Store. Rico was collected in a dynamic manner: 13 workers recruited on Upwork and an automated agent performed mining over the downloaded applications. Rico is currently the most extensive dataset of mobile application UIs [27].

REDRAW Dataset

This dataset was produced during a study on automating the creation of UIs for mobile applications. REDRAW contains 14.3k different screens and 191.3k labeled GUI components. The dataset was collected from the top 250 apps from all Google Play Store categories except games. The GUI elements are labeled in 15 different categories for classification purposes [28].

CTXFonts Dataset

This dataset was released in 2018 and contains screenshots of web designs with labeled text elements. The labels include various font properties such as the font face, font size, and font color. The CTXFonts dataset captures almost 1,065 web designs, 4,893 text elements, and 492 font faces, which can be used for various data-driven applications [29].

Zheng et al. Datasets

This dataset was open-sourced in 2019 for UI design research and consists of 3,919 magazine page designs. The magazine pages are well aligned with respect to their contents. The dataset provides semantic annotations for six element types, i.e., Text, Image, Headline, Text-over-image, Headline-over-image, and Background [30].

Figure 6: Timeline of user-interface-related datasets from 2015 to 2019. The figure includes open-source datasets that are useful for deep learning applications. The datasets mostly comprise imagery data with some additional features such as labels or text descriptions.
Table 3: Use cases related to the datasets described in Section 4. Rico is the most popular dataset and has been used in multiple UI design studies. Most of the listed datasets are used for empirical validation or testing of systems. ICDAR 2015 is a benchmark dataset for text detection.

Dataset: RICO [27]
  Behavioral Studies — Users' response to the aesthetics of mobile applications [31]; users' concerns about login features in applications [32].
  Empirical Validations — Object detection on user interfaces [33]; UI design querying using autoencoders [34].
  Application Testing — Black-box app testing with deep learning [35].
Dataset: Magazine Dataset [30]
  Style-Specific Layout Generation — Single-page design generation [30]; synthesis of advertising images [36]; validation of constraint-based layout generation [37].
Dataset: CTX Fonts Dataset [29]
  Font Analysis — Comparison of two major font styles, Serif and Sans [38].
  Font Predictions — Prediction of font properties using deep learning [29].
Dataset: ReDRAW Dataset [28]
  Code Generation — From UI designs to code generation [28].
  UI Content Analysis — Detecting and summarizing GUI changes [39]; inspired an automated widget recognition system [40].
  Empirical Validations — Used for the empirical evaluation of a UI object detection system [33].
Dataset: ICDAR 2015 [25]
  Text Detection — A benchmark dataset for text detection; 90% accuracy achieved by CharNet [41].

5 Deep Learning in UI Designs

In this section, we explore the research already carried out on user interface design using DL. Table 4 classifies previous UI automation studies by deep learning architecture; below, we discuss them sequentially. For every paper, we cover the DL algorithm used in the research and the dataset used by the researchers.

Before the development of deep learning, various empirical studies were carried out on the automation of different systems. Biswas et al. [42] and Porta et al. [43] reviewed methods and techniques for designing interfaces, while Jalaliniya et al. [44] built a touch-less interface for interacting with medical images. Automating design and facilitating easy access to systems other than GUI systems is a promising area of research; fields in this area include mobile-based augmented reality and virtual reality [45, 46, 47], text-based insertion [48], brain-computer interfaces [49], and keystroke systems [50]. Aside from this, Liu et al. [51] used machine learning algorithms to study the usage patterns of App Store profiles, and [52, 53, 54, 55, 56] used them for intelligent system designs and for understanding users' behavior while interacting with UIs. Gollan et al. [57] found that a user's attention can be maintained by estimating the person's cognitive load from their pupil dilation. Kolthof et al. [58] generated GUI prototypes from text requirements using NLP techniques. Moreover, testing of Android applications was automated by a framework developed by Wu et al. [59].

In 2018, Moran et al. [28] developed a tool named REDRAW for the Android platform, implementing various ML techniques. Their model takes mock-up artifacts of designs as input to the REDRAW pipeline. First, computer vision techniques are used to detect the GUI components in screenshots of multiple Android apps; then, CNNs are used to classify the identified components; finally, the k-Nearest Neighbors (KNN) algorithm assembles the elements precisely. The dataset in this work was built from the top 250 Android applications and contains about 14,382 different screens and 191,300 labeled interface components. Nguyen et al. [60], in 2018, developed a system for UI design automation. First, an RNN is trained to query UIs based on text descriptions; the retrieved UIs are then used to generate UI designs with GANs. The authors created a dataset of UIs from multiple applications and manually added descriptions for these UIs. They proposed two different models: (i) NaturalUI, which can query designs using a textual query, and (ii) GenUI, which generates images of UIs. In 2018, Liu et al. [61] used various DL techniques to create a model that could assist UI designers. A dataset of labeled UI components was created, and a CNN was then trained on it to differentiate icon classes. In the same study, an autoencoder was trained to learn vector representations of the Rico dataset's semantic layouts, which were later used with KNN to query UIs; moreover, the decoder part of the trained autoencoder can generate UIs based on what it has learned. This work covers only the user interface designs of Android applications. Zhao et al. [29], in 2018, developed a multitask neural network to predict the font properties of text elements in web interfaces; these properties include the font face, font color, and font size. They worked on a dataset of 1k web designs with text elements labeled with their font properties. CNNs and autoencoders were used along with an adversarial training process to create a novel predictive model.
Wu et al. [62] proposed a study on perceiving mobile application ratings using machine learning-based models. 318 mobile apps from the Rico dataset were selected for this study. Freelancers were hired to rate these applications based on colors, textures, and layouts, and a predictive model was then built on the labeled dataset created by the freelancers. Five machine learning models were compared for this design-rating prediction task: (i) multiple linear regression, (ii) lasso linear regression, (iii) multi-layer perceptron, (iv) decision tree, and (v) random forest, used to predict the "personality" of apps. Their analysis is based on 15 different variables and five dimensions that can affect an application's rating. In May 2019, Huang et al. [63] used several ML methods to develop a system that can retrieve the UIs of mobile applications using sketches as input. These ML techniques include Bag of Words (BOW) and Histogram of Oriented Gradients (HOG) filters. Their contribution is the detection of similarities between an original UI and a sketch image. The resulting product is a UI querying system built from a combination of a DNN and KNN that scans for similar UIs, ranks them, and displays them in order; the querying framework cannot handle complex UI systems. The Rico dataset was used to select the interfaces and to create sketches of these interfaces. In 2019, Zheng et al. [30] used GANs to generate magazine layouts. The generator works together with three different encoders for the accurate generation and alignment of elements in the interfaces. They picked three inputs: pictures, keywords, and the type (category) of the document; each input has its own encoder, which operates with the GAN's generator to trick the discriminator. The model learned the structure and elements of the layouts. This research is not specific to mobile applications; the dataset of magazine documents was created using masks produced by semantic segmentation of the elements. Patil et al. [24] generated UI designs with a combination of Variational Autoencoders (VAEs) and RNNs. The dataset used in this study is the single-page document dataset ICDAR, described in Section 4. A new metric was introduced for measuring the diversity and uniqueness of UI layouts. The RNN is used to break down the relative elements in layouts into various chunks; these elements are then treated as sequential data for training a variational autoencoder composed of a Spatial-Relation Encoder and Decoder (SRE/SRD). However, this system was unable to handle complex structures of UI designs. A Gaussian distribution was used as the weight initializer to mitigate the problem of vanishing and exploding gradients. Li et al. [64] utilized GANs to generate the layouts of mobile applications by learning how graphic elements are organized within the layouts. Using a DCGAN, the models learn from semantic data masks; the authors created two distinct discriminators based on two separate techniques: (i) relation-based and (ii) wireframe-based. In the relation-based discriminator, the model tries to understand the relationship between the elements using different classifiers; in the latter, wireframes are rendered by converting the dataset images to grayscale, and a CNN is then trained on these wireframes to discriminate real from fake images.
Around 25k templates were built from existing documents, and validity was checked on Rico semantic layout wireframes; notably, this work is not limited to smartphone devices or webpages. Lee et al. [65] used Conditional GANs (CGANs) to generate color palettes for the interfaces of Android applications, using design semantics as the conditions for the CGANs. The analysis in this study yields promising results, but the research covers just a single component of the UI: color. The authors gathered the GUI design dataset from LG Electronics to train their network. Schlattner et al. [66, 67] developed a method for pixel-accurate implementation of a UI design from its screenshot. They cover various UI element attributes, e.g., border, frame, color, and text properties. Imitation learning achieved up to 94.8% accuracy in inferring the values of attributes in a given image, and a new cost function was proposed for estimating the attribute values. The methodology is as follows: first, the picture is fed to the model and passes through several CNNs, with each CNN detecting the quantity of an attribute in the image; these values are used to regenerate/render the interface after the CNNs' predictions, and the generated layouts are then compared to the original ones for training. A dataset was collected from Google Play Store Android applications, and a synthetic dataset was also produced during training. This work was a positive step towards design automation.
Lee et al. [68] developed a tool called GUI-Comp, which works as an extension of the KakaoOven tool to assist graphic designers with tips and feedback in real time. The program was tested with inexperienced designers. GUI-Comp includes three panels to support designers. Recommendation panel: trained on the Rico dataset, it utilizes stacked VAEs along with KNN to recommend the elements that should be used in the interface architecture. Attention panel: uses a pre-trained fully connected neural network (FCN) to predict the user's attention on the created design; the FCN was trained on the Graphic Design Importance (GDI) dataset to provide a heatmap reflecting the user's attention. Evaluation panel: contains multiple metrics that compute an overall score for the design based on the elements' properties and adjustments.


Table 4: Studies classified by deep learning method

Supervised Learning

Deep Neural Network (DNN):
  Huang et al. [63] — Sketch-based querying of UIs
  Lee et al. [68] — Prediction of the user's attention on a UI
  Villegas et al. [69] — Identification of users' cognitive goals
  Kapoor et al. [70] — Prediction of user frustration during interaction
  Vizer et al. [71] — Stress detection using keystrokes
  Von et al. [72] — Controlled interaction for map-based applications
  Lian et al. [73] — Customized font creation
  Li et al. [74] — Review on intelligent manufacturing
  Liu et al. [75] — Study of AI-based software generation
  Ma et al. [76] — API extraction using transfer learning
  Wu et al. [62] — Prediction of app score
  Wenyin et al. [77] — Shape recognition from sketches
  Zou et al. [78] — Framework for overlapped handwriting in smartphone UIs
  Ahmed et al. [79] — Error-based code generation
  Lavania et al. [80] — Biological laboratory assistance using DNNs
  Wen et al. [81] — Smart touchpads for smartphones
  Jungwirth et al. [82] — Assistance of industry workers by eye tracking
  Kim et al. [83] — Smart wearable device for visually impaired people

Convolutional Neural Network (CNN):
  Moran et al. [28] — Assembly of UI elements after detection
  Dou et al. [84] — Quantification of website aesthetics
  Halter et al. [85] — Annotation tool for films
  Bao et al. [86] — Programming code extraction
  Han et al. [87] — 3D sketching system
  Bell et al. [88] — Product visual similarity
  Nishida et al. [89] — Sketching of urban models
  Shao et al. [90] — Semantic modeling of indoor scenes
  Schlattner et al. [66] — Prediction of element property values
  Bylinskii et al. [91] — Prediction of users' focus areas
  Liu et al. [61] — Semantic annotations for the Rico dataset
  Yeo et al. [92] — Pose recognition using a wearable device
  Kong et al. [93] — Smart-glass UI for the selection of home appliances
  Mairittha et al. [94] — Detection and prediction of mobile UI personalization
  Stiehl et al. [95] — UI for sign writing (hand gesture) detection
  Tensmeyer et al. [96] — Font recognition and classification

Recurrent Neural Network (RNN):
  Wang et al. [97] — Navigation using natural language
  Fowkes et al. [98] — Automatic folding of IDE code
  Nguyen et al. [60] — Querying UIs with textual input
  Mohian et al. [99] — Code generation from UI sketches
  Zhang et al. [100] — Automated smart text correction system
  Huang et al. [101] — Sketch generation from natural language descriptions
  Sun et al. [102] — Controlling mobile functions from lip movement
  Zhou et al. [103] — Code completion based on previous code
  Bhatt et al. [104] — Invoice document structure recognition

Self-Supervised Learning

Autoencoders:
  Liu et al. [61] — Understanding the representation of UIs in the Rico dataset
  Patil et al. [24] — Variational autoencoders to generate single-page UI documents
  Tufano et al. [105] — Bug fixing using neural machine translation
  Lekchas et al. [106] — Unsupervised technique for visual pattern search
  Lee et al. [68] — Stacked VAEs to recommend elements while creating a UI design
  Zhao et al. [29] — Font prediction on web designs with adversarial training
  Chen et al. [67] — Automated GUI reconstruction from a pre-designed UI image
  Ge et al. [107] — Querying Android UIs from sketches

Generative Adversarial Network (GAN):
  Nguyen et al. [60] — GenUI to generate UI designs from a custom dataset
  Zheng et al. [30] — Generation of magazines from given images, text, and category
  Li et al. [64] — Generation of UI design layouts using various discriminators
  Zhang et al. [108] — Software for colorization of arts
  Sun et al. [109] — Automated drawing system trained on cartoon images
  Zhang et al. [110] — Detection of people's attention using adversarial networks
  Lee et al. [111] — Conditional GANs for mobile content organization
  Lee et al. [65] — Conditional GANs to generate color palettes for apps

6 Discussion and Research Frontiers

In this paper, we reviewed important contributions in the area of DL towards UI design. We discussed DL architectures across emerging methods ranging from the DNNs to GANs. Since each previous contribution involves different datasets, preprocessing methods, performance metrics, and DL models, it is challenging to generalize and draw conclusions about the performance of any particular approach. Therefore, the comparison presented in this paper encompasses different methods and their reported performance on their given dataset.
CNN-based element-wise classification has reached almost 95% accuracy [28, 66]. Similar to the CNN-based models, querying approaches that use RNNs show comparable performance, with the additional benefit that they can handle textual data as well as design sketches. Beyond these discriminative studies, the contributions that use generative modeling [24, 29, 30, 60, 61, 64, 65, 68] for UI generation tasks are also showing promising results.

However, there are still many gaps in the design automation area of human-computer interaction, and DL frameworks can provide solutions for various open research problems. These problems, their possible solutions, and essential future directions in this area are discussed below.

6.1 Repositories of Cross-Platform Designs

Large and frequently revised benchmark datasets are critical to advance the design automation process and to gain real trust in DL methods. A high-quality imagery dataset is available only for Android mobile applications [27]; the area of cross-platform interface designs remains largely uncovered. Research on iOS and desktop application designs is very limited, and website designs have also been ignored to an extent. This gap could be bridged by the iOS community building a dataset, or by making existing interface design datasets available, so that researchers can explore this area on the iOS platform. There is also a need for a cross-platform UI/UX dataset. Cross-platform designs are necessary because the Android OS has only a 36% market share; other operating systems are also important and cannot be ignored. The statistics in [112] show that Windows and iOS hold 36% and 14% market shares, respectively. Data-driven applications are only possible with such datasets, and large repositories are very much needed to support DL models.

6.2 Advanced UI Generation

Current research on UI design generation is very limited. With the availability of generative models, the data generation process is being automated across many industries. A team of scientists and researchers, inspired by MIT, works specifically on the use of generative modeling in different fields of science and the arts [113]; they have been working on the generation of perfumes, viruses, music, and fashion. These advanced uses of generative modeling could also be applied to the generation of advanced UI designs. Dynamic elements can also be treated as variables in the design process, because recent work on GANs can generate high-definition images with control over features or elements. Apart from high-quality UI content generation, the requirement specification process could also be transformed by recent contributions in text-to-image translation such as [114]. The research should not be limited to image generation methods; it can be extended to generate code for various design languages, and this code could then be modified through the NLP architectures described in Section 3.4 using human-friendly textual input.

6.3 Centralized System

There is also a need to develop a unified standard in the form of an open-source, cross-platform DL UI design automation framework, which would have various DL strategies implemented in it while covering multiple areas of UI systems. This envisaged framework, if developed, would offer both element-wise and layout-wise implementations of DL models and would allow users to customize designs as needed. The system may take the form of a web application or a desktop application; however, it should be able to work with pre-built UI/UX tools such as Adobe XD [115], InVision [116], and Sketch [117]. The idea of a centralized UI system could also be extended to user experience (UX) research using various ML and DL methods.

A lack of awareness among novice designers is one reason for the limited use of DL-based UI design automation methods. This problem could be reduced if popular software development tools such as Android Studio [118], XCode [119], and Microsoft Visual Studio [120] allowed the integration of DL-based design models.

7 Conclusions and Future Works

In this paper, we presented a comprehensive review of existing work that applies deep learning methods to UI design. Our study covers deep neural networks, convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks. We reviewed the different datasets that have been used with these deep learning techniques, and discussed some successful use cases related to UI datasets. We also discussed other interface designs that relate to software designs, such as one-page magazine layouts, which are similar to mobile app UI designs. In Section 6, we outlined some future application areas of UI design automation using deep learning algorithms. Important future extensions of the existing work include: (i) automation of software testing and user experience, (ii) collection of cross-platform UI design datasets, (iii) application of semi-supervised generative algorithms for the synthesis of UI components, and (iv) improvement in the explainability and interpretability of UI content generation approaches. The use of deep learning for UI design automation tasks could be one of the high-potential fields for advancement in software development, but it is still largely unexplored. Based on current and foreseeable future applications, we believe that deep learning models will transform the software industry's landscape in general and application design in particular.

References

  • Rumelhart et al. [1985] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning Internal Representations by Error Propagation. Technical Report ICS-8506, CALIFORNIA UNIV SAN DIEGO LA JOLLA INST FOR COGNITIVE SCIENCE, September 1985. URL https://apps.dtic.mil/docs/citations/ADA164453.
  • Goodfellow et al. [2016] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. Adaptive computation and machine learning. The MIT Press, Cambridge, Massachusetts, 2016. ISBN 978-0-262-03561-3.
  • Nwankpa et al. [2018] Chigozie Nwankpa, Winifred Ijomah, Anthony Gachagan, and Stephen Marshall. Activation Functions: Comparison of trends in Practice and Research for Deep Learning. ArXiv, 2018.
  • Ruder [2017] Sebastian Ruder. An overview of gradient descent optimization algorithms. arXiv:1609.04747 [cs], June 2017. URL http://arxiv.org/abs/1609.04747.
  • Rumelhart et al. [1986] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, October 1986. ISSN 1476-4687. doi: 10.1038/323533a0. URL https://www.nature.com/articles/323533a0.
  • Botchkarev [2019] Alexei Botchkarev. Performance Metrics (Error Measures) in Machine Learning Regression, Forecasting and Prognostics: Properties and Typology. Interdisciplinary Journal of Information, Knowledge, and Management, 14:045–076, 2019. ISSN 1555-1229, 1555-1237. doi: 10.28945/4184. URL http://arxiv.org/abs/1809.03006.
  • Lecun et al. [1998] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, November 1998. ISSN 1558-2256. doi: 10.1109/5.726791.
  • Krizhevsky et al. [2012] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet Classification with Deep Convolutional Neural Networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012. URL http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
  • Liu and Deng [2015] Shuying Liu and Weihong Deng. Very deep convolutional neural network based image classification using small training sample size. In 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), pages 730–734, November 2015. doi: 10.1109/ACPR.2015.7486599.
  • He et al. [2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, June 2016. doi: 10.1109/CVPR.2016.90.
  • Szegedy et al. [2015] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–9, June 2015. doi: 10.1109/CVPR.2015.7298594.
  • Lillicrap and Santoro [2019] Timothy P Lillicrap and Adam Santoro. Backpropagation through time and the brain. Current Opinion in Neurobiology, 55:82–89, April 2019. ISSN 0959-4388. doi: 10.1016/j.conb.2019.01.011. URL http://www.sciencedirect.com/science/article/pii/S0959438818302009.
  • Pascanu et al. [2013] Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on International Conference on Machine Learning - Volume 28, ICML’13, pages III–1310–III–1318. JMLR.org, June 2013. event-place: Atlanta, GA, USA.
  • Hochreiter and Schmidhuber [1997] Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, November 1997. ISSN 0899-7667. doi: 10.1162/neco.1997.9.8.1735. URL https://doi.org/10.1162/neco.1997.9.8.1735.
  • Cho et al. [2014] Kyunghyun Cho, B. van Merrienboer, Caglar Gulcehre, F. Bougares, H. Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), 2014. URL https://nyuscholars.nyu.edu/en/publications/learning-phrase-representations-using-rnn-encoder-decoder-for-sta.
  • Sherstinsky [2020] Alex Sherstinsky. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Physica D: Nonlinear Phenomena, 404:132306, March 2020. ISSN 0167-2789. doi: 10.1016/j.physd.2019.132306. URL http://www.sciencedirect.com/science/article/pii/S0167278919305974.
  • Vincent et al. [2008] Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine learning, ICML ’08, pages 1096–1103. Association for Computing Machinery, July 2008. ISBN 978-1-60558-205-4. doi: 10.1145/1390156.1390294. URL https://doi.org/10.1145/1390156.1390294. event-place: Helsinki, Finland.
  • Jiang et al. [2013] Xiaojuan Jiang, Yinghua Zhang, Wensheng Zhang, and Xian Xiao. A novel sparse auto-encoder for deep unsupervised learning. In 2013 Sixth International Conference on Advanced Computational Intelligence (ICACI), pages 256–261, October 2013. doi: 10.1109/ICACI.2013.6748512.
  • Vincent et al. [2010] Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. The Journal of Machine Learning Research, 11:3371–3408, December 2010. ISSN 1532-4435.
  • Kingma and Welling [2014] Diederik P. Kingma and Max Welling. Auto-Encoding Variational Bayes. arXiv:1312.6114 [cs, stat], May 2014. URL http://arxiv.org/abs/1312.6114.
  • Goodfellow et al. [2014] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative Adversarial Nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2672–2680. Curran Associates, Inc., 2014. URL http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
  • Nash [1950] John F. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):48–49, January 1950. ISSN 0027-8424, 1091-6490. doi: 10.1073/pnas.36.1.48. URL https://www.pnas.org/content/36/1/48.
  • noa [a] PRImA RDCL2015. a. URL https://www.primaresearch.org/RDCL2015/.
  • Patil et al. [2019] Akshay Gadi Patil, Omri Ben-Eliezer, Or Perel, and Hadar Averbuch-Elor. READ: Recursive Autoencoders for Document Layout Generation. ArXiv, 2019.
  • Antonacopoulos et al. [2015] A. Antonacopoulos, C. Clausner, C. Papadopoulos, and S. Pletschacher. ICDAR2015 competition on recognition of documents with complex layouts - RDCL2015. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pages 1151–1155, August 2015. doi: 10.1109/ICDAR.2015.7333941.
  • Deka et al. [2016] Biplab Deka, Zifeng Huang, and Ranjitha Kumar. ERICA: Interaction Mining Mobile Apps. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, UIST ’16, pages 767–776. Association for Computing Machinery, October 2016. ISBN 978-1-4503-4189-9. doi: 10.1145/2984511.2984581. URL https://doi.org/10.1145/2984511.2984581. event-place: Tokyo, Japan.
  • Deka et al. [2017] Biplab Deka, Zifeng Huang, Chad Franzen, Joshua Hibschman, Daniel Afergan, Yang Li, Jeffrey Nichols, and Ranjitha Kumar. Rico: A Mobile App Dataset for Building Data-Driven Design Applications. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, UIST ’17, pages 845–854. Association for Computing Machinery, October 2017. ISBN 978-1-4503-4981-9. doi: 10.1145/3126594.3126651. URL https://doi.org/10.1145/3126594.3126651. event-place: Québec City, QC, Canada.
  • Moran et al. [2018a] Kevin Moran, Carlos Bernal-Cárdenas, Michael Curcio, Richard Bonett, and Denys Poshyvanyk. Machine Learning-Based Prototyping of Graphical User Interfaces for Mobile Apps. arXiv:1802.02312 [cs], June 2018a. URL http://arxiv.org/abs/1802.02312.
  • noa [b] Modeling Fonts in Context: Font Prediction on Web Designs. b. URL http://nxzhao.com/projects/font_in_context/.
  • Zheng et al. [2019a] Xinru Zheng, Xiaotian Qiao, Ying Cao, and Rynson W. H. Lau. Content-aware generative modeling of graphic design layouts. ACM Transactions on Graphics, 38(4):133:1–133:15, July 2019a. ISSN 0730-0301. doi: 10.1145/3306346.3322971. URL https://doi.org/10.1145/3306346.3322971.
  • von Wangenheim et al. [2018] Christiane G. von Wangenheim, João V. Araujo Porto, Jean C. R. Hauck, and Adriano F. Borgatto. Do we agree on user interface aesthetics of Android apps? arXiv:1812.09049 [cs], December 2018. URL http://arxiv.org/abs/1812.09049.
  • Micallef et al. [2018] Nicholas Micallef, Erwin Adi, and Gaurav Misra. Investigating Login Features in Smartphone Apps. In Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers, UbiComp ’18, pages 842–851, New York, NY, USA, October 2018. Association for Computing Machinery. ISBN 978-1-4503-5966-5. doi: 10.1145/3267305.3274172. URL https://doi.org/10.1145/3267305.3274172.
  • Chen et al. [2020a] Jieshan Chen, Mulong Xie, Zhenchang Xing, Chunyang Chen, Xiwei Xu, and Liming Zhu. Object Detection for Graphical User Interface: Old Fashioned or Deep Learning or a Combination? arXiv:2008.05132 [cs], August 2020a. doi: 10.1145/3368089.3409691. URL http://arxiv.org/abs/2008.05132.
  • Chen et al. [2020b] Jieshan Chen, Chunyang Chen, Zhenchang Xing, Xin Xia, Liming Zhu, John Grundy, and Jinshui Wang. Wireframe-based UI Design Search through Image Autoencoder. ACM Transactions on Software Engineering and Methodology, 29(3):19:1–19:31, June 2020b. ISSN 1049-331X. doi: 10.1145/3391613. URL https://doi.org/10.1145/3391613.
  • Li et al. [2019] Yuanchun Li, Ziyue Yang, Yao Guo, and Xiangqun Chen. Humanoid: A Deep Learning-Based Approach to Automated Black-box Android App Testing. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1070–1073, November 2019. doi: 10.1109/ASE.2019.00104.
  • You et al. [2020] Wei-tao You, Hao Jiang, Zhi-yuan Yang, Chang-yuan Yang, and Ling-yun Sun. Automatic synthesis of advertising images according to a specified style. Frontiers of Information Technology & Electronic Engineering, August 2020. ISSN 2095-9230. doi: 10.1631/FITEE.1900367. URL https://doi.org/10.1631/FITEE.1900367.
  • Lee et al. [2020a] Hsin-Ying Lee, Lu Jiang, Irfan Essa, Phuong B. Le, Haifeng Gong, Ming-Hsuan Yang, and Weilong Yang. Neural Design Network: Graphic Layout Generation with Constraints. arXiv:1912.09421 [cs], July 2020a. URL http://arxiv.org/abs/1912.09421.
  • Shinahara et al. [2019] Yuto Shinahara, Takuro Karamatsu, Daisuke Harada, Kota Yamaguchi, and Seiichi Uchida. Serif or Sans: Visual Font Analytics on Book Covers and Online Advertisements. In 2019 International Conference on Document Analysis and Recognition (ICDAR), pages 1041–1046, September 2019. doi: 10.1109/ICDAR.2019.00170.
  • Moran et al. [2018b] Kevin Moran, Cody Watson, John Hoskins, George Purnell, and Denys Poshyvanyk. Detecting and summarizing GUI changes in evolving mobile apps. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, pages 543–553, New York, NY, USA, September 2018b. Association for Computing Machinery. ISBN 978-1-4503-5937-5. doi: 10.1145/3238147.3238203. URL https://doi.org/10.1145/3238147.3238203.
  • Yu et al. [2019] Shengcheng Yu, Chunrong Fang, Yang Feng, Wenyuan Zhao, and Zhenyu Chen. LIRAT: Layout and Image Recognition Driving Automated Mobile Testing of Cross-Platform. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1066–1069, November 2019. doi: 10.1109/ASE.2019.00103.
  • Xing et al. [2019] Linjie Xing, Zhi Tian, Weilin Huang, and Matthew Scott. Convolutional Character Networks. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 9125–9135, October 2019. doi: 10.1109/ICCV.2019.00922.
  • Biswas et al. [2012] Pradipta Biswas, Peter Robinson, and Patrick Langdon. Designing Inclusive Interfaces Through User Modeling and Simulation. International Journal of Human–Computer Interaction, 28(1):1–33, January 2012. ISSN 1044-7318. doi: 10.1080/10447318.2011.565718. URL https://doi.org/10.1080/10447318.2011.565718.
  • Porta [2002] Marco Porta. Vision-based user interfaces: methods and applications. International Journal of Human-Computer Studies, 57(1):27–73, July 2002. ISSN 1071-5819. doi: 10.1006/ijhc.2002.1012. URL http://www.sciencedirect.com/science/article/pii/S1071581902910128.
  • Jalaliniya et al. [2013] Shahram Jalaliniya, Jeremiah Smith, Miguel Sousa, Lars Büthe, and Thomas Pederson. Touch-less interaction with medical images using hand & foot gestures. In Proceedings of the 2013 ACM conference on Pervasive and ubiquitous computing adjunct publication, UbiComp ’13 Adjunct, pages 1265–1274, New York, NY, USA, September 2013. Association for Computing Machinery. ISBN 978-1-4503-2215-7. doi: 10.1145/2494091.2497332. URL https://doi.org/10.1145/2494091.2497332.
  • Ye et al. [2020] Hui Ye, Kin Chung Kwan, Wanchao Su, and Hongbo Fu. ARAnimator: in-situ character animation in mobile AR with user-defined motion gestures. ACM Transactions on Graphics, 39(4):83:1–83:12, July 2020. ISSN 0730-0301. doi: 10.1145/3386569.3392404. URL https://doi.org/10.1145/3386569.3392404.
  • Caggianese et al. [2020] Giuseppe Caggianese, Nicola Capece, Ugo Erra, Luigi Gallo, and Michele Rinaldi. Freehand-Steering Locomotion Techniques for Immersive Virtual Environments: A Comparative Evaluation. International Journal of Human–Computer Interaction, 36(18):1734–1755, November 2020. ISSN 1044-7318. doi: 10.1080/10447318.2020.1785151. URL https://doi.org/10.1080/10447318.2020.1785151.
  • Ren et al. [2010] Jinchang Ren, Theodore Vlachos, and Vasileios Argyriou. Immersive and perceptual human-computer interaction using computer vision techniques. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, pages 66–72, June 2010. doi: 10.1109/CVPRW.2010.5543161. ISSN: 2160-7516.
  • Jin et al. [2017] Zeyu Jin, Gautham J. Mysore, Stephen Diverdi, Jingwan Lu, and Adam Finkelstein. VoCo: text-based insertion and replacement in audio narration. ACM Transactions on Graphics, 36(4):96:1–96:13, July 2017. ISSN 0730-0301. doi: 10.1145/3072959.3073702. URL https://doi.org/10.1145/3072959.3073702.
  • Zhong et al. [2020] Saisai Zhong, Yadong Liu, Yang Yu, Jingsheng Tang, Zongtan Zhou, and Dewen Hu. A Dynamic User Interface Based BCI Environmental Control System. International Journal of Human–Computer Interaction, 36(1):55–66, January 2020. ISSN 1044-7318. doi: 10.1080/10447318.2019.1604473. URL https://doi.org/10.1080/10447318.2019.1604473.
  • Yang et al. [2020] Dongseok Yang, Kanghee Lee, and Younggeun Choi. TapSix: A Palm-Worn Glove with a Low-Cost Camera Sensor that Turns a Tactile Surface into a Six-Key Chorded Keyboard by Detection Finger Taps. International Journal of Human–Computer Interaction, 36(1):1–14, January 2020. ISSN 1044-7318. doi: 10.1080/10447318.2019.1597573. URL https://doi.org/10.1080/10447318.2019.1597573.
  • Liu et al. [2018a] Xuanzhe Liu, Huoran Li, Xuan Lu, Tao Xie, Qiaozhu Mei, Feng Feng, and Hong Mei. Understanding Diverse Usage Patterns from Large-Scale Appstore-Service Profiles. IEEE Transactions on Software Engineering, 44(4):384–411, April 2018a. ISSN 1939-3520. doi: 10.1109/TSE.2017.2685387.
  • Zheng et al. [2019b] Wujie Zheng, Haochuan Lu, Yangfan Zhou, Jianming Liang, Haibing Zheng, and Yuetang Deng. iFeedback: Exploiting User Feedback for Real-Time Issue Detection in Large-Scale Online Service Systems. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 352–363, November 2019b. doi: 10.1109/ASE.2019.00041. ISSN: 2643-1572.
  • Gu and Kim [2015] Xiaodong Gu and Sunghun Kim. “What Parts of Your Apps are Loved by Users?” (T). In 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 760–770, November 2015. doi: 10.1109/ASE.2015.57.
  • Wang [2012] Zongjiang Wang. E-book recommender system design and implementation based on data mining. In Fourth International Conference on Machine Vision (ICMV 2011): Computer Vision and Image Analysis; Pattern Recognition and Basic Technologies, volume 8350, page 835017. International Society for Optics and Photonics, January 2012. doi: 10.1117/12.920279. URL https://www.spiedigitallibrary.org/conference-proceedings-of-spie/8350/835017/E-book-recommender-system-design-and-implementation-based-on-data/10.1117/12.920279.short.
  • Zhang and Guo [2019] Xiong Zhang and Philip J. Guo. Mallard: Turn the Web into a Contextualized Prototyping Environment for Machine Learning. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, UIST ’19, pages 605–618, New York, NY, USA, October 2019. Association for Computing Machinery. ISBN 978-1-4503-6816-2. doi: 10.1145/3332165.3347936. URL https://doi.org/10.1145/3332165.3347936.
  • Ji [2018] Meichen Ji. UIChecker: An Automatic Detection Platform for Android GUI Errors. In 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), pages 957–961, November 2018. doi: 10.1109/ICSESS.2018.8663923. ISSN: 2327-0594.
  • Gollan et al. [2016] Benedikt Gollan, Michael Haslgrübler, and Alois Ferscha. Demonstrator for extracting cognitive load from pupil dilation for attention management services. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, UbiComp ’16, pages 1566–1571, New York, NY, USA, September 2016. Association for Computing Machinery. ISBN 978-1-4503-4462-3. doi: 10.1145/2968219.2968550. URL https://doi.org/10.1145/2968219.2968550.
  • Kolthoff [2019] Kristian Kolthoff. Automatic Generation of Graphical User Interface Prototypes from Unrestricted Natural Language Requirements. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1234–1237, November 2019. doi: 10.1109/ASE.2019.00148. ISSN: 2643-1572.
  • Wu et al. [2017] Jingzheng Wu, Shen Liu, Shouling Ji, Mutian Yang, Tianyue Luo, Yanjun Wu, and Yongji Wang. Exception beyond Exception: Crashing Android System by Trapping in ”Uncaught Exception”. In 2017 IEEE/ACM 39th International Conference on Software Engineering: Software Engineering in Practice Track (ICSE-SEIP), pages 283–292, May 2017. doi: 10.1109/ICSE-SEIP.2017.12.
  • Nguyen et al. [2018] Tam Nguyen, Phong Vu, Hung Pham, and Tung Nguyen. Deep Learning UI Design Patterns of Mobile Apps. In 2018 IEEE/ACM 40th International Conference on Software Engineering: New Ideas and Emerging Technologies Results (ICSE-NIER), pages 65–68, May 2018.
  • Liu et al. [2018b] Thomas F. Liu, Mark Craft, Jason Situ, Ersin Yumer, Radomir Mech, and Ranjitha Kumar. Learning Design Semantics for Mobile Apps. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology, UIST ’18, pages 569–579, Berlin, Germany, October 2018b. Association for Computing Machinery. ISBN 978-1-4503-5948-1. doi: 10.1145/3242587.3242650. URL https://doi.org/10.1145/3242587.3242650.
  • Wu et al. [2019] Ziming Wu, Taewook Kim, Quan Li, and Xiaojuan Ma. Understanding and Modeling User-Perceived Brand Personality from Mobile Application UIs. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pages 1–12, Glasgow, Scotland, UK, May 2019. Association for Computing Machinery. ISBN 978-1-4503-5970-2. doi: 10.1145/3290605.3300443. URL https://doi.org/10.1145/3290605.3300443.
  • Huang et al. [2019] Forrest Huang, John F. Canny, and Jeffrey Nichols. Swire: Sketch-based User Interface Retrieval. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pages 1–10, Glasgow, Scotland, UK, May 2019. Association for Computing Machinery. ISBN 978-1-4503-5970-2. doi: 10.1145/3290605.3300334. URL https://doi.org/10.1145/3290605.3300334.
  • Li et al. [2018] Jianan Li, Jimei Yang, Aaron Hertzmann, Jianming Zhang, and Tingfa Xu. LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators. September 2018. URL https://openreview.net/forum?id=HJxB5sRcFQ.
  • Lee and Cho [2019] Younghoon Lee and Sungzoon Cho. Design of Semantic-Based Colorization of Graphical User Interface Through Conditional Generative Adversarial Nets. International Journal of Human–Computer Interaction, 0(0):1–10, October 2019. ISSN 1044-7318. doi: 10.1080/10447318.2019.1680921. URL https://doi.org/10.1080/10447318.2019.1680921.
  • Schlattner et al. [2019] Philippe Schlattner, Pavol Bielik, and Martin T. Vechev. Learning to Infer User Interface Attributes from Images. ArXiv, 2019.
  • Chen et al. [2018] Chunyang Chen, Ting Su, Guozhu Meng, Zhenchang Xing, and Yang Liu. From UI design image to GUI skeleton: a neural machine translator to bootstrap mobile GUI implementation. In Proceedings of the 40th International Conference on Software Engineering, ICSE ’18, pages 665–676, New York, NY, USA, 2018. Association for Computing Machinery. ISBN 9781450356381. doi: 10.1145/3180155.3180240. URL https://doi.org/10.1145/3180155.3180240.
  • Lee et al. [2020b] Chunggi Lee, Sanghoon Kim, Dongyun Han, Hongjun Yang, Youngwoo Park, Bum Chul Kwon, and Sungahn Ko. GUIComp: A GUI Design Assistant with Real-Time, Multi-Faceted Feedback. ArXiv, 2020b.
  • Villegas and Eberts [1994] Leticia Villegas and Ray E. Eberts. A neural network tool for identifying text-editing goals. International Journal of Human-Computer Studies, 40(5):813–833, May 1994. ISSN 1071-5819. doi: 10.1006/ijhc.1994.1039. URL http://www.sciencedirect.com/science/article/pii/S1071581984710391.
  • Kapoor et al. [2007] Ashish Kapoor, Winslow Burleson, and Rosalind W. Picard. Automatic prediction of frustration. International Journal of Human-Computer Studies, 65(8):724–736, August 2007. ISSN 1071-5819. doi: 10.1016/j.ijhcs.2007.02.003. URL http://www.sciencedirect.com/science/article/pii/S1071581907000377.
  • Vizer et al. [2009] Lisa M. Vizer, Lina Zhou, and Andrew Sears. Automated stress detection using keystroke and linguistic features: An exploratory study. International Journal of Human-Computer Studies, 67(10):870–886, October 2009. ISSN 1071-5819. doi: 10.1016/j.ijhcs.2009.07.005. URL http://www.sciencedirect.com/science/article/pii/S1071581909000937.
  • Van Tonder and Wesson [2012] Bradley Paul Van Tonder and Janet Louise Wesson. Improving the controllability of tilt interaction for mobile map-based applications. International Journal of Human-Computer Studies, 70(12):920–935, December 2012. ISSN 1071-5819. doi: 10.1016/j.ijhcs.2012.08.001. URL https://doi.org/10.1016/j.ijhcs.2012.08.001.
  • Lian et al. [2018] Zhouhui Lian, Bo Zhao, Xudong Chen, and Jianguo Xiao. EasyFont: A Style Learning-Based System to Easily Build Your Large-Scale Handwriting Fonts. ACM Transactions on Graphics, 38(1):6:1–6:18, December 2018. ISSN 0730-0301. doi: 10.1145/3213767. URL https://doi.org/10.1145/3213767.
  • Li et al. [2017] Bo-hu Li, Bao-cun Hou, Wen-tao Yu, Xiao-bing Lu, and Chun-wei Yang. Applications of artificial intelligence in intelligent manufacturing: a review. Frontiers of Information Technology & Electronic Engineering, 18(1):86–96, January 2017. ISSN 2095-9230. doi: 10.1631/FITEE.1601885. URL https://doi.org/10.1631/FITEE.1601885.
  • Liu et al. [2020] Hui Liu, Mingzhu Shen, Jiaqi Zhu, Nan Niu, Ge Li, and Lu Zhang. Deep Learning Based Program Generation from Requirements Text: Are We There Yet? IEEE Transactions on Software Engineering, pages 1–1, 2020. ISSN 1939-3520. doi: 10.1109/TSE.2020.3018481.
  • Ma et al. [2019] Suyu Ma, Zhenchang Xing, Chunyang Chen, Cheng Chen, Lizhen Qu, and Guoqiang Li. Easy-to-Deploy API Extraction by Multi-Level Feature Embedding and Transfer Learning. IEEE Transactions on Software Engineering, pages 1–1, 2019. ISSN 1939-3520. doi: 10.1109/TSE.2019.2946830.
  • Wenyin et al. [2001] Liu Wenyin, Wenjie Qian, Rong Xiao, and Xiangyu Jin. Smart Sketchpad - an on-line graphics recognition system. In Proceedings of Sixth International Conference on Document Analysis and Recognition, pages 1050–1054, September 2001. doi: 10.1109/ICDAR.2001.953946.
  • Zou et al. [2011] Yanming Zou, Yingfei Liu, Ying Liu, and Kongqiao Wang. Overlapped Handwriting Input on Mobile Phones. In 2011 International Conference on Document Analysis and Recognition, pages 369–373, September 2011. doi: 10.1109/ICDAR.2011.82. ISSN: 2379-2140.
  • Ahmed et al. [2019] Umair Z. Ahmed, Renuka Sindhgatta, Nisheeth Srivastava, and Amey Karkare. Targeted example generation for compilation errors. In Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, ASE ’19, pages 327–338, San Diego, California, November 2019. IEEE Press. ISBN 978-1-72812-508-4. doi: 10.1109/ASE.2019.00039. URL https://doi.org/10.1109/ASE.2019.00039.
  • Lavania et al. [2016] Chandrashekhar Lavania, Sunil Thulasidasan, Anthony LaMarca, Jeffrey Scofield, and Jeff Bilmes. A weakly supervised activity recognition framework for real-time synthetic biology laboratory assistance. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp ’16, pages 37–48, New York, NY, USA, September 2016. Association for Computing Machinery. ISBN 978-1-4503-4461-6. doi: 10.1145/2971648.2971716. URL https://doi.org/10.1145/2971648.2971716.
  • Wen et al. [2016] Elliott Wen, Winston Seah, Bryan Ng, Xuefeng Liu, and Jiannong Cao. UbiTouch: ubiquitous smartphone touchpads using built-in proximity and ambient light sensors. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp ’16, pages 286–297, New York, NY, USA, September 2016. Association for Computing Machinery. ISBN 978-1-4503-4461-6. doi: 10.1145/2971648.2971678. URL https://doi.org/10.1145/2971648.2971678.
  • Jungwirth et al. [2019] Florian Jungwirth, Michaela Murauer, Johannes Selymes, Michael Haslgrübler, Benedikt Gollan, and Alois Ferscha. mobEYEle: an embedded eye tracking platform for industrial assistance. In Adjunct Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers, UbiComp/ISWC ’19 Adjunct, pages 1113–1119, New York, NY, USA, September 2019. Association for Computing Machinery. ISBN 978-1-4503-6869-8. doi: 10.1145/3341162.3350842. URL https://doi.org/10.1145/3341162.3350842.
  • Kim et al. [2019] Taeyong Kim, Sanghong Kim, Joonhee Choi, Youngsun Lee, and Bowon Lee. Say and Find it: A Multimodal Wearable Interface for People with Visual Impairment. In The Adjunct Publication of the 32nd Annual ACM Symposium on User Interface Software and Technology, UIST ’19, pages 27–29, New York, NY, USA, October 2019. Association for Computing Machinery. ISBN 978-1-4503-6817-9. doi: 10.1145/3332167.3357104. URL https://doi.org/10.1145/3332167.3357104.
  • Dou et al. [2019] Qi Dou, Xianjun Sam Zheng, Tongfang Sun, and Pheng-Ann Heng. Webthetics: Quantifying webpage aesthetics with deep learning. International Journal of Human-Computer Studies, 124:56–66, April 2019. ISSN 1071-5819. doi: 10.1016/j.ijhcs.2018.11.006. URL http://www.sciencedirect.com/science/article/pii/S1071581918306682.
  • Halter et al. [2019] Gaudenz Halter, Rafael Ballester-Ripoll, Barbara Flueckiger, and Renato Pajarola. VIAN: A Visual Annotation Tool for Film Analysis. Computer Graphics Forum, 38(3):119–129, 2019. ISSN 1467-8659. doi: 10.1111/cgf.13676. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/cgf.13676.
  • Bao et al. [2020] Lingfeng Bao, Zhenchang Xing, Xin Xia, David Lo, Minghui Wu, and Xiaohu Yang. psc2code: Denoising Code Extraction from Programming Screencasts. ACM Transactions on Software Engineering and Methodology, 29(3):21:1–21:38, June 2020. ISSN 1049-331X. doi: 10.1145/3392093. URL https://doi.org/10.1145/3392093.
  • Han et al. [2017] Xiaoguang Han, Chang Gao, and Yizhou Yu. DeepSketch2Face: a deep learning based sketching system for 3D face and caricature modeling. ACM Transactions on Graphics, 36(4):126:1–126:12, July 2017. ISSN 0730-0301. doi: 10.1145/3072959.3073629. URL https://doi.org/10.1145/3072959.3073629.
  • Bell and Bala [2015] Sean Bell and Kavita Bala. Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics, 34(4):98:1–98:10, July 2015. ISSN 0730-0301. doi: 10.1145/2766959. URL https://doi.org/10.1145/2766959.
  • Nishida et al. [2016] Gen Nishida, Ignacio Garcia-Dorado, Daniel G. Aliaga, Bedrich Benes, and Adrien Bousseau. Interactive sketching of urban procedural models. ACM Transactions on Graphics, 35(4):130:1–130:11, July 2016. ISSN 0730-0301. doi: 10.1145/2897824.2925951. URL https://doi.org/10.1145/2897824.2925951.
  • Shao et al. [2012] Tianjia Shao, Weiwei Xu, Kun Zhou, Jingdong Wang, Dongping Li, and Baining Guo. An interactive approach to semantic modeling of indoor scenes with an RGBD camera. ACM Transactions on Graphics, 31(6):136:1–136:11, November 2012. ISSN 0730-0301. doi: 10.1145/2366145.2366155. URL https://doi.org/10.1145/2366145.2366155.
  • Bylinskii et al. [2017] Zoya Bylinskii, Nam Wook Kim, Peter O’Donovan, Sami Alsheikh, Spandan Madan, Hanspeter Pfister, Fredo Durand, Bryan Russell, and Aaron Hertzmann. Learning Visual Importance for Graphic Designs and Data Visualizations. In Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, UIST ’17, pages 57–69, New York, NY, USA, October 2017. Association for Computing Machinery. ISBN 978-1-4503-4981-9. doi: 10.1145/3126594.3126653. URL https://doi.org/10.1145/3126594.3126653.
  • Yeo et al. [2019] Hui-Shyong Yeo, Erwin Wu, Juyoung Lee, Aaron Quigley, and Hideki Koike. Opisthenar: Hand Poses and Finger Tapping Recognition by Observing Back of Hand Using Embedded Wrist Camera. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, UIST ’19, pages 963–971, New York, NY, USA, October 2019. Association for Computing Machinery. ISBN 978-1-4503-6816-2. doi: 10.1145/3332165.3347867. URL https://doi.org/10.1145/3332165.3347867.
  • Kong et al. [2016] Quan Kong, Takuya Maekawa, Taiki Miyanishi, and Takayuki Suyama. Selecting home appliances with smart glass based on contextual information. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp ’16, pages 97–108, New York, NY, USA, September 2016. Association for Computing Machinery. ISBN 978-1-4503-4461-6. doi: 10.1145/2971648.2971651. URL https://doi.org/10.1145/2971648.2971651.
  • Mairittha et al. [2020] Nattaya Mairittha, Tittaya Mairittha, and Sozo Inoue. Improving activity data collection with on-device personalization using fine-tuning. In Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers, UbiComp-ISWC ’20, pages 255–260, New York, NY, USA, September 2020. Association for Computing Machinery. ISBN 978-1-4503-8076-8. doi: 10.1145/3410530.3414370. URL https://doi.org/10.1145/3410530.3414370.
  • Stiehl et al. [2015] D. Stiehl, L. Addams, L. S. Oliveira, C. Guimarães, and A. S. Britto. Towards a SignWriting recognition system. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pages 26–30, August 2015. doi: 10.1109/ICDAR.2015.7333719.
  • Tensmeyer et al. [2017] Chris Tensmeyer, Daniel Saunders, and Tony Martinez. Convolutional Neural Networks for Font Classification. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), volume 01, pages 985–990, November 2017. doi: 10.1109/ICDAR.2017.164. ISSN: 2379-2140.
  • Wang et al. [2020] Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Wang, and Lei Zhang. Vision-Language Navigation Policy Learning and Adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, February 2020. ISSN 1939-3539. doi: 10.1109/TPAMI.2020.2972281.
  • Fowkes et al. [2017] Jaroslav Fowkes, Pankajan Chanthirasegaran, Razvan Ranca, Miltiadis Allamanis, Mirella Lapata, and Charles Sutton. Autofolding for Source Code Summarization. IEEE Transactions on Software Engineering, 43(12):1095–1109, December 2017. ISSN 1939-3520. doi: 10.1109/TSE.2017.2664836.
  • Mohian and Csallner [2020] Soumik Mohian and Christoph Csallner. Doodle2App: native app code by freehand UI sketching. In Proceedings of the IEEE/ACM 7th International Conference on Mobile Software Engineering and Systems, MOBILESoft ’20, pages 81–84, New York, NY, USA, July 2020. Association for Computing Machinery. ISBN 978-1-4503-7959-5. doi: 10.1145/3387905.3388607. URL https://doi.org/10.1145/3387905.3388607.
  • Zhang et al. [2019a] Mingrui Ray Zhang, He Wen, and Jacob O. Wobbrock. Type, Then Correct: Intelligent Text Correction Techniques for Mobile Text Entry Using Neural Networks. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, UIST ’19, pages 843–855, New York, NY, USA, October 2019a. Association for Computing Machinery. ISBN 978-1-4503-6816-2. doi: 10.1145/3332165.3347924. URL https://doi.org/10.1145/3332165.3347924.
  • Huang and Canny [2019] Forrest Huang and John F. Canny. Sketchforme: Composing Sketched Scenes from Text Descriptions for Interactive Applications. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, UIST ’19, pages 209–220, New York, NY, USA, October 2019. Association for Computing Machinery. ISBN 978-1-4503-6816-2. doi: 10.1145/3332165.3347878. URL https://doi.org/10.1145/3332165.3347878.
  • Sun et al. [2018] Ke Sun, Chun Yu, Weinan Shi, Lan Liu, and Yuanchun Shi. Lip-Interact: Improving Mobile Device Interaction with Silent Speech Commands. In Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology, UIST ’18, pages 581–593, New York, NY, USA, October 2018. Association for Computing Machinery. ISBN 978-1-4503-5948-1. doi: 10.1145/3242587.3242599. URL https://doi.org/10.1145/3242587.3242599.
  • Zhou et al. [2019] Shufan Zhou, Beijun Shen, and Hao Zhong. Lancer: Your Code Tell Me What You Need. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 1202–1205, November 2019. doi: 10.1109/ASE.2019.00137. ISSN: 2643-1572.
  • Bhatt et al. [2019] Himanshu Sharad Bhatt, Shourya Roy, Lokesh Bhatnagar, Chetan Lohani, and Vinit Jain. Digital Auditor: A Framework for Matching Duplicate Invoices. In 2019 International Conference on Document Analysis and Recognition (ICDAR), pages 434–441, September 2019. doi: 10.1109/ICDAR.2019.00076. ISSN: 2379-2140.
  • Tufano et al. [2018] Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano di Penta, Martin White, and Denys Poshyvanyk. An Empirical Investigation into Learning Bug-Fixing Patches in the Wild via Neural Machine Translation. In 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 832–837, September 2018. doi: 10.1145/3238147.3240732. ISSN: 2643-1572.
  • Lekschas et al. [2020] Fritz Lekschas, Brant Peterson, Daniel Haehn, Eric Ma, Nils Gehlenborg, and Hanspeter Pfister. Peax: Interactive Visual Pattern Search in Sequential Data Using Unsupervised Deep Representation Learning. Computer Graphics Forum, 39(3):167–179, 2020. ISSN 1467-8659. doi: 10.1111/cgf.13971. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/cgf.13971.
  • Ge [2019] Xiaofei Ge. Android GUI search using hand-drawn sketches. In Proceedings of the 41st International Conference on Software Engineering: Companion Proceedings, ICSE ’19, pages 141–143, Montreal, Quebec, Canada, May 2019. IEEE Press. doi: 10.1109/ICSE-Companion.2019.00060. URL https://doi.org/10.1109/ICSE-Companion.2019.00060.
  • Zhang et al. [2018] Lvmin Zhang, Chengze Li, Tien-Tsin Wong, Yi Ji, and Chunping Liu. Two-stage sketch colorization. ACM Transactions on Graphics, 37(6):261:1–261:14, December 2018. ISSN 0730-0301. doi: 10.1145/3272127.3275090. URL https://doi.org/10.1145/3272127.3275090.
  • Sun et al. [2019] Lingyun Sun, Pei Chen, Wei Xiang, Peng Chen, Wei-yue Gao, and Ke-jun Zhang. SmartPaint: a co-creative drawing system based on generative adversarial networks. Frontiers of Information Technology & Electronic Engineering, 20(12):1644–1656, December 2019. ISSN 2095-9230. doi: 10.1631/FITEE.1900386. URL https://doi.org/10.1631/FITEE.1900386.
  • Zhang et al. [2019b] Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao, and Jiashi Feng. Anticipating Where People will Look Using Adversarial Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8):1783–1796, August 2019b. ISSN 1939-3539. doi: 10.1109/TPAMI.2018.2871688.
  • Lee et al. [2019] Younghoon Lee, Sungzoon Cho, and Jinhae Choi. Smartphone help contents re-organization considering user specification via conditional GAN. International Journal of Human-Computer Studies, 129:108–115, September 2019. ISSN 1071-5819. doi: 10.1016/j.ijhcs.2019.04.002. URL http://www.sciencedirect.com/science/article/pii/S1071581918303677.
  • Operating System Market Share Worldwide [c]. URL https://gs.statcounter.com/os-market-share.
  • How to Generate (Almost) Anything [d]. URL http://howtogeneratealmostanything.com.
  • Qiao et al. [2019] Tingting Qiao, Jing Zhang, Duanqing Xu, and Dacheng Tao. MirrorGAN: Learning Text-To-Image Generation by Redescription. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1505–1514, 2019. URL https://openaccess.thecvf.com/content_CVPR_2019/html/Qiao_MirrorGAN_Learning_Text-To-Image_Generation_by_Redescription_CVPR_2019_paper.html.
  • UI/UX design and collaboration tool, Adobe XD [e]. URL https://www.adobe.com/in/products/xd.html.
  • InVision: Digital product design, workflow & collaboration [f]. URL https://www.invisionapp.com/.
  • Sketch: The digital design toolkit [g]. URL https://www.sketch.com/.
  • Meet Android Studio, Android Developers [h]. URL https://developer.android.com/studio/intro.
  • Xcode [i]. URL https://developer.apple.com/xcode/.
  • Visual Studio IDE, Code Editor, Azure DevOps, & App Center [j]. URL https://visualstudio.microsoft.com.