Online Bitrate Selection for Viewport Adaptive 360-Degree Video Streaming

Ming Tang and Vincent W.S. Wong Ming Tang and Vincent W.S. Wong are with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, Canada.
E-mail: {mingt, vincentw}@ece.ubc.ca

Abstract

360-degree video streaming provides users with immersive experience by letting users determine their field-of-views (FoVs) in real time. To enhance the users’ quality of experience (QoE) given their limited bandwidth, recent works have proposed a viewport adaptive 360-degree video streaming model by exploiting the bitrate adaptation in spatial and temporal domains. Under this video streaming model, in this paper, we consider a scenario with a newly generated 360-degree video without viewing history from other users. To maximize the user’s QoE, we propose an online bitrate selection algorithm, called OBS360. The proposed online algorithm can adapt to the unknown and heterogeneous users’ FoVs and downloading capacities. We prove that the proposed algorithm achieves sublinear dynamic regret under a convex decision set. This suggests that as the number of video segments increases, the performance of the online algorithm approaches the performance of the offline algorithm, where the users’ FoVs and downloading capacities are known. We perform simulations with real-world dataset to evaluate the performance of the proposed algorithm. Results show that compared with several existing methods, our proposed algorithm can enhance the users’ QoE significantly by improving the viewing bitrate and reducing the inter-segment and intra-segment degradation losses of the users.

Index Terms:

Adaptive 360-degree video streaming, virtual reality, quality of experience (QoE), online convex optimization, online gradient descent.

1 Introduction

1.1 Background and Motivation

With the development of virtual reality (VR) technologies, 360-degree video streaming is becoming increasingly popular. With such 360-degree videos, users can determine their field of views (FoVs) in real time by controlling the direction of the streaming devices, with which users can have immersive video streaming experience. Currently, there are various 360-degree video content providers (e.g., YouTube) and VR devices supporting 360-degree videos (e.g., Sony PlayStation VR[1]). An example of a 360-degree video streaming in a wireless network is shown in Fig. 1. The users watching the same 360-degree video may have heterogeneous FoVs (represented by the solid rectangles).

The 360-degree real-time interaction, however, is at the expense of additional bandwidth consumption. This is because all the 360-degree scenes, including the scenes that are being viewed or not being viewed by the users, have to be downloaded in real time in response to the users’ interactions. This additional bandwidth consumption imposes the requirement of efficient allocation of radio resources in wireless networks, so as to provide good quality of experience (QoE) video streaming services.

Refer to caption — Figure 1: 360-degree video streaming in a wireless network. Users A and B watch the same video, but they have heterogeneous FoVs.

To improve the user’s QoE, a promising approach is to use viewport adaptive 360-degree (VA360) video streaming [2, 3, 4], which exploits bitrate adaptation in both spatial and temporal domains. Specifically, in the encoding process, an entire video is divided into multiple segments, each of which corresponds to a video segment within a certain playback time period. Each segment is further spatially divided into multiple tiles. Each tile corresponds to the video of a viewing area during the segment playback time period. Each tile is encoded at multiple bitrates. During video streaming, video players can select the bitrate of each tile, adapting to human behaviors and real-time network connections. For example, in Fig. 2 (a), in the encoding process, each video is temporally divided into five segments. Each segment is spatially divided into $4\times 4=16$ tiles, each of which is encoded at two bitrate levels {Low, High}. In Fig. 2 (b), when streaming the video, the video player can select the bitrate of each tile to adapt to the variation of the user’s FoV (represented by the solid rectangles) so as to improve the user’s QoE and reduce the required downloading bandwidth. For example, the video player can select high bitrates for the tiles in the FoV and low bitrates for the others.

In this work, under the VA360 video streaming model, we focus on a video streaming scenario with a newly generated 360-degree video, where no historical viewing information from other users is available for predicting the viewing behaviors of a user. Designing a bitrate selection algorithm for this scenario is challenging. First, the user’s FoV for viewing a segment is unknown when the video player decides the bitrates of the segment, and the FoV may vary across segments. Second, the actual capacity for downloading a segment is unknown beforehand and may vary across time. These unknown and varying user’s FoV and downloading capacity require the bitrate selection algorithm to learn each user’s features in an online fashion and adapt the bitrates accordingly.

1.2 Related Work

Existing works have studied the bitrate selection algorithm design for VA360 video streaming. Using the statistics of other users’ viewing history, Xiao et al. in [5] proposed an online bitrate selection algorithm based on the available bandwidth and the probability that a tile is being viewed. Qian et al. in [6] proposed an algorithm based on the classification of the tiles of each segment using the viewing history. In [7], Zhou et al. proposed an algorithm to minimize the bandwidth utilization according to the user’s expected FoV. Jiang et al. in [8] proposed a two-layer bitrate selection algorithm based on both FoV prediction and buffer management. In [9], Xie et al. designed a bitrate selection algorithm using machine learning. In [10], Sun et al. proposed a two-tier system that selects high quality bitrates for tiles within the predicted FoV and low quality bitrates for the others. These works considered the bitrate selection according to the other users’ viewing history, which may not be suitable for those newly generated videos. In [11], Yuan et al. proposed to use a Gaussian model to predict the user’s FoV without using the other users’ viewing history. This method, however, requires the video player to define the potential FoV patterns of the user beforehand. Le Feuvre et al. in [12] proposed an algorithm that identifies the objects in each tile and decides the bitrates based on the identifying results. However, this algorithm may not be able to characterize the user heterogeneities. In [13], Shi et al. proposed a head movement-based approach, while its performance depends on the accuracy of movement prediction. Nguyen in [14] proposed an algorithm that adjusts the FoV estimation error to capture the user’s FoV preference. However, the proposed algorithm does not consider the bandwidth heterogeneity. Zhang et al. in [15] proposed a deep reinforcement learning (DRL) algorithm to learn the user’s FoV and bandwidth. Pang in [16] proposed a DRL algorithm to learn the user’s FoV and head movements. The learning in DRL algorithms requires random exploration. For a newly generated video without offline training, the video player learns during the streaming process, under which the randomly selected bitrates may lead to poor QoE.

Online convex optimization [17, 18] is a technique which can learn the users’ features in an online fashion. Typical methods include online gradient descent (OGD) algorithms [19, 20, 21] and online mirror descent (OMD) algorithms [22, 23]. One of the common characteristics of these algorithms is that the decision of an object (e.g., a segment in our work) is made based on the decision of the previous object and the realization of the parameters related to the previous object. Such algorithms cannot be directly used in the VA360 video streaming model for the following reason. The realization of the parameters related to a segment (e.g., the capacity for downloading the segment, the FoV for viewing the segment) is observed after the segment has been downloaded and viewed. When the video player needs to make the bitrate decision of a segment, however, the user may not have viewed the previous segment. In this case, the video player is unable to determine the bitrates of the segment by applying those existing algorithms [19, 20, 21, 22, 23].

1.3 Solution Approach and Contributions

In this work, we propose an online bitrate selection algorithm, called OBS360 algorithm. This algorithm aims to optimize the user’s QoE in VA360 video streaming. It can learn the user’s FoV preference and time-varying downloading capacity in real time, and can handle the uncertain FoV and downloading capacity of the user in the future.

The idea of the OBS360 algorithm is inspired by the existing OGD algorithms [19, 20, 21]. In contrast with the existing OGD algorithms, our proposed algorithm can handle the scenario in which the video player may not have observed the realization of the time-varying downloading capacity as well as the user’s FoV of the previous segment when it needs to make the bitrate decision of a segment. In addition, under the special case that the realization of the parameters related to a decision can always be observed before the next decision, the proposed algorithm is equivalent to the existing OGD algorithms.

The main contributions of this work are as follows.

•

Viewport-Adaptive 360-Degree Video Streaming: We focus on a 360-degree streaming scenario with a newly generated video, where the video does not have historical viewing information from other users. Under this scenario, we formulate a bitrate selection problem, which aims to optimize the user’s QoE.
•

Online Bitrate Selection Algorithm: We propose an OBS360 algorithm for online bitrate selection. The proposed algorithm can learn the user’s FoV preference and downloading capacity in real time. It can achieve sublinear dynamic regret under a convex decision set. This shows the intuition that as the number of segments increases, the performance of the online algorithm approaches the offline optimal performance where the user’s FoV and downloading capacity are known beforehand.
•

Performance Evaluation: Simulations with the real-world datasets from [24] and [25] show the following. The proposed OBS360 algorithm can significantly improve the user’s QoE when compared with BAS-360^∘ in [5] and Flare in [6]. Specifically, our proposed algorithm can improve the user’s viewing bitrate by $24.6\%-58.8\%$ . In addition, it can reduce the inter-segment and intra-segment degradation losses by $83.3\%-91.0\%$ and $84.1\%-89.1\%$ , respectively.

The rest of this paper is organized as follows. We present the system model in Section 2. In Section 3, we propose the online bitrate selection algorithm. In Section 4, we show the performance evaluation. We conclude in Section 5.

2 System Model

In this section, we first introduce the model setting, and then formulate the user’s QoE maximization problem.

2.1 Model Setting

We focus on the bitrate adaptation of a user’s 360-degree video streaming. The video and user models are as follows.

2.1.1 Video Model

The video is temporally partitioned into segments (i.e., small video pieces), each corresponding to a playback time of $\beta$ seconds. Let $\mathcal{I}=\{1,2,\ldots,I\}$ denote the set of segments. Each segment is further spatially divided into $K$ tiles with $M$ rows and $N$ columns, i.e., $K=MN$ . Let $\mathcal{K}=\{1,2,\ldots,K\}$ denote the set of tiles of each segment.

We introduce a reference FoV for each segment, where the idea is inspired by the recommended FoV in [25]. In practice, the reference FoV can be either the FoV showing the main object (determined by object detection methods, e.g., [12]) of the video scene or the recommended FoV marked by video producers (e.g., [25]). The reference FoV is used to characterize user-specific FoV preference, i.e., the relative position of a user’s FoV to the reference FoV. Fig. 3 shows an example of a 360-degree video streaming of car riding experience. The reference FoV can be looking ahead in the car (e.g., the four shaded tiles in the center) in Fig. 3, while a user may prefer to look toward his right-hand side (e.g., the FoV represented by the solid rectangle) in Fig. 3.

The reference FoV can be different in different segments. Hence, in order to characterize the user’s FoV preference (i.e., the relative position of the user’s FoV to the reference FoV), we define the indices of the tiles based on their relative positions to the reference FoV.¹¹1If using a fixed tile indexing (irrelevant to the reference FoV) as in most of the existing works in VA360 (e.g., [6, 5]), we cannot characterize user’s FoV preference because the reference FoV can vary across segments. This is one approach of tile indexing, while it does not affect either the video encoding or the video streaming process. The indices of the tiles are defined as follows. Suppose the top-left corner of the reference FoV is at the tile on row $m_{0}$ and column $n_{0}$ . The tile on row $m$ and column $n$ is indexed with $N\text{mod}(m-m_{0},M)+(\text{mod}(n-n_{0},N)+1)$ , where $\text{mod}(x,y)$ is equal to $x$ modulo $y$ , and $M$ and $N$ is the total number of rows and columns of the tiles, respectively. For example, in Fig. 4, the reference FoV is represented by the shaded area, which can be different for different segments. In segment $1$ , the top-left corner of the reference FoV is located at the tile on row $m_{0}=2$ and column $n_{0}=2$ , so the tile on row $m=2$ and column $n=4$ is indexed by $4\text{mod}(2-2,4)+(\text{mod}(4-2,4)+1)=3$ . Note that the 360-degree video has no boundary, e.g., in segment $1$ , tile $4$ is adjacent to the right-hand side of tile $3$ .

In the following, we use tile $(i,k)$ to denote tile $k$ of segment $i$ . Each tile is encoded at $|\mathcal{R}|$ bitrate levels, where each level is selected from bitrate set $\mathcal{R}=\{R_{1},R_{2},...,R_{|\mathcal{R}|}\}$ . The value of the bitrate represents the number of bits required to encode a tile corresponding to a playback time period of one second. Hence, the size (i.e., the total number of bits) of a tile of a $\beta$ -second segment selecting bitrate $r\in\mathcal{R}$ is $r\beta$ . Without loss of generality, we assume that $R_{1}<R_{2}<\cdots<R_{|\mathcal{R}|}$ .

2.1.2 User Model

During video streaming, the user’s downloading capacity and his FoV are time-varying. The user’s downloading capacity varies across time. We consider a continuous time interval $\mathcal{T}=[0,T)$ . Let $d(t)$ denote the user’s downloading capacity at time $t\in\mathcal{T}$ , i.e., the user’s maximum achievable downloading rate at time $t$ . We assume that $d(t)$ is upper- and lower-bounded, i.e., $d(t)\in[d^{\text{min}},d^{\text{max}}]$ for all $t\in\mathcal{T}$ . On the other hand, the user’s FoV varies across playback time (instead of the real time),²²2For example, a user starts playing the video at time $t=0$ sec, and the video rebuffers for one second during time interval $[1,2]$ (sec). Then, at real time $t=3$ sec, the playback time is at $3-(2-1)=2$ sec. because the user changes his FoV to watch his interested contents in the video, which is playback time-associated. We assume that when the user watches a segment, the user’s FoV is unchanged. This is reasonable because in practice, a segment always corresponds to a video piece of several seconds.

2.2 Problem Formulation

We aim to optimize the bitrate selection of a user to maximize the user’s QoE. In the following, we first introduce the decision variables. We then describe the constraints and the QoE. Finally, we formulate the QoE maximization problem.

2.2.1 Decision Variables

The user makes decisions on the bitrates of the tiles. Let $r_{i,k}\in\mathcal{R}$ denote the bitrate decision of tile $(i,k)$ . Let $\tau_{i,k}\in\mathcal{T}$ and $\hat{\tau}_{i,k}\in\mathcal{T}$ denote the time that the downloading of tile $(i,k)$ is started and finished, respectively. The user downloads the video tiles in a particular sequence: tile $(i,k)$ is downloaded earlier than tile $(i^{\prime},k^{\prime})$ if and only if either (a) $i<i^{\prime}$ or (b) $i=i^{\prime}$ and $k<k^{\prime}$ . When one tile has been downloaded, the downloading of the next tile will start immediately, i.e.,

{\tau}_{i,k}=\left\{\begin{array}[]{ll}\hat{\tau}_{i,k-1},&k>1,\\ \hat{\tau}_{i-1,K},&k=1.\end{array}\right.

(1)

Let $b_{i}$ denote the buffer occupancy when all the tiles of segment $i$ have been downloaded. We define the buffer occupancy with respect to the segment index for the presentation simplicity of buffer update (i.e., the buffer is being updated when a segment has been received). The buffer occupancy is in the unit of playback time, which is commonly used in the existing literature on dynamic adaptive streaming over HTTP (DASH), e.g., [26]. Receiving one segment will lead to a buffer occupancy increased by $\beta$ seconds.

Without loss of generality, we set $b_{0}={b}^{\textsc{ini}}$ as the initial buffer occupancy, and $\hat{\tau}_{0,K}={\tau}_{1,1}=0$ as the initial starting time. We denote the following decision vectors: $\bm{r}=(r_{i,k},i\in\mathcal{I},k\in\mathcal{K})$ , $\bm{r}_{i}=(r_{i,k},k\in\mathcal{K})$ , $\bm{\tau}=(\tau_{i,k},i\in\mathcal{I},k\in\mathcal{K})$ , $\hat{\bm{\tau}}=(\hat{\tau}_{i,k},i\in\mathcal{I},k\in\mathcal{K})$ , and $\bm{b}=(b_{i},i\in\mathcal{I})$ .

2.2.2 Downloading Capacity and Buffer Update Constraints

Within the downloading period of tile $(i,k)$ , the capacity constraint ensures that the total size of the downloaded tile should be no larger than the downloading capacity within the downloading period, i.e.,

r_{i,k}\beta\leq\int_{{\tau}_{i,k}}^{\hat{\tau}_{i,k}}d(t)dt,~{}i\in\mathcal{I},k\in\mathcal{K}.

(2)

When all the tiles of segment $i$ have been received, the buffer is updated as follows:

b_{i}=\left[b_{i-1}-\left(\hat{\tau}_{i,K}-{\tau}_{i,1}\right)\right]^{+}+\beta,~{}i\in\mathcal{I},

(3)

where the operator $[x]^{+}=\max\{x,0\}$ . In (3), if the buffer occupancy $b_{i-1}$ is no smaller than the downloading period, then buffer occupancy $b_{i}$ is the sum of the buffer occupancy immediately before receiving segment $i$ and the received segment length $\beta$ . If the buffer becomes empty before receiving segment $i$ , then buffer occupancy $b_{i}$ is equal to $\beta$ .

2.2.3 User’s QoE

For the definition of the user’s QoE, let $\omega_{i,k}$ denote the fraction of tile $(i,k)$ that is overlapped with the user’s FoV. We define a function $\mu_{i}(\bm{r}_{i})$ for any segment decision vector $\bm{r}_{i}$ , indicating the viewing bitrate when the user views segment $i$ under the bitrate decision, i.e.,

\mu_{i}(\bm{r}_{i})=\sum_{k\in\mathcal{K}}\omega_{i,k}r_{i,k},~{}i\in\mathcal{I}.

(4)

Note that this is the bitrate that the user actually views, taking into account the user’s FoV.

Similar to some of the existing works [6, 15], the user’s QoE consists of three terms, which are the user’s viewing utility $U(\bm{r})$ , the rebuffering loss $L^{\textsc{rb}}(\bm{\tau},\hat{\bm{\tau}},\bm{b})$ , and the bitrate degradation loss $L^{\textsc{bd}}(\bm{r})$ , i.e.,

Q(\bm{r},\bm{\tau},\hat{\bm{\tau}},\bm{b})=U(\bm{r})-L^{\textsc{rb}}(\bm{\tau},\hat{\bm{\tau}},\bm{b})-L^{\textsc{bd}}(\bm{r}).

(5)

The user’s viewing utility of each segment is a function of the user’s viewing bitrate of the segment, i.e., a larger viewing bitrate leads to a higher utility. The user’s viewing utility $U(\bm{r})$ is the summation of the user’s viewing utilities of all segments:

U(\bm{r})=\sum_{i\in\mathcal{I}}u_{i}(\mu_{i}(\bm{r}_{i})),

(6)

where $u_{i}(\cdot)$ for segment $i$ is a nondecreasing concave function. This is a generalization of those works considering linear viewing utility, e.g., [6, 5, 15].

If the tiles of a segment have not been completely received by the video player when the segment is to be played, then rebuffering will occur, and the video will freeze until the tiles of the segment are received. The rebuffering loss is the user’s loss resulting from the video freeze. This loss is proportional to the rebuffering time. Hence, the rebuffering loss is defined as follows:

L^{\textsc{rb}}(\bm{\tau},\hat{\bm{\tau}},\bm{b})=l^{\textsc{rb}}\sum_{i\in\mathcal{I}}\left[\hat{\tau}_{i,K}-\tau_{i,1}-b_{i-1}\right]^{+},

(7)

where $l^{\textsc{rb}}$ is the unit loss if the user experiences a one-second rebuffering. In (7), if the downloading duration of segment $i$ is larger than the buffer occupancy $b_{i-1}$ , then rebuffering will happen, leading to a rebuffering time of $\left[\hat{\tau}_{i,K}-\tau_{i,1}-b_{i-1}\right]^{+}$ .

The bitrate degradation loss consists of two parts: an inter-segment bitrate degradation loss $L^{\textsc{bd-e}}(\bm{r})$ and an intra-segment bitrate degradation loss $L^{\textsc{bd-a}}(\bm{r})$ . That is,

L^{\textsc{bd}}(\bm{r})=L^{\textsc{bd-e}}(\bm{r})+L^{\textsc{bd-a}}(\bm{r}).

(8)

The inter-segment bitrate degradation loss is the loss resulting from the bitrate degradation among segments, i.e.,

L^{\textsc{bd-e}}(\bm{r})=l^{\textsc{bd-e}}\sum_{i\in\mathcal{I}/\{1\}}\left[\mu_{i-1}(\bm{r}_{i-1})-\mu_{i}(\bm{r}_{i})\right]^{+},

(9)

where $l^{\textsc{bd-e}}$ is the loss if the viewing bitrate is degraded by one unit. In (9), if $\mu_{i-1}(\bm{r}_{i-1})$ is larger than $\mu_{i}(\bm{r}_{i})$ , then the inter-segment bitrate degradation occurs, leading to a degradation loss which is proportional to the size of the degradation. The intra-segment bitrate degradation loss is the loss resulting from the bitrate difference among the tiles of each segment, i.e.,

L^{\textsc{bd-a}}(\bm{r})=l^{\textsc{bd-a}}\sum_{i\in\mathcal{I}}\sum_{k\in\mathcal{K}}\omega_{i,k}\left[\frac{\mu_{i}(\bm{r}_{i})}{\sum_{k^{\prime}\in\mathcal{K}}\omega_{i,k^{\prime}}}-r_{i,k}\right]^{+},

(10)

where $l^{\textsc{bd-a}}$ is the loss if the bitrate is degraded by one unit. In (10), if $\omega_{i,k}$ is positive (i.e., the tile is being viewed), and the bitrate $r_{i,k}$ is smaller than the normalized viewing bitrate $\mu_{i}(\bm{r}_{i})/(\sum_{k^{\prime}\in\mathcal{K}}\omega_{i,k^{\prime}})$ , then the intra-segment bitrate degradation happens, inducing a degradation loss proportional to the size of the degradation.

2.2.4 Problem Formulation

We aim to determine the decision vectors $\bm{r}$ , $\bm{\tau}$ , $\hat{\bm{\tau}}$ , $\bm{b}$ to maximize the user’s QoE subject to the capacity and buffer update constraints. The problem is formulated as follows:


$\displaystyle\displaystyle\mathop{\textrm{missing}}{maximize}\limits_{\bm{r},\bm{\tau},\hat{\bm{\tau}},\bm{b}}$	$\displaystyle~{}~{}Q(\bm{r},\hat{\bm{\tau}},\bm{\tau},\bm{b})$	(11a)
subject to	$\displaystyle~{}~{}r_{i,k}\in\mathcal{R},~{}i\in\mathcal{I},k\in\mathcal{K},$	(11b)
	$\displaystyle~{}~{}0\leq\tau_{i,k}<T,~{}i\in\mathcal{I},k\in\mathcal{K},$	(11c)
	$\displaystyle~{}~{}0\leq\hat{\tau}_{i,k}<T,~{}i\in\mathcal{I},k\in\mathcal{K},$	(11d)
	$\displaystyle~{}~{}b_{i}\geq 0,~{}i\in\mathcal{I},$	(11e)
	$\displaystyle~{}~{}\text{constraints }\eqref{eq:download},\eqref{eq:cap-constraint},\eqref{eq:buffer-constraint}.$

Problem (11) is a mixed-integer nonconvex programming problem with nonlinear constraints (2) and (3). This problem is challenging to solve, even in an offline case when all the parameters (i.e., the user’s FoV and downloading capacity) are known beforehand.

In this work, we focus on the online algorithm design, which addresses the realistic scenario where the user’s FoV of a segment and the capacity for downloading the segment are unknown when the bitrate decision of the segment needs to be made.

3 Online Algorithm Design

In this section, we focus on the online algorithm design. For the design of the online algorithm, we consider a set of per-segment problems, each corresponding to one of the segments of the video. The bitrate decision of a segment will be made based on two kinds of bitrate decisions of a set of previous segments. The first is the actual bitrate decisions of the set of previous segments, which are the decisions made according to the proposed algorithm. The second is the optimal bitrate decisions of the set of previous segments, which are the decisions obtained by solving the corresponding per-segment problems according to the realizations of the real-time downloading capacities and FoVs of the segments. The performance of the algorithm will be characterized by dynamic regret[20, 19], reflecting the regret of the algorithm in the objective value. We show that our online algorithm yields sublinear dynamic regret, i.e., as the number of segments increases, the dynamic regret of the online algorithm approaches zero on the long-term average.

We first introduce the set of per-segment problems and the performance metric. Then, we introduce the online algorithm. After that, we show the performance guarantee of the proposed algorithm under particular conditions. Finally, we modify the online algorithm to address the scenario when the conditions are not satisfied in practice.

3.1 Per-Segment Problem

For the design of the online algorithm, we consider a set of per-segment optimization problems. The per-segment problem corresponding to segment $i\in\mathcal{I}$ aims to optimize the bitrate decision of segment $i$ at the time that the bitrate decisions of all the previous segments have been made.

3.1.1 Per-Segment Objective Function

For segment $i\in\mathcal{I}$ , we define a function $\widetilde{Q}_{i}(\bm{r}_{i}~{}|~{}\bm{r}_{i-1},b_{i-1})$ , which indicates the user’s QoE of segment $i$ under bitrate decision vector $\bm{r}_{i}$ , given the bitrate decision vector of segment $i-1$ (i.e., $\bm{r}_{i-1}$ ) and the buffer occupancy before downloading segment $i$ (i.e., $b_{i-1}$ ). The function $\widetilde{Q}_{i}(\bm{r}_{i}~{}|~{}\bm{r}_{i-1},b_{i-1})$ is given as follows:

\widetilde{Q}_{i}(\bm{r}_{i}~{}|~{}\bm{r}_{i-1},b_{i-1})\\ =u_{i}(\mu_{i}(\bm{r}_{i}))-l^{\textsc{rb}}\left(\frac{1}{\bar{d}_{i}}\sum_{k\in\mathcal{K}}r_{i,k}\beta-b_{i-1}\right)\\ -l^{\textsc{bd-a}}\sum_{k\in\mathcal{K}}\omega_{i,k}\left[\frac{\mu_{i}(\bm{r}_{i})}{\sum_{k^{\prime}\in\mathcal{K}}\omega_{i,k^{\prime}}}-r_{i,k}\right]^{+}\\ -l^{\textsc{bd-e}}\left(\mu_{i-1}(\bm{r}_{i-1})-\mu_{i}(\bm{r}_{i})\right),

(12)

where $\bar{d}_{i}$ is the average downloading capacity when segment $i$ is being downloaded. For simplification, we will use $\widetilde{Q}_{i}(\bm{r}_{i})$ to denote $\widetilde{Q}_{i}(\bm{r}_{i}~{}|~{}\bm{r}_{i-1},b_{i-1})$ in the rest of this paper.

In contrast to the user’s QoE for segment $i$ (as in Section 2.2.3), equation (12) has two major differences. First, the downloading time of segment $i$ in (12), i.e., ${\sum_{k\in\mathcal{K}}r_{i,k}\beta}/\bar{d}_{i}$ , is an approximation of the actual downloading time (as it is computed in (2)). This approximation is introduced to simplify the algorithm design, and is close to the actual downloading time, because the change of the downloading capacity within the downloading period of a segment is small in practice.³³3For example, downloading a two-second 4K segment (which has a bitrate of around $2\times 50$ Mbps on YouTube) using a 62 Mbps downlink (as the average mobile download speed in Canada in 2019 [27]) leads to a downloading period of $2\times 50/62\approx 1.6$ seconds. Second, in (12), we relax the operators $[\cdot]^{+}$ involved in the rebuffering loss and the inter-segment bitrate degradation loss. Intuitively, as we focus on per-segment problems, this relaxation can help with characterizing the impact of the bitrate decision of a segment on the user’s QoE of the subsequent segments. For example, the rebuffering loss may fail to capture the difference between downloading a segment (with smaller bitrates) for one second and downloading a segment (with larger bitrates) for five seconds, if neither downloading process induces rebuffering. The two downloading processes, however, may lead to different likelihood of having a rebuffering later due to the resulting different buffer occupancies after the downloading has been accomplished. This difference can be characterized if the operator $[\cdot]^{+}$ is relaxed. A similar reason applies for the inter-segment bitrate degradation loss.

3.1.2 Per-Segment Optimization Problem

Given $\bm{r}_{i-1}$ and $b_{i-1}$ , the per-segment problem for segment $i$ is to determine the bitrate decision $\bm{r}_{i}$ by maximizing $\widetilde{Q}_{i}(\bm{r}_{i})$ , i.e.,

\begin{array}[]{rl}\mathop{\textrm{missing}}{maximize}\limits_{\bm{r}_{i}}&~{}~{}\widetilde{Q}_{i}(\bm{r}_{i})\\ \textrm{subject to}&~{}~{}r_{i,k}\in{\mathcal{R}},~{}k\in\mathcal{K}.\end{array}

(OPT-SEGMENT-i)

Problem (OPT-SEGMENT-i) does not include constraints (1), (2), and (3) (as in problem (11)) for the following reasons. Constraint (1) is eliminated, because under the per-segment problem, the relationship between the downloading time of different segments does not need to be considered. Constraint (2) is eliminated due to the downloading time approximation. In addition, with the set of per-segment problems, at the time that the bitrate decision of segment $i$ is optimized, the buffer occupancy $b_{i-1}$ is directly observed, so the buffer update constraint (3) is not required.

3.2 Performance Metrics

Let $\bm{r}_{i}^{o}$ denote the bitrate decision of segment $i$ resulting from the online algorithm. We evaluate the performance of the online algorithm using dynamic regret [20, 19]. The dynamic regret is defined as the difference between the user’s QoE under the optimal bitrate selection decision (i.e., the decision after the realization of the user’s downloading capacity and FoV) and the user’s QoE under the actual bitrate selection decision determined by the proposed algorithm (i.e., the decision before the realization of the user’s downloading capacity and FoV). The definition of the dynamic regret is as follows:

\text{Reg}_{I}\triangleq\sum_{i\in\mathcal{I}}\left(\widetilde{Q}_{i}(\bm{r}_{i}^{*})-\widetilde{Q}_{i}(\bm{r}_{i}^{o})\right),

(13)

where $\bm{r}_{i}^{*}$ is the optimal bitrate decision for segment $i$ obtained by solving problem (OPT-SEGMENT-i). We aim to design an online algorithm that achieves sublinear dynamic regret, i.e.,

\lim_{I\rightarrow\infty}\text{Reg}_{I}/I=0,

(14)

which implies that as the number of segments approaches infinity, the dynamic regret of the online algorithm on the long-term average (i.e., $\text{Reg}_{I}/I$ ) approaches zero.

3.3 Online Bitrate Selection Algorithm

We design an online algorithm, called OBS360 algorithm. The online algorithm design is inspired by the OGD algorithms, e.g., [19, 20, 21]. Those existing algorithms [19, 20, 21], however, cannot be directly applied in VA360 video streaming service. This is because for those algorithms, the decision of an object (e.g., a segment) is made based on the decision of the previous object and the realization of the parameters related to that object. In VA360 video streaming, however, the parameter realization of the previous object may not have been observed when the decision of an object is to be made. To address this, in our proposed OBS360 algorithm, a decision is made based on the decisions of a set of objects whose parameter realizations have been observed and the corresponding parameter realizations of those objects. Such an algorithm design is more challenging than those existing ones, because the decision making of an object should take into account the parameter realizations and decisions of a set of objects as well as the parameter realization correlations among the objects. Under a special case that at the time when a decision is made, the parameter realization of the previous decision can always be obtained, our proposed algorithm is equivalent to the existing OGD algorithms [19, 20, 21].

In the following, we first introduce an auxiliary set for each segment, which indicates the segments whose parameter realizations (e.g., capacity for downloading the segment, FoV for viewing the segment) are observed after the bitrate decision of the previous segment is made. Then, we present our proposed OBS360 algorithm.

3.3.1 Auxiliary Set

Let $\tilde{I}_{i}$ denote the segment index such that segment $i$ has been viewed by the user during the time period when the bitrate decisions of segments $\tilde{I}_{i}$ and $\tilde{I}_{i}+1$ are made. Since a segment should be downloaded before it is being viewed, the FoV and downloading capacity information of segment $i$ has been observed by the video player when the bitrate decision of segment $\tilde{I}_{i}+1$ is to be made. The auxiliary set $\mathcal{I}_{i}\subset\mathcal{I}$ for segment $i\in\mathcal{I}\cup\{I+1\}$ is defined as follows:

\mathcal{I}_{i}\triangleq\{i^{\prime}\in\mathcal{I}~{}|~{}\tilde{I}_{i^{\prime}}=i-1\},

(15)

which is the set of segments which have been viewed by the user during the time period when the bitrate decisions of segments $i-1$ and $i$ are made. The consideration of $i=I+1$ is used to include the segments whose parameter realizations are observed after the bitrate decision of segment $I$ is made.⁴⁴4Note that given the bitrate decisions and the parameter realizations of all segments in $\mathcal{I}$ , the vectors $\mathcal{I}_{I+1}$ and $\bm{r}^{o}_{I+1}$ can be determined.

3.3.2 OBS360 Algorithm

The OBS360 algorithm is given in Algorithm 1. The algorithm first initializes the following parameters: an initial bitrate vector $\bm{r}_{0}$ , which can be any bitrate in set $\mathcal{R}^{K}$ ; a parameter $\alpha>0$ , which corresponds to the positive stepsize in the existing OGD algorithms [19, 20, 21].

Algorithm 1 OBS360 Algorithm

1: Initialization:

\bm{r}_{0}

and

\alpha

;

2: for all

i\in\mathcal{I}

3: Obtain

\mathcal{I}_{i}

according to (15);

4: if

\mathcal{I}_{i}\neq\emptyset

then

5: Compute

J_{i}

according to (17);

6: Update

\bm{r}^{o}_{i}

according to (16);

7: else

8: Update

\bm{r}^{o}_{i}

according to (18);

9: end if

10: end for

For any segment $i$ , the video player first obtains the auxiliary set $\mathcal{I}_{i}$ , which contains the indices of the segments whose FoV and downloading capacity information is newly observed as defined in (15). If the set $\mathcal{I}_{i}$ is non-empty, then the bitrate decision of segment $i$ will be made based on the bitrate decisions of the segments referred in set $\mathcal{I}_{i}$ and the corresponding downloading capacity and FoV information when each of those segments is downloaded and viewed, respectively. Specifically, the bitrate decision of segment $i$ is updated as follows:

\bm{r}_{i}^{o}=\arg\min_{\bm{r}_{i}\in{\mathcal{R}}^{K}}-\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})^{\top}(\bm{r}_{i}-\bm{r}_{J_{i}}^{o})+\frac{1}{2\alpha}\lVert\bm{r}_{i}-\bm{r}_{J_{i}}^{o}\rVert^{2},

(16)

where $\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})$ is the subgradient of $\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})$ , and $J_{i}$ is defined as

J_{i}\triangleq\arg\min_{j\in\mathcal{I}_{i}}-\nabla\widetilde{Q}_{j}(\bm{r}_{j}^{o})^{\top}(\bm{r}_{j}^{*}-\bm{r}_{j}^{o}).

(17)

Specifically, $J_{i}$ is the segment index in set $\mathcal{I}_{i}$ such that the dot product of the negative value of the subgradient $\nabla\widetilde{Q}_{j}(\bm{r}_{j}^{o})$ and the difference between the optimal decision of segment $J_{i}$ (obtained by solving the corresponding per-segment problem) and the actual decision of segment $J_{i}$ (obtained according to the algorithm) is the smallest, comparing with other segments in set $\mathcal{I}_{i}$ . If the set $\mathcal{I}_{i}$ is empty, then the bitrate decision of segment $i$ is the same as the bitrate decision of segment $i-1$ :

\bm{r}_{i}^{o}=\bm{r}_{i-1}^{o}.

(18)

In the special case when set $\mathcal{I}_{i}=\{i-1\}$ for all $i\in\mathcal{I}\cup\{I+1\}$ , i.e., the information of segment $i-1$ can always be observed during the time period when the decisions of segments $i-1$ and $i$ are made, Algorithm 1 is equivalent to the existing OGD algorithms in [19, 20, 21].

3.4 Performance Analysis

We proceed to show that Algorithm 1 leads to sublinear dynamic regret. Note that the analysis in this section is based on a case where the bitrate set $\mathcal{R}$ is convex. This is commonly assumed when the regret is derived in the existing online convex optimization algorithms [17, 18, 19, 20, 21, 22, 23]. Hence, our result on sublinear dynamic regret reveals the performance of the proposed algorithm under the particular case, and it provides an insight that as the number of segments increases, the performance of the proposed online algorithm approaches the offline optimal solution. The performance of the proposed algorithm over a practical bitrate set is evaluated using simulations in Section 4.

In the following, we first show a condition regarding the variation of the time-varying parameters, and then present the bound of the dynamic regret of Algorithm 1. Finally, we formally state the sublinear dynamic regret.

3.4.1 Condition on Parameter Variation

In online convex optimization techniques considering time-varying parameters, without restricting the varying of the parameters, obtaining a bound on dynamic regret is not possible [23]. Here, we impose a condition on the varying of the parameters as follows:

Condition 1 (Time-Varying Parameters).

The variation of the time-varying parameters, i.e., downloading capacity $(d(t),t\in\mathcal{T})$ and FoV $(\bm{\omega}_{i},i\in\mathcal{I})$ , should be small in the sense that there exist nonnegative values $V_{\emptyset}$ and $V_{r}$ such that the following inequalities hold for any number of segments $I$ :

•

The number of the set $\mathcal{I}_{i}$ that satisfies $\mathcal{I}_{i}=\emptyset$ for all $i\in\mathcal{I}\cup\{I+1\}$ is bounded, i.e.,

$\sum_{i\in\mathcal{I}\cup\{I+1\}}\mathbbm{1}(\mathcal{I}_{i}=\emptyset)\leq V_{\emptyset},$ (19)

where $\mathbbm{1}(\cdot)$ is an indicator function, i.e., $\mathbbm{1}(\mathcal{I}_{i}=\emptyset)=1$ if $\mathcal{I}_{i}=\emptyset$ , and is equal to zero otherwise.
•

Let $J_{i}^{{\dagger}}$ denote the index of the segment such that the bitrates of segment $J_{i}$ are made based on it, i.e., $J_{i}^{{\dagger}}=J_{i^{\prime}}$ with $i^{\prime}=J_{i}$ .⁵⁵5If $\mathcal{I}_{i}=\emptyset$ , we define $J_{i}\triangleq J_{h(i)}$ , where $h(i)=\{h~{}|~{}\mathcal{I}_{h}\neq\emptyset,\mathcal{I}_{h+1}=\mathcal{I}_{h+2}=\cdots=\mathcal{I}_{i}=\emptyset\}$ is the index of the segment which is the last segment before segment $i$ (i.e., $h(i)<i$ ) whose $\mathcal{I}_{h(i)}\neq\emptyset$ . The difference between the optimal solution to per-segment problem ${J_{i}}$ and that to per-segment problem ${J_{i}^{{\dagger}}}$ is small for all $i\in\mathcal{I}\cup\{I+1\}$ such that

$\sum_{i\in\mathcal{I}\cup\{I+1\}}|\mathcal{I}_{i}|\lVert\bm{r}^{*}_{J_{i}}-\bm{r}^{*}_{J_{i}^{{\dagger}}}\rVert\leq V_{r},$ (20)

where $|\mathcal{I}_{i}|$ is the cardinality of set $\mathcal{I}_{i}$ .

In Condition 1, inequality (19) implies that the number of the bitrate decisions that are not updated based on (16) is bounded. Inequality (20) ensures that the variations of the parameters are relatively small, such that the changes among the optimal solutions to the per-segment problems are bounded. In the special case when $\mathcal{I}_{i}=\{i-1\}$ for all $i\in\mathcal{I}\cup\{I+1\}$ , Condition 1 is equivalent to each of the parameter variation conditions in the existing works [20, 19] on OGD algorithms.

3.4.2 Bound of the Dynamic Regret

In the following, we first show the bound of the regret of each segment under Algorithm 1. We then show the bound of the dynamic regret.

The bound of the regret of each segment is given in Lemma 1. Note that in this lemma, we use two different subscripts $s$ and $i$ to refer to the indices of different segments in order to make the presentation clear.

Lemma 1 (Regret of Each Segment).

Under Condition 1 and convex decision set $\mathcal{R}$ , the regret of any segment $s\in\mathcal{I}$ , i.e., $\widetilde{Q}_{s}(\bm{r}^{*}_{s})-\widetilde{Q}_{s}(\bm{r}_{s}^{o})$ , is upper-bounded as follows:

(A)

For any segment $s\in\mathcal{I}_{i}$ , $i\in\mathcal{I}\cup\{I+1\}$ ,

\widetilde{Q}_{s}(\bm{r}^{*}_{s})-\widetilde{Q}_{s}(\bm{r}_{s}^{o})\leq\frac{1}{2\alpha}\Big{(}2R\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert\\ +\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\Big{)}+\frac{\alpha}{2}\overline{Q}^{2}.

(21)

(B)

For any segment $s\in\widetilde{\mathcal{I}}\triangleq\{s^{\prime}\in\mathcal{I}~{}|~{}s^{\prime}\notin\mathcal{I}_{i},i\in\mathcal{I}\cup\{I+1\}\}$ ,

$\widetilde{Q}_{s}(\bm{r}^{*}_{s})-\widetilde{Q}_{s}(\bm{r}_{s}^{o})\leq\frac{3R^{2}}{2\alpha}+\frac{\alpha}{2}\overline{Q}^{2}.$ (22)

The constant $\overline{Q}$ is the bound of the norm of the subgradient $\nabla\widetilde{Q}_{i}(\bm{r}_{i})$ , i.e., $\lVert\nabla\widetilde{Q}_{i}(\bm{r}_{i})\rVert\leq\overline{Q}$ for all $i\in\mathcal{I}$ and $\bm{r}_{i}\in{\mathcal{R}}^{K}$ . The constant $R$ is the radius of the convex set ${\mathcal{R}}^{K}$ , i.e., $\lVert\bm{r}_{i}-\bm{r}_{j}\rVert\leq R$ , for all $\bm{r}_{i},\bm{r}_{j}\in{\mathcal{R}}^{K}$ .

Proof.

For the proof of the statements (A) and (B) in Lemma 1, we first show an inequality that holds for any segment $i\in\mathcal{I}\cup\{I+1\}$ whose $\mathcal{I}_{i}\neq\emptyset$ . Specifically, for any of these segments, the bitrate decision $\bm{r}_{i}^{o}$ is the optimal solution of the following optimization problem (according to (16)):

\min_{\bm{r}_{i}\in{\mathcal{R}}^{K}}f_{i}(\bm{r}_{i})\triangleq-\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})^{\top}(\bm{r}_{i}-\bm{r}_{J_{i}}^{o})+\frac{1}{2\alpha}\lVert\bm{r}_{i}-\bm{r}_{J_{i}}^{o}\rVert^{2}.

(23)

It can be shown that function $f_{i}(\bm{r}_{i})$ in (23) is $1/\alpha$ -strongly convex (Section 3.4 in [28]) for any $\alpha>0$ , and this leads to the following inequality: for any $i\in\mathcal{I}\cup\{I+1\}$ with $\mathcal{I}_{i}\neq\emptyset$ ,

f_{i}(\bm{r}_{J_{i}}^{*})\geq f_{i}(\bm{r}_{i}^{o})+\frac{1}{2\alpha}\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}.

(24)

Based on inequality (24), we now prove the statements (A) and (B) in Lemma 1 as follows.

Proof for Statement (A): For any $s\in\mathcal{I}_{i}$ with $i\in\mathcal{I}\cup\{I+1\}$ , by adding $-\widetilde{Q}_{s}(\bm{r}_{s}^{o})$ to both sides of the inequality (24), substituting $f_{i}(\bm{r}_{J_{i}}^{*})$ and $f_{i}(\bm{r}_{i}^{o})$ according to (23), and reordering the inequality, we have

\displaystyle\begin{split}&~{}-\widetilde{Q}_{s}(\bm{r}_{s}^{o})-\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})^{\top}(\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o})\\ \overset{\text{(a)}}{\leq}&~{}\frac{1}{2\alpha}\left(\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}}^{o}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}-\lVert\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o}\rVert^{2}\right)\\ &~{}-\widetilde{Q}_{s}(\bm{r}_{s}^{o})-\nabla\widetilde{Q}_{s}(\bm{r}_{s}^{o})^{\top}(\bm{r}_{s}^{*}-\bm{r}_{s}^{o})\\ \overset{\text{(b)}}{\leq}&~{}\frac{1}{2\alpha}\left(\lVert\bm{r}_{J_{i}}^{*}\!-\!\bm{r}_{J_{i}}^{o}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}\!-\!\bm{r}_{i}^{o}\rVert^{2}-\lVert\bm{r}_{i}^{o}\!-\!\bm{r}_{J_{i}}^{o}\rVert^{2}\right)\\ &~{}-\widetilde{Q}_{s}(\bm{r}^{*}_{s}),\end{split}

(25)

where (a) uses the definition of $J_{i}$ in (17), and (b) is due to the concavity of $\widetilde{Q}_{s}(\bm{r}_{s})$ . In inequality (25), $\nabla\widetilde{Q}_{J_{i}}(\bm{r}^{o}_{J_{i}})^{\top}(\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o})$ satisfies

\displaystyle\begin{split}&\nabla\widetilde{Q}_{J_{i}}(\bm{r}^{o}_{J_{i}})^{\top}(\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o})\leq\lVert\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})\rVert\lVert\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o}\rVert\\ \leq&~{}\frac{\lVert\nabla\widetilde{Q}_{J_{i}}(\bm{r}_{J_{i}}^{o})\rVert^{2}}{2\eta}+\frac{\eta\lVert\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o}\rVert^{2}}{2}\\ \leq&~{}\frac{\overline{Q}^{2}}{2\eta}+\frac{\eta\lVert\bm{r}_{i}^{o}-\bm{r}_{J_{i}}^{o}\rVert^{2}}{2},\end{split}

(26)

where $\eta$ is a positive constant.

Substituting (26) into (25) and rearranging it, we obtain

\displaystyle\begin{split}&\widetilde{Q}_{s}(\bm{r}^{*}_{s})\!-\!\widetilde{Q}_{s}(\bm{r}_{s}^{o})\!\overset{\text{(c)}}{\leq}\!~{}\frac{\lVert\bm{r}_{J_{i}}^{*}\!-\!\bm{r}_{J_{i}}^{o}\rVert^{2}\!-\!\lVert\bm{r}_{J_{i}}^{*}\!-\!\bm{r}_{i}^{o}\rVert^{2}}{2\alpha}\!+\!\frac{\alpha}{2}\overline{Q}^{2},\end{split}

(27)

where (c) follows by setting $\eta=1/\alpha$ , i.e., $\eta/2-1/(2\alpha)=0$ . In (27), $\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}}^{o}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}$ satisfies

\displaystyle\begin{split}&~{}\lVert\bm{r}^{*}_{J_{i}}-\bm{r}_{J_{i}}^{o}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\\ =&~{}\lVert\bm{r}^{*}_{J_{i}}-\bm{r}_{J_{i}}^{o}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}+\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}\\ &~{}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\\ =&~{}\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert\lVert\bm{r}^{*}_{J_{i}}-2\bm{r}_{J_{i}}^{o}+\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert+\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}\\ &~{}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\\ \overset{\text{(d)}}{\leq}&~{}2R\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert+\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2},\end{split}

(28)

where (d) is due to $\lVert\bm{r}_{i}-\bm{r}_{j}\rVert\leq R,~{}\forall\bm{r}_{i},\bm{r}_{j}\in{\mathcal{R}}^{K}$ .

Substituting (28) into (27), we obtain inequality (21).

Proof for Statement (B): We first introduce an index $\widehat{I}=\{\widehat{I}>I~{}|~{}\mathcal{I}_{\widehat{I}}\neq\emptyset,\mathcal{I}_{I+1}=\mathcal{I}_{I+2}=\cdots=\mathcal{I}_{\widehat{I}-1}=\emptyset\}$ , i.e., the first segment after segment $I$ such that $\mathcal{I}_{\widehat{I}}\neq\emptyset$ . Given the enlarged set of segments, i.e., $\mathcal{I}\cup\{I+1,\cdots,\widehat{I}\}$ , according to inequality (21) in Lemma 1, we obtain (22).∎

Based on Lemma 1, we show that the dynamic regret of Algorithm 1 is upper-bounded in Theorem 1.

Theorem 1 (Dynamic Regret).

Under Condition 1 and convex decision set $\mathcal{R}$ , the dynamic regret of Algorithm 1 is upper-bounded by

\text{Reg}_{I}\leq\frac{R^{2}\left(1+V_{\emptyset}\right)}{2\alpha}+\frac{RV_{r}}{\alpha}+\frac{\alpha\overline{Q}^{2}I}{2}\\ +\mathbbm{1}({\widetilde{\mathcal{I}}}\neq\emptyset)\frac{KR_{|\mathcal{R}|}}{d^{\emph{\text{min}}}}\left(\frac{3R^{2}}{2\alpha}+\frac{\alpha}{2}\overline{Q}^{2}\right).

(29)

Proof.

Based on Lemma 1, taking the summation over all the segments $s\in\mathcal{I}_{i}$ for all $i\in\mathcal{I}\cup\{I+1\}$ , we have

\displaystyle\begin{split}&~{}\sum_{i\in\mathcal{I}\cup\{I+1\}}\sum_{s\in\mathcal{I}_{i}}\left(\widetilde{Q}_{s}(\bm{r}_{s}^{o})-\widetilde{Q}_{s}(\bm{r}^{*}_{s})\right)\\ {\leq}&~{}\sum_{i\in\mathcal{I}\cup\{I+1\}}\frac{1}{2\alpha}|\mathcal{I}_{i}|\big{(}2R\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert\\ &+\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\big{)}+\frac{\alpha\overline{Q}^{2}I}{2}\\ \overset{\text{(a)}}{\leq}&~{}\sum_{i\in\mathcal{I}\cup\{I+1\}}\frac{1}{2\alpha}|\mathcal{I}_{i}|\big{(}\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{\dagger}}^{*}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\big{)}\\ &~{}+\frac{RV_{r}}{\alpha}+\frac{\alpha\overline{Q}^{2}I}{2}\\ \overset{\text{(b)}}{\leq}&~{}\frac{R^{2}\left(1+V_{\emptyset}\right)}{2\alpha}+\frac{RV_{r}}{\alpha}+\frac{\alpha\overline{Q}^{2}I}{2}.\end{split}

(30)

Inequality (a) is due to the condition in (20). Inequality (b) holds because the following inequality holds:

\displaystyle\begin{split}&\sum_{i\in\mathcal{I}\cup\{I+1\}}|\mathcal{I}_{i}|\big{(}\lVert\bm{r}_{J_{i}}^{o}-\bm{r}_{J_{i}^{{\dagger}}}^{*}\rVert^{2}-\lVert\bm{r}_{J_{i}}^{*}-\bm{r}_{i}^{o}\rVert^{2}\big{)}\\ \leq&~{}\sum_{i\in\mathcal{I}\cup\{I+1\}}\left(1+\left[|\mathcal{I}_{i}|-1\right]^{+}\right)R^{2}\leq(1+V_{\emptyset})R^{2}.\end{split}

(31)

Taking the summation over all the segments $s\in\widetilde{\mathcal{I}}$ , we have

\displaystyle\begin{split}&~{}\sum_{s\in\widetilde{\mathcal{I}}}\left(\widetilde{Q}_{s}(\bm{r}^{*}_{s})-\widetilde{Q}_{s}(\bm{r}_{s}^{o})\right)\overset{\text{(c)}}{\leq}\frac{KR_{|\mathcal{R}|}}{d^{\text{min}}}\left(\frac{3R^{2}}{2\alpha}+\frac{\alpha}{2}\overline{Q}^{2}\right),\end{split}

(32)

where (c) holds because $(K{R_{|\mathcal{R}|}\beta})/({d^{\text{min}}}{\beta})={KR_{|\mathcal{R}|}}/{d^{\text{min}}}$ is the upper-bound of the cardinality of $\widetilde{\mathcal{I}}$ . By summing up (30) and (32), we can obtain the regret bound (29). ∎

3.4.3 Sublinear Dynamic Regret

Based on Theorem 1, the dynamic regret of Algorithm 1 is sublinear under particular setting of $\alpha$ .

Corollary 1 (Sublinear Dynamic Regret).

Under Condition 1 and convex decision set $\mathcal{R}$ , by setting $\alpha=\alpha_{0}I^{-{1}/{\gamma}}$ , for any $\gamma\in(1,\infty)$ , the dynamic regret of Algorithm 1 is sublinear, i.e., $\lim_{I\rightarrow\infty}\text{Reg}_{I}/I=0$ .

Proof.

For any $\gamma\in(1,\infty)$ ,

\lim_{I\rightarrow\infty}\text{Reg}_{I}/I\leq\lim_{I\rightarrow\infty}\frac{R^{2}\left(1\!+\!V_{\emptyset}\right)}{2\alpha_{0}I^{-\frac{1}{\gamma}}I}+\frac{RV_{r}}{\alpha_{0}I^{-\frac{1}{\gamma}}I}+\frac{\alpha_{0}I^{-\frac{1}{\gamma}}\overline{Q}^{2}}{2}\\ +\mathbbm{1}({\widetilde{\mathcal{I}}}\neq\emptyset)\frac{KR_{|\mathcal{R}|}}{d^{\text{min}}}\left(\frac{3R^{2}}{2\alpha_{0}I^{-\frac{1}{\gamma}}I}+\frac{\alpha_{0}I^{-\frac{1}{\gamma}}}{2I}\overline{Q}^{2}\right)=0,

(33)

and $\lim_{I\rightarrow\infty}\text{Reg}_{I}/I\geq 0$ . ∎

Corollary 1 implies that as the number of segments increases, the performance of the proposed online algorithm approaches the performance under the optimal solutions to the per-segment problems (OPT-SEGMENT-i) for $i\in\mathcal{I}$ .

3.5 Algorithm Modification

Algorithm 1 is capable of learning the user’s FoV preference and time-varying downloading capacity in real time, and it is proven to have sublinear dynamic regret under Condition 1. In practice, however, the downloading capacity may sometimes have significant increase or decrease within a short time (with statistics from a real-world dataset to be shown in Section 4.1), under which Condition 1 may sometimes not be satisfied, and this may induce sudden changes to the bitrate decisions.

To alleviate the impact of the variation of the downloading capacity, we modify Algorithm 1 by restricting the increase or decrease of the bitrate of each tile to be at most one level at each time. Specifically, after computing $\bm{r}^{o}_{i}$ for segment $i\in\mathcal{I}$ according to (16), we compute a modified bitrate decision $\tilde{\bm{r}}^{o}_{i}=(\tilde{{r}}^{o}_{i,k},k\in\mathcal{K})$ , and use it as the bitrate choice of the online algorithm, i.e., assign the value of vector $\tilde{\bm{r}}^{o}_{i}$ to vector ${\bm{r}}^{o}_{i}$ . The bitrate vector $\tilde{\bm{r}}^{o}_{i}$ is defined as follows. Let $\tilde{l}^{o}_{i,k}$ and ${l}^{o}_{i,k}$ denote the bitrate level of the bitrate vectors $\tilde{r}^{o}_{i,k}$ and ${r}^{o}_{i,k}$ , i.e., $R_{\tilde{l}^{o}_{i,k}}=\tilde{r}^{o}_{i,k}$ and $R_{{l}^{o}_{i,k}}={r}^{o}_{i,k}$ , respectively. Given bitrate vectors ${\bm{r}}^{o}_{i}$ and $\tilde{\bm{r}}^{o}_{i-1}$ , vector $\tilde{\bm{r}}^{o}_{i}$ is computed as follows: for all $i\in\mathcal{I}$ and $k\in\mathcal{K}$ ,

\tilde{r}^{o}_{i,k}=\left\{\begin{array}[]{ll}R_{\tilde{l}^{o}_{i-1,k}+1},&{l}^{o}_{i,k}>\tilde{l}^{o}_{i-1,k}+1,\\ R_{\tilde{l}^{o}_{i-1,k}-1},&{l}^{o}_{i,k}<\tilde{l}^{o}_{i-1,k}-1,\\ {r}^{o}_{i,k},&\text{otherwise}.\\ \end{array}\right.

(34)

Intuitively, when the bitrate decision of a tile computed according to (16) is increased (or decreased) by more than one level when compared with the bitrate of the tile of the previous segment, we restrict the increase (or decrease) of the bitrate to be one level so as to avoid the aggressive bitrate change caused by the significant variation of the downloading capacity.

4 Performance Evaluation

In this section, we evaluate the performance of our proposed OBS360 algorithm. In the simulations, we use two open datasets to simulate the 360 degree video streaming scenarios: we use the dataset in [24] to simulate the users’ downloading capacities, and use the dataset in [25] to simulate users’ FoVs. The coefficients are as follows: $l^{\textsc{rb}}=0.5$ , $l^{\textsc{bd-e}}=0.1$ , and $l^{\textsc{bd-a}}=0.1$ . In addition, initial buffer occupancy $b^{\textsc{ini}}$ is set to two seconds, segment length $\beta$ is set to one second, and parameter $\alpha$ is set to one.

In the following, we first introduce the two open datasets from [24] and [25]. Then, we present the simulation results, including the video streaming instance using the proposed OBS360 algorithm, the comparison between OBS360 algorithm and the offline optimal performance, and the comparison between OBS360 algorithm and benchmark methods.

4.1 Datasets

The dataset in [24] contains the bandwidth measurement results in 4G networks in the city of Ghent, Belgium. There are a total of 40 logs, which are collected on various transportation, such as on foot, bicycle, bus, and train. Each log has a duration ranging from 166 to 758 seconds. In our simulation, we select one sample within each second to compute the downloading capacity of the second. Fig. 5 (a) shows the cumulative distribution function (CDF) of the downloading capacity of these selected samples. As shown in the figure, $50\%$ of the downloading capacities are below 33 Mbps, and $80\%$ of them are below 48 Mbps. Fig. 5 (b) shows the CDF of the absolute downloading capacity difference between adjacent selected samples. As shown in the figure, more than $20\%$ of the adjacent samples have a downloading capacity difference that is larger than $15$ Mbps. This implies that the downloading capacity can have significant sudden changes, which makes it challenging for a video player to decide the bitrate of each tile for VA360 video streaming services.

The dataset in [25] contains the traces of 8 videos. For each video, there is one recommended viewing trace (marked by professional filemakers) and 20 actual viewing traces from 20 users. In the dataset, either the recommended trace or each user’s trace of watching a video is represented by a viewing degree trace $((p_{1},y_{1}),(p_{2},y_{2}),\ldots,(p_{I},y_{I}))$ , where $p_{i}\in[-90^{\circ},90^{\circ}]$ and $y_{i}\in[-180^{\circ},180^{\circ}]$ are the pitch (vertical degree) and yaw (horizontal degree) of the corresponding viewport of segment $i=1,2,\ldots,I$ , respectively. The actual viewing degree trace is then transferred to a viewing FoV trace $\bm{\omega}=(\omega_{i,k},i\in\mathcal{I},k\in\mathcal{K})$ according to the definition of FoV in Section 2.2.3, with the recommended viewing FoV as the reference FoV. Fig. 6 shows the average overlap between the users’ FoVs and the reference FoV of different videos, where the x-axis corresponds to the titles or title abbreviations of the videos. As shown in Fig. 6, these videos have different average overlap percentages, while most of the percentages are around $60\%$ .

4.2 Streaming Instance Using OBS360 Algorithm

To visualize the bitrate selection of the proposed OBS360 algorithm, in Fig. 7, we show a video streaming scheduling instance under a downloading capacity trace from the dataset in [24] and a FoV trace from the dataset in [25]. In this simulation, we consider a simplified setting of two tiles placed in a one by two grid. This simplification is for illustrating the bitrate selection of each tile and obtaining intuitions from the video streaming scheduling. We consider each user can see half of the whole view at any time, and the available bitrate set for each tile is $\mathcal{R}=\{1,2.5,5,8,16,40\}$ (Mbps). As a result, if all the tiles for a segment have the same bitrate from set $\mathcal{R}$ , the viewing bitrate of the segment is in set $\{1,2.5,5,8,16,40\}$ (Mbps), as the recommended bitrate set in YouTube [29]. The initial bitrate of each tile is set to be 5 Mbps. Note that such a setting is for demonstration, and a more realistic case with 16 tiles, as in [5, 6], will be evaluated in Section 4.4.

Fig. 7 (a) shows a downloading capacity trace, where the difference between adjacent capacity samples can be up to 20 Mbps. This trace is from [24] and is modified to have an increasing trend across the real time for the convenience of performance demonstration. Fig. 7 (b) shows the user’s FoV trace. This FoV varies dramatically across the playback time, and majority parts of the FoV are on tile 1 at most of the playback time. Fig. 7 (c) shows the bitrate selection result of the proposed OBS360 algorithm. As shown in the figure, the bitrates of both tiles are initialized as 5 Mbps, so the viewing bitrate is 5 Mbps at the beginning of the streaming. Through learning, the algorithm gradually increases the bitrate of tile 1 from 5 Mbps to 16 Mbps at around 6 seconds (playback time), and hence the user’s viewing bitrate significantly increases to around 16 Mbps (approximately a 2K video [29]). After around 42 seconds (real time), the downloading capacity significantly increases to around 40 Mbps, so the bitrate of tile 1 further increases, and the user’s viewing bitrate increases to around 40 Mbps (approximately a 4K video [29]). In terms of the rebufferring, it happens at around 8, 9, 22, and 23 seconds (real time), which is mainly due to the sudden decrease of the downloading capacity at around 7 and 21 seconds. Note that despite the significant varying of the downloading capacity in the trace, each of the rebuffering lasts only a few hundred milliseconds, which is hardly noticeable by the user [30].

4.3 OBS360 and Offline Optimal Solution

We compare our proposed OBS360 algorithm with the offline optimal solution, i.e., the optimal solution to problem (11) when the user’s FoV and downloading capacity are known beforehand. Due to the computational complexity of computing the offline optimal solution, we consider the simplified VA360 video streaming setting as in Section 4.2. The simulations are performed based on the traces of all eight videos in the dataset from [25]. For each video, we consider 20 users, each corresponding to a downloading capacity trace from the dataset in [24] and a FoV trace from the dataset in [25]. The performance is shown in Fig. 8. The bars show the corresponding average values over the users with different FoVs and downloading capacities, and the error bars show the corresponding variance across the users.

As shown in Fig. 8 (a), the proposed OBS360 algorithm achieves $90.1\%$ of the user’s QoE of the offline optimal solution on average. Specifically, the proposed algorithm achieves a viewing bitrate of around 16 Mbps (approximately a 2K video [29]), which is $75.8\%$ of the viewing bitrate of the offline optimal performance, as shown in Fig. 8 (b). In addition, it achieves a lower inter-segment degradation, a lower intra-segment degradation, and a lower rebuffering than the offline optimal solution, as shown in Figs. 8 (c)-(e). Intuitively, when compared with the offline optimal performance, although the proposed OBS360 algorithm has a lower viewing bitrate, it avoids frequent rebuffering and bitrate degradation.

4.4 OBS360 and Benchmark Methods

We perform simulations with the open datasets from [24] and [25] to compare the performance of our proposed OBS360 algorithm with the BAS-360^∘ algorithm in [5] and the Flare algorithm in [6]. The Flare and BAS-360^∘ algorithms are bitrate selection algorithms based on user viewing history. The Flare algorithm aims to maximize the predicted viewing bitrate subject to bandwidth constraints, and BAS-360^∘ algorithm aims to minimize the bandwidth waste (i.e., the bandwidth that can be further utilized without inducing rebuffering and the bandwidth used for downloading the unviewed tiles).

In our simulations, each segment is divided into 16 tiles in a four by four grid, and a user can see one quarter of the whole view at any time, as in [5, 6]. The available bitrate set for each tile is $\mathcal{R}=\{0.25,0.625,1.25,2,4,10\}$ (Mbps). In this case, if all the tiles for a segment have the same bitrate from set $\mathcal{R}$ , the viewing bitrate of the segment is within set $\{1,2.5,5,8,16,40\}$ (Mbps), which is consistent with the recommended bitrate set in YouTube[29]. We perform the simulations based on the traces of all eight videos [25], and consider 20 users, each corresponding to a downloading capacity trace from the dataset in [24] and a FoV trace from the dataset in [25]. Fig. 9 shows the performance comparisons between our proposed algorithm and the benchmark methods.

Fig. 9 (a) shows the comparison regarding the users’ QoE. Specifically, the proposed OBS360 outperforms Flare and BAS-360^∘ significantly for all the videos. In addition, the performances of the benchmark methods highly depend on the average overlap percentages between the users’ FoVs and the reference FoV of the videos, while the performance of the proposed algorithm does not. For example, for videos ‘cin.’ and ‘luth’, their overlap percentages are the largest among all videos (in Fig. 6), and their corresponding QoE using the benchmark methods are the largest among all videos (in Fig. 9 (a)) as well. In comparison, our proposed OBS360 algorithm achieves similar user’s QoE under the various videos, regardless of the average overlap percentages of those videos. This shows that the proposed algorithm can learn the users’ preferences over the FoV, and adapt the bitrates according to learned preferences.

Fig. 9 (b) shows that the proposed OBS360 algorithm can improve the viewing bitrate significantly by $24.6\%-58.8\%$ when compared with the benchmark methods. This improvement is mainly due to the capability of the proposed algorithm on learning the users’ FoV preferences. Figs. 9 (c) and (d) show that the proposed OBS algorithm can significantly reduce the inter-segment and intra-segment degradation when compared with the benchmark methods. The inter-segment degradation is reduced by $83.3\%-91.0\%$ , and the intra-segment degradation is reduced by $84.1\%-89.1\%$ . Fig. 9 (e) shows the comparison regarding the rebuffering per segment. Although the proposed OBS360 algorithm has larger rebuferring than the Flare and BAS-360^∘ algorithms, the absolute values of the rebuffering are quite small (i.e., less than 20 milliseconds per one-second segment), which can hardly be noticed by the users.

In summary, our proposed algorithm can significantly improve the users’ QoE when compared with the benchmark methods. This is achieved by increasing the user’s viewing bitrate and reducing the inter-segment and intra-segment degradation losses.

5 Conclusion

In this work, we considered a scenario with a newly generated 360-degree video without viewing history from other users. We proposed an OBS360 algorithm to optimize the user’s QoE. The proposed online algorithm can adapt to the unknown and heterogeneous users’ FoVs and downloading capacities. In addition, the algorithm was proven to have sublinear dynamic regret under a convex decision set, which provides an intuition that as the number of segments increases, the performance of the online algorithm approaches the offline optimal performance. We performed simulations with real-world datasets regarding the users’ FoVs and downloading capacities. The results show that our proposed algorithm achieves $90.1\%$ of the users’ QoE of the offline optimal performance. In addition, when compared with Flare and BAS-360^∘ algorithms from existing works, our proposed algorithm achieves a higher QoE through improving the viewing bitrate and reducing the inter-segment and intra-segment degradation losses of the users.

The results in this paper can be extended in the following directions. First, the proposed algorithm achieves sublinear dynamics regret under a convex decision set. It is interesting to design an algorithm that can achieve sublinear dynamic regret under a nonconvex decision set. This will be challenging, because the convex decision set is an important condition in online convex optimization for ensuring the sublinearity. Second, it is interesting to design a bitrate selection algorithm by both learning the user’s features (i.e., FoV preference and downloading capacity) and exploiting the viewing history of other users.

References

[1] “Sony PlayStation VR,” https://www.playstation.com/en-au/explore/playstation-vr/, Accessed on Feb. 10, 2020.
[2] M. Hosseini and V. Swaminathan, “Adaptive 360 VR video streaming: Divide and conquer,” in Proc. IEEE International Symposium on Multimedia (ISM), San Jose, CA, Dec. 2016.
[3] L. D’Acunto, J. van den Berg, E. Thomas, and O. Niamut, “Using MPEG DASH SRD for zoomable and navigable video,” in Proc. ACM Int’l Conf. on Multimedia Systems (MMSys), Klagenfurt, Austria, May 2016.
[4] T. Ballard, C. Griwodz, R. Steinmetz, and A. Rizk, “RATS: Adaptive 360-degree live streaming,” in Proc. ACM Int’l Conf. on Multimedia Systems (MMSys), Amherst, MA, Jun. 2019.
[5] M. Xiao, C. Zhou, V. Swaminathan, Y. Liu, and S. Chen, “BAS-360^∘: Exploring spatial and temporal adaptability in 360-degree videos over HTTP/2,” in Proc. IEEE INFOCOM, Honolulu, HI, Apr. 2018.
[6] F. Qian, B. Han, Q. Xiao, and V. Gopalakrishnan, “Flare: Practical viewport-adaptive 360-degree video streaming for mobile devices,” in Proc. ACM MobiCom, New Delhi, India, Oct. 2018.
[7] C. Zhou, M. Xiao, and Y. Liu, “Clustile: Toward minimizing bandwidth in 360-degree video streaming,” in Proc. IEEE INFOCOM, Honolulu, HI, Apr. 2018.
[8] Z. Jiang, X. Zhang, W. Huang, H. Chen, Y. Xu, J. N. Hwang, Z. Ma, and J. Sun, “A hierarchical buffer management approach to rate adaptation for 360-degree video streaming,” IEEE Trans. Veh. Technol., 2019 (Early Access).
[9] L. Xie, X. Zhang, and Z. Guo, “CLS: A cross-user learning based system for improving QoE in 360-degree video adaptive streaming,” in Proc. ACM Int’l Conf. on Multimedia (MM), Seoul, Republic of Korea, Oct. 2018.
[10] L. Sun, F. Duanmu, Y. Liu, Y. Wang, Y. Ye, H. Shi, and D. Dai, “A two-tier system for on-demand streaming of 360 degree video over dynamic networks,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 9, no. 1, pp. 43–57, Mar. 2019.
[11] H. Yuan, S. Zhao, J. Hou, X. Wei, and S. Kwong, “Spatial and temporal consistency-aware dynamic adaptive streaming for 360-degree videos,” IEEE J. Sel. Topics Signal Process., 2019 (Early Access).
[12] J. Le Feuvre and C. Concolato, “Tiled-based adaptive streaming using MPEG-DASH,” in Proc. ACM Int’l Conf. on Multimedia Systems (MMSys), Klagenfurt, Austria, May 2016.
[13] S. Shi, V. Gupta, M. Hwang, and R. Jana, “Mobile VR on edge cloud: A latency-driven design,” in Proc. ACM Int’l Conf. on Multimedia Systems (MMSys), Amherst, MA, Jun. 2019.
[14] D. V. Nguyen, H. T. T. Tran, A. T. Pham, and T. C. Thang, “An optimal tile-based approach for viewport-adaptive 360-degree video streaming,” IEEE Trans. Emerg. Sel. Topics Circuits Syst., vol. 9, no. 1, pp. 29–42, Mar. 2019.
[15] Y. Zhang, P. Zhao, K. Bian, Y. Liu, L. Song, and X. Li, “DRL360: 360-degree video streaming with deep reinforcement learning,” in Proc. IEEE INFOCOM, Paris, France, Apr. 2019.
[16] H. Pang, C. Zhang, F. Wang, J. Liu, and L. Sun, “Towards low latency multi-viewpoint 360^∘ interactive video: A multimodal deep reinforcement learning approach,” in Proc. IEEE INFOCOM, Paris, France, Apr. 2019.
[17] E. Hazan, “Introduction to online convex optimization,” Foundations and Trends in Optimization, vol. 2, no. 3-4, pp. 157–325, Aug. 2016.
[18] S. Shalev-Shwartz, “Online learning and online convex optimization,” Foundations and Trends in Machine Learning, vol. 4, no. 2, pp. 107–194, Feb. 2011.
[19] A. Mokhtari, S. Shahrampour, A. Jadbabaie, and A. Ribeiro, “Online optimization in dynamic environments: Improved regret rates for strongly convex problems,” in Proc. IEEE CDC, Las Vegas, NV, Dec. 2016.
[20] T. Chen, Q. Ling, and G. B. Giannakis, “An online convex optimization approach to proactive network resource allocation,” IEEE Trans. Signal Process., vol. 65, no. 24, pp. 6350–6364, Dec. 2017.
[21] S. Paternain and A. Ribeiro, “Online learning of feasible strategies in unknown environments,” IEEE Trans. Autom. Control, vol. 62, no. 6, pp. 2807–2822, Nov. 2017.
[22] E. C. Hall and R. M. Willett, “Online convex optimization in dynamic environments,” IEEE J. Sel. Topics Signal Process., vol. 9, no. 4, pp. 647–662, Jun. 2015.
[23] A. Jadbabaie, A. Rakhlin, S. Shahrampour, and K. Sridharan, “Online optimization: Competing with dynamic comparators,” in Proc. Int’l Conf. on Artificial Intelligence and Statistics (AISTATS), San Diego, CA, May 2015.
[24] J. Van Der Hooft, S. Petrangeli, T. Wauters, R. Huysegems, P. R. Alface, T. Bostoen, and F. De Turck, “HTTP/2-based adaptive streaming of HEVC video over 4G/LTE networks,” IEEE Commun. Lett., vol. 20, no. 11, pp. 2177–2180, Nov. 2016.
[25] S. Knorr, C. Ozcinar, C. O. Fearghail, and A. Smolic, “Director’s cut: A combined dataset for visual attention analysis in cinematic VR content,” in Proc. ACM SIGGRAPH, London, United Kingdom, Dec. 2018.
[26] T.-Y. Huang, R. Johari, N. McKeown, M. Trunnell, and M. Watson, “A buffer-based approach to rate adaptation: Evidence from a large video streaming service,” in Proc. ACM SIGCOMM, Chicago, IL, Aug. 2014.
[27] Speedtest, “Speedtest market report: Canada average mobile download speed based on Q2-Q3 data in 2019,” https://www.speedtest.net/reports/canada/, Accessed on Feb. 10, 2020.
[28] S. Bubeck, “Theory of convex optimization for machine learning,” arXiv preprint arXiv:1405.4980, 2014.
[29] YouTube Help, “Recommended upload encoding settings,” https://support.google.com/youtube/answer/1722171?hl=en, Accessed on Feb. 10, 2020.
[30] T. De Pessemier, K. De Moor, W. Joseph, L. De Marez, and L. Martens, “Quantifying the influence of rebuffering interruptions on the user’s quality of experience during mobile video watching,” IEEE Trans. Broadcast., vol. 59, no. 1, pp. 47–61, Mar. 2012.