
Cache Allocations for Consecutive Requests of Categorized Contents: Service Provider’s Perspective

Minseok Choi†∗, Andreas F. Molisch, Dong-Jun Han, Joongheon Kim, and Jaekyun Moon
Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, USA
School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
School of Electrical Engineering, Korea University, Seoul, South Korea
E-mails: [email protected], [email protected], [email protected], [email protected], [email protected]
Abstract

In wireless caching networks, a user generally has a concrete purpose of consuming contents in a certain preferred category and requests multiple contents in sequence. While most existing research on wireless caching and delivery has focused only on one-shot requests, the popularity distribution of consecutively requested contents is clearly different from that of one-shot requests and has not been considered. Also, especially from the perspective of the service provider, it is advantageous for users to consume as many contents as possible. Thus, this paper proposes two cache allocation policies for categorized contents and consecutive user demands, which maximize 1) the cache hit rate and 2) the expected number of consecutive content consumptions, respectively. Numerical results show how categorized contents and consecutive content requests impact the cache allocation.

I Introduction

In multimedia services, e.g., on-demand streaming services, a relatively small number of popular contents generally occupies a large portion of the massive global data traffic, and most user demands are overlapped and repeated [1]. To deal with this issue, wireless caching technologies have been studied, wherein the base station (BS) or the server pushes popular contents to cache-enabled nodes during off-peak hours so that these nodes can provide contents directly to nearby mobile users during peak hours [2]. In practice, caching nodes (i.e., caching helpers and/or cache-enabled devices) have finite storage sizes, which gives rise to the content placement problem of determining which contents should be stored at which caching nodes [3].

The goal of the content placement problem is to find optimal caching policies according to the popularity distribution of contents and the network topology. In stochastic wireless caching networks, there exist research efforts on probabilistic content placement, introduced in [4, 5]. Many probabilistic caching methods have been proposed for various systems, e.g., device-to-device (D2D) networks [6], $N$-tier hierarchical networks [7], and multi-quality dynamic streaming [8]; probabilistic coded caching was also recently proposed in [9].

Previous research optimized content placement for users requesting only one content, i.e., a one-shot request, under the assumption that all content requests are independent. In multimedia services, however, the user typically accesses a service platform with the purpose of consuming a specific category of content, and generally consumes more than one content. In this case, the sequence of consecutively consumed contents is highly correlated. For example, in video services such as YouTube, a list of related videos is recommended to the user after the first content is consumed [10]. In particular, the view count of a given video varies on almost the same scale as the average view count of the top referrer videos in the related list [11]; therefore, a user is highly likely to request one of the videos in the related category. Accordingly, this paper considers the scenario in which a user consecutively requests multiple contents that are likely to be in its preferred category. In this context, this paper proposes two cache allocation policies for categorized contents and consecutive user demands, which maximize the cache hit rate and the expected number of consecutive content consumptions, respectively.

The previous work in [12] proposed a caching policy for consecutive user demands under the assumption that the number of content requests is fixed. This assumption does not allow the service provider to maximize the user’s content consumption; from the perspective of the service provider, however, it is important to satisfy as many of the user’s requests as possible. In this paper, each user determines whether to continue consuming more contents depending on the cache states in its vicinity, and the service provider aims at making users stay in the service longer.

The main contributions are as follows:

  • Different from most existing results on wireless caching, which consider only one-shot requests, this paper considers consecutive requests of categorized contents. In practice, the sequence of consecutively consumed contents is highly correlated, and an advanced caching scheme is required.

  • Based on a real data set, the recent work in [13] modeled the category-based conditional content popularity distribution. This paper uses this measurement-based popularity model to obtain the proposed cache allocation rules for consecutive requests of categorized contents.

  • This paper proposes two cache allocation schemes, which maximize the cache hit rate and the expected number of consecutive content requests, respectively. An iterative algorithm is presented to find the optimal cache allocation rules, and its convergence is proved.

  • Numerical results show how 1) the concentration of popularity on the preferred category and 2) different numbers of contents in the different categories influence the cache allocation rule.

The rest of the paper is organized as follows. The system model is described in Section II. The cache allocation rules maximizing the cache hit probability and the expected number of consecutive content consumptions are proposed in Sections III and IV, respectively. Numerical results are shown in Section V, and Section VI concludes the paper.

II System Model

II-A Wireless Caching Network

This paper assumes that caching nodes are randomly distributed according to a general spatial distribution $\Phi$, and the server, which has a content library $\mathcal{N}$, pushes some popular contents to each caching node during off-peak hours. Suppose that the library $\mathcal{N}$ consists of $N$ contents and that all contents have a normalized unit size. Let all $N$ contents be grouped into $K$ categories, where $N_{i}$ contents are in category $i$, denoted by $\mathcal{C}_{i}$, for all $i\in\mathcal{K}=\{1,\cdots,K\}$, satisfying $\sum_{i=1}^{K}N_{i}=N$. Also, denote the content index set of $\mathcal{C}_{i}$ by $\mathcal{N}_{i}=\{1,\cdots,N_{i}\}$.

The caching nodes have a finite storage size $M$, which means that only $M$ contents can be cached in each node. Since $N>M$ in practice, caching nodes store only a subset of the contents in $\mathcal{N}$. A user requests contents from caching nodes in its vicinity. If the user finds at least one caching node that stores the desired content, this event is called a cache hit. When the user requests multiple contents, we define the cache hit as the case where all of the requested contents can be found at nearby caching nodes. When no caching node has some of the requested contents, the server can deliver them via a cellular link. However, this paper assumes that the transmission quality of the cellular link is insufficient due to delay and/or congestion, leading to unacceptable video quality, so we henceforth do not consider direct transmission from the server.

Let the storage size $M$ be divided into $K$ fractions of unequal sizes, denoted by $\alpha_{i}$ for all $i\in\mathcal{K}$, and contents in $\mathcal{C}_{i}$ are stored within the fraction $\alpha_{i}$. These fractions will be called cache allocations for categories and satisfy $\sum_{i=1}^{K}\alpha_{i}\leq M$ and $\alpha_{i}\leq N_{i}$. Given $\boldsymbol{\alpha}=(\alpha_{1},\cdots,\alpha_{K})$, how to store individual contents within each category becomes a classical content placement problem, and we consider the probabilistic caching policy for individual contents as shown in [4, 5].

TABLE I: Key notations
$K$: Number of categories
$N_{i}$: Number of contents in category $i$
$k$: Index of the preferred category
$M$: Cache size
$\alpha_{i}$: Cache allocation for category $i$
$f_{i}$: Global popularity of category $i$
$p_{1}$: Popularity of the preferred category
$r$: Rank of the category from which a content is requested
$l$: Number of consecutive content requests
$l_{r}$: Number of requested contents in the $r$-th ranked category
$i_{r}$: Category index of the $r$-th ranked category
$\epsilon$: Probability of not requesting the next content

II-B Content Popularity Model

This paper focuses on the scenario in which the user requests multiple contents consecutively, unlike most existing caching policies, which consider only one-shot requests. A representative example is a video streaming service. For example, a user can access the service platform with the concrete purpose of watching some sports highlight clips. In this case, we can postulate that sports is the user’s preferred category; therefore, the probability of requesting sports videos in sequence is very high. In contrast, the probability of requesting contents in other categories, e.g., movie trailers, is very small, although not zero.

Accordingly, the content request can be modeled by the following steps. First, the user randomly picks one category in $\mathcal{K}$. Each category $i\in\mathcal{K}$ has a global category popularity, which follows the Zipf distribution [4]: $f_{i}=i^{-\gamma}/\sum_{j=1}^{K}j^{-\gamma}$, where $\gamma$ denotes the popularity distribution skewness. The selected category then has the first rank among all categories; note that the global category popularity is used only for choosing the first-ranked category. The other categories can take any rank except the first, but this paper models all categories that are not preferred by this particular user as statistically equivalent; in other words, the relative ranking from $2,\cdots,K$ does not matter. After determining the preferred category, the user chooses one of the categories from which to request a content depending on their ranks. The category rank distribution given the preferred category is again assumed to follow the Zipf distribution, i.e., $\mathrm{Pr}\{R=r\}=r^{-\gamma^{\text{out}}}/\sum_{j=1}^{K}j^{-\gamma^{\text{out}}}$, which represents the popularity of the $r$-th ranked category, where $\gamma^{\text{out}}$ is the Zipf factor. We denote the popularity of the preferred category by $p_{1}=\mathrm{Pr}\{R=1\}$. Note that $p_{1}$ is the probability of staying within the given preferred category, not the general probability of picking the first-ranked category as in [13], which is a different quantity. We also consider the situation in which the user can stop requesting contents by itself with a small probability $\epsilon$. Therefore, the probability of requesting any content in the $r$-th ranked category after consuming the first content becomes $(1-\epsilon)\cdot\mathrm{Pr}\{R=r\}$.
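As a sanity check on this request model, the following is a minimal Python sketch that samples one session of consecutive requests. It is illustrative only: it assumes that the first request targets the preferred category and that every request, including the first, occurs with probability $1-\epsilon$, which reproduces $\mathrm{Pr}\{L=l\}=\epsilon(1-\epsilon)^{l}$ of Eq. (7); the mapping of ranks $2,\cdots,K$ to the non-preferred categories is drawn uniformly at random, since those categories are statistically equivalent.

```python
import random

def zipf_pmf(n, gamma):
    """Zipf probabilities over ranks 1..n."""
    w = [r ** -gamma for r in range(1, n + 1)]
    s = sum(w)
    return [x / s for x in w]

def simulate_session(K=5, gamma=1.0, gamma_out=5.0, eps=0.1, rng=random):
    """Sample the categories requested in one session under the model above."""
    f = zipf_pmf(K, gamma)                            # global category popularity f_i
    preferred = rng.choices(range(K), weights=f)[0]   # preferred (first-ranked) category
    others = [i for i in range(K) if i != preferred]
    rng.shuffle(others)                               # ranks 2..K: statistically equivalent
    rank_to_cat = [preferred] + others
    rank_pmf = zipf_pmf(K, gamma_out)                 # Pr{R = r}

    session = []
    while rng.random() < 1 - eps:                     # each request happens w.p. 1 - eps
        if not session:
            session.append(preferred)                 # assumption: first content is preferred
        else:
            r = rng.choices(range(K), weights=rank_pmf)[0]
            session.append(rank_to_cat[r])
    return session

if __name__ == "__main__":
    random.seed(0)
    lengths = [len(simulate_session()) for _ in range(20000)]
    print("empirical E[L]:", sum(lengths) / len(lengths))  # ~ (1 - eps)/eps = 9 for eps = 0.1
```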

After choosing the category rank, the user requests the specific content in the category having the chosen rank. According to [13], the category-based conditional popularity distribution of contents in $\mathcal{C}_{i}$ follows the Mandelbrot-Zipf (M-Zipf) distribution, i.e.,

a_{i,n}=\frac{\frac{1}{(n+c^{\text{in}})^{\gamma_{i}^{\text{in}}}}}{\sum_{m=1}^{N_{i}}\frac{1}{(m+c^{\text{in}})^{\gamma_{i}^{\text{in}}}}}, (1)

which represents the popularity of the $n$-th content in $\mathcal{C}_{i}$ for $n\in\mathcal{N}_{i}$, where $\gamma_{i}^{\text{in}}$ and $c^{\text{in}}$ are the Zipf factor and the plateau factor of $\mathcal{C}_{i}$, respectively. Here, if $\gamma^{\text{out}}$ is sufficiently large, $p_{1}\gg 1-\epsilon-p_{1}$ and the popularity of contents in the preferred category $\mathcal{C}_{k}$ is much larger than that of any content in $\mathcal{C}_{i}$ for all $i\neq k$. Fig. 1 shows the popularity distribution of 100 contents given the preferred category, grouped into 5 categories of 20 contents each. This figure is obtained by multiplying the rank probability and the category-based individual content popularity, when $\gamma^{\text{out}}=5$, $\gamma_{i}^{\text{in}}=2.4$, and $c^{\text{in}}=69$. Among them, the contents whose indices are from 1 to 20 belong to the first-ranked category, and their popularity is much larger than that of the others. Therefore, if $\gamma^{\text{out}}$ is sufficiently large, we approximate the popularity of contents outside the preferred category $\mathcal{C}_{k}$ by the uniform distribution, i.e., $a^{k}_{i,n}\approx\frac{1}{N-N_{k}}$ for all $i\in\mathcal{K}\setminus\{k\}$ and $n\in\mathcal{N}_{i}$, irrespective of the ranks of those categories. Given $\mathcal{C}_{k}$, the popularity of contents in $\mathcal{C}_{k}$ remains $a^{k}_{k,n}=a_{k,n}$ for $n\in\mathcal{N}_{k}$. Thus, considering the two exclusive sets of the preferred category and all other contents is reasonable.
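For concreteness, the short Python sketch below evaluates Eq. (1) and the combined popularity (rank probability times within-category popularity) with the parameter values quoted for Fig. 1. It is not part of the paper's method; it only reproduces the popularity gap visible in Fig. 1, where every content of the preferred category is far more popular than any content outside it, which is what motivates treating the outside contents as roughly uniform.

```python
def zipf_pmf(n, gamma):
    w = [r ** -gamma for r in range(1, n + 1)]
    return [x / sum(w) for x in w]

def m_zipf(N_i, gamma_in, c_in):
    """Mandelbrot-Zipf popularity a_{i,n} of Eq. (1) over contents n = 1..N_i."""
    w = [(n + c_in) ** -gamma_in for n in range(1, N_i + 1)]
    return [x / sum(w) for x in w]

# parameter values quoted for Fig. 1: 5 categories of 20 contents each
K, N_i = 5, 20
gamma_out, gamma_in, c_in = 5.0, 2.4, 69.0

rank_pmf = zipf_pmf(K, gamma_out)          # Pr{R = r}, so p_1 = rank_pmf[0]
within = m_zipf(N_i, gamma_in, c_in)       # a_{i,n}, identical for every category here

least_pref = rank_pmf[0] * within[-1]      # least popular content inside the preferred category
top_out = rank_pmf[1] * within[0]          # most popular content outside it
print("p_1 =", round(rank_pmf[0], 4))
print("least popular preferred content :", round(least_pref, 4))
print("most popular non-preferred one  :", round(top_out, 5))
print("uniform level 1/(N - N_k) of Eq. (5):", round(1.0 / (K * N_i - N_i), 4))
```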

III Maximization of Cache Hit Probability

This section derives the cache hit probability and proposes a cache allocation rule that maximizes the cache hit probability.

III-A Cache hit probability

Suppose that the user requests $l$ contents in sequence. Among these $l$ contents, let $l_{r}$ contents belong to the $r$-th ranked category, satisfying $\sum_{r=1}^{K}l_{r}=l$. Then, when the preferred category $\mathcal{C}_{k}$ is given, the cache hit probability for $l$ content requests, i.e., the probability that all of the $l$ requested contents can be delivered from nearby caching nodes, can be expressed as

p_{\text{hit}}^{l,k}=\prod_{r=1}^{K}\Big[\mathrm{Pr}\{R=r\}h^{k}_{i_{r}}(\alpha_{i_{r}})\Big]^{l_{r}}, (2)

where $i_{r}\in\mathcal{K}$ is the index of the $r$-th ranked category and

h^{k}_{i}(\alpha_{i})=1-\sum_{n=1}^{N_{i}}a^{k}_{i,n}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}(1-b_{i,n}(\alpha_{i}))^{j}, (3)

is the cache hit probability of a content request within $\mathcal{C}_{i}$ given $\alpha_{i}$ when $\mathcal{C}_{k}$ is the preferred category. In Eq. (3), $\mathrm{Pr}_{\Phi}\{J=j\}$ is the probability that there are $j$ caching nodes storing the requested content in the vicinity of the user, and $b_{i,n}(\alpha_{i})$ is the caching probability of the $n$-th content in $\mathcal{C}_{i}$ given $\alpha_{i}$. However, the computation required for scanning all combinations of $l_{r}$ values grows exponentially as $l$ grows.
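As an illustration of how Eq. (3) can be evaluated, the sketch below assumes that the number of nearby caching nodes holding a content is Poisson distributed (as it is for the PPP deployment used in Section V, with mean $\lambda\pi d^{2}$), so that $\sum_{j}\mathrm{Pr}_{\Phi}\{J=j\}(1-b)^{j}=\mathbb{E}[(1-b)^{J}]=e^{-\lambda\pi d^{2}b}$. The placement $b_{i,n}$ used here is a simple uniform placeholder, not the optimal policy of [4, 5], and the numerical values are illustrative.

```python
import math

def m_zipf(N_i, gamma_in=2.4, c_in=69.0):
    w = [(n + c_in) ** -gamma_in for n in range(1, N_i + 1)]
    return [x / sum(w) for x in w]

def hit_prob_in_category(pop, b, mean_nodes):
    """Eq. (3) with Poisson J: h = 1 - sum_n a_n * E[(1-b_n)^J] = 1 - sum_n a_n * exp(-mean*b_n)."""
    return 1.0 - sum(a_n * math.exp(-mean_nodes * b_n) for a_n, b_n in zip(pop, b))

N_i, alpha_i = 20, 4                     # category size and its cache allocation (illustrative)
lam, d = 0.02, 10.0
mean_nodes = lam * math.pi * d ** 2      # ~6.3 caching nodes within distance d on average

pop = m_zipf(N_i)                        # a_{i,n} from Eq. (1)
b = [alpha_i / N_i] * N_i                # uniform placeholder placement, sum(b) = alpha_i
print("h_i(alpha_i) =", round(hit_prob_in_category(pop, b, mean_nodes), 4))
```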

When $\gamma^{\text{out}}$ is large, $p_{\text{hit}}^{l,k}$ is simplified by using the approximation $a^{k}_{i,n}\approx\frac{1}{N-N_{k}}$, $\forall i\in\mathcal{K}\setminus\{k\}$ and $n\in\mathcal{N}_{i}$, into

p_{\text{hit}}^{l,k}\approx\sum_{m=0}^{l}\binom{l}{m}[p_{1}h^{k}_{k}(\alpha_{k})]^{m}\cdot[(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})]^{l-m}, (4)

where

q_{k}(\boldsymbol{\alpha})=1-\sum_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\sum_{n=1}^{N_{i}}\frac{1}{N-N_{k}}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}(1-b_{i,n}(\alpha_{i}))^{j}, (5)

which is the cache hit probability of a content request outside $\mathcal{C}_{k}$. Each term in Eq. (4) is the probability that, among the $l$ requested contents, $m$ are in $\mathcal{C}_{k}$ and $l-m$ are outside $\mathcal{C}_{k}$, and all $l$ contents can be found at caching nodes in the vicinity of the user. For simplicity, we will use the notation $h_{k}(\alpha_{k})=h_{k}^{k}(\alpha_{k})$ in the following sections.

The expected cache hit probability can be finally derived as

p_{\text{hit}}=\sum_{k=1}^{K}f_{k}\sum_{l=1}^{\infty}\mathrm{Pr}\{L=l\}\cdot\sum_{m=0}^{l}\binom{l}{m}[p_{1}h_{k}(\alpha_{k})]^{m}\cdot[(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})]^{l-m}, (6)

where $\mathrm{Pr}\{L=l\}$ is the probability that the user requests $l$ contents in sequence, which is given by

\mathrm{Pr}\{L=l\}=\epsilon(1-\epsilon)^{l}. (7)

Therefore, the cache hit probability can be rearranged as

p_{\text{hit}}=\sum_{k=1}^{K}f_{k}\sum_{l=1}^{\infty}\epsilon(1-\epsilon)^{l}\big(p_{1}h_{k}(\alpha_{k})+(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})\big)^{l} (8)
=\sum_{k=1}^{K}\frac{f_{k}(1-\epsilon)\epsilon\big(p_{1}h_{k}(\alpha_{k})+(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})\big)}{1-(1-\epsilon)\big(p_{1}h_{k}(\alpha_{k})+(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})\big)} (9)
Figure 1: Popularity of contents given the preferred category
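The step from Eq. (8) to Eq. (9) is just the geometric series $\sum_{l\geq 1}\epsilon(1-\epsilon)^{l}x^{l}=\epsilon(1-\epsilon)x/(1-(1-\epsilon)x)$ with $x=p_{1}h_{k}(\alpha_{k})+(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})$. The short Python check below verifies one per-category term numerically; the values of $f_{k}$, $p_{1}$, $h_{k}$, and $q_{k}$ are arbitrary illustrative numbers, not results from the paper.

```python
def term_closed_form(f, eps, p1, h, q):
    """Per-category term of Eq. (9)."""
    x = p1 * h + (1 - eps - p1) * q
    return f * (1 - eps) * eps * x / (1 - (1 - eps) * x)

def term_series(f, eps, p1, h, q, l_max=5000):
    """Per-category term of Eq. (8), truncated at l_max."""
    x = p1 * h + (1 - eps - p1) * q
    return f * sum(eps * (1 - eps) ** l * x ** l for l in range(1, l_max + 1))

f, eps, p1, h, q = 0.4, 0.1, 0.85, 0.8, 0.3      # illustrative values only
print(term_closed_form(f, eps, p1, h, q))        # the two printed values agree
print(term_series(f, eps, p1, h, q))
```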

In Eq. (9), any caching policy can be utilized within each category given the preferred category $\mathcal{C}_{k}$ and $\boldsymbol{\alpha}$, and $h_{k}(\alpha_{k})$ and $q_{k}(\boldsymbol{\alpha})$ are determined by the chosen caching policy. We then suppose that the caching policy that maximizes the cache hit rate [4, 5] is used for caching individual contents within every category. Denote the maximum cache hit rates inside and outside $\mathcal{C}_{k}$ by $h_{k}^{*}(\alpha_{k})$ and $q_{k}^{*}(\boldsymbol{\alpha})$, respectively. Then, the expected cache hit probability becomes

p^{*}_{\text{hit}}=\sum_{k=1}^{K}\frac{f_{k}(1-\epsilon)\epsilon\big(p_{1}h^{*}_{k}(\alpha_{k})+(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha})\big)}{1-(1-\epsilon)\big(p_{1}h^{*}_{k}(\alpha_{k})+(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha})\big)} (10)
=\sum_{k=1}^{K}g^{*}_{k}(\boldsymbol{\alpha},f_{k},\epsilon). (11)

III-B Problem Formulation

The optimal cache allocations $\boldsymbol{\alpha}^{\star}=(\alpha_{1}^{\star},\cdots,\alpha_{K}^{\star})$ can be obtained by maximizing (11) as follows:

\boldsymbol{\alpha}^{\star}=\underset{\boldsymbol{\alpha}}{\arg\max}~p^{*}_{\text{hit}} (12)
\text{s.t.}~\sum_{i=1}^{K}\alpha_{i}\leq M (13)
0\leq\alpha_{i}\leq\min\{M,N_{i}\},~\forall i\in\mathcal{K}. (14)

The constraint (13) is for the storage size of each caching node and the constraint (14) is for the cache allocation for each category. The following key lemmas are used to solve the above problem of (12)–(14).

Lemma 1.

$p_{1}h_{k}(\alpha_{k})+(1-\epsilon-p_{1})q_{k}(\boldsymbol{\alpha})$ is increasing with $\alpha_{k}$.

Proof.

In this proof, we simply write $b_{i,n}=b_{i,n}(\alpha_{i})$, $\forall i\in\mathcal{K}$, $\forall n\in\mathcal{N}_{i}$. Let $\alpha^{\prime}_{k}=\alpha_{k}+\delta$ with $\delta>0$. Then, $h^{*}_{k}(\alpha^{\prime}_{k})\geq\tilde{h}_{k}(\alpha^{\prime}_{k})$, where $\tilde{h}_{k}(\alpha^{\prime}_{k})$ can be the cache hit probability within $\mathcal{C}_{k}$ of any caching policy $\mathbf{b}^{\prime}_{k}=(b^{\prime}_{k,1},\cdots,b^{\prime}_{k,N_{k}})^{T}$ satisfying $\sum_{n=1}^{N_{k}}b^{\prime}_{k,n}=\alpha^{\prime}_{k}$. Let $\mathbf{b}^{\prime}_{k}=(b^{\prime}_{k,1}=b^{*}_{k,1}+\delta,\;b^{\prime}_{k,2}=b^{*}_{k,2},\cdots,b^{\prime}_{k,N_{k}}=b^{*}_{k,N_{k}})^{T}$. Then, since $0\leq b_{k,1}^{*}\leq 1$ and $b_{k,1}^{*}$ is generally much closer to zero than to one when the library size $N$ is large,

p_{1}h^{*}_{k}(\alpha^{\prime}_{k})-p_{1}h_{k}^{*}(\alpha^{*}_{k})\geq p_{1}\tilde{h}_{k}(\alpha^{\prime}_{k})-p_{1}h_{k}^{*}(\alpha^{*}_{k})
=p_{1}\sum_{n=1}^{N_{k}}a_{k,n}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}\big\{(1-b^{*}_{k,n})^{j}-(1-b^{\prime}_{k,n})^{j}\big\} (15)
\approx p_{1}\cdot a_{k,1}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}\cdot j\cdot\delta (16)

is obtained by using the first-order Taylor approximation, i.e., $(1-b^{*}_{k,n})^{j}\approx 1-j\cdot b^{*}_{k,n}$.

Since the storage size $M$ is fixed, the cache size allocated to all categories other than $\mathcal{C}_{k}$ is $M-\alpha_{k}-\delta$. For small $\delta$, there exists a category $u$, $u\neq k$, such that $\alpha_{u}\geq\delta$. Then, let $\alpha^{\prime}_{u}=\alpha^{*}_{u}-\delta$ and $b^{\prime}_{u,1}=b^{*}_{u,1}-\eta_{u,1}$, $b^{\prime}_{u,2}=b^{*}_{u,2}-\eta_{u,2}$, $\cdots$, $b^{\prime}_{u,N_{u}}=b^{*}_{u,N_{u}}-\eta_{u,N_{u}}$, where $0\leq\eta_{u,n}\leq b^{*}_{u,n}$ for all $n\in\mathcal{N}_{u}$ and $\sum_{n=1}^{N_{u}}\eta_{u,n}=\delta$. In this case, the cache allocations for the other categories can remain unchanged, i.e., $\alpha^{\prime}_{i}=\alpha^{*}_{i}$ and $b^{\prime}_{i,n}=b^{*}_{i,n}$ for all $i\in\mathcal{K}$, $i\neq k,u$. Then, similar to before,

(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha}^{\prime})-(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha}^{*})
\geq\frac{1-\epsilon-p_{1}}{N-N_{k}}\bigg[\sum_{\begin{subarray}{c}i=1\\ i\neq k\end{subarray}}^{K}\sum_{n=1}^{N_{i}}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}\Big\{(1-b_{i,n}^{*})^{j}-(1-b^{\prime}_{i,n})^{j}\Big\}\bigg] (17)
\approx-\frac{1-\epsilon-p_{1}}{N-N_{k}}\sum_{j=0}^{\infty}\mathrm{Pr}_{\Phi}\{J=j\}\cdot j\cdot\delta. (18)

Since $p_{1}>1-\epsilon-p_{1}$ and $a_{k,1}>\frac{1}{N-N_{k}}$, we have $p_{1}h^{*}_{k}(\alpha^{\prime}_{k})+(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha}^{\prime})-p_{1}h^{*}_{k}(\alpha_{k}^{*})-(1-\epsilon-p_{1})q^{*}_{k}(\boldsymbol{\alpha}^{*})>0$, and the lemma is proved. ∎

Algorithm 1 Greedy cache allocation algorithm
1: $\alpha^{*}_{i}=\frac{M}{K}$ for all $i\in\mathcal{K}$ and $p_{\text{hit}}^{\star}=0$
2: for $\forall(u,v)\in\mathcal{K}\times\mathcal{K}$ and $u\neq v$ do
3:     $\beta_{u,v}=M-\sum_{i\neq u,v}\alpha^{*}_{i}$
4:     Obtain $b_{i,n}$ $\forall i\in\mathcal{K}$, $i\neq u,v$, and $\forall n\in\mathcal{N}_{i}$ according to [4, 5].
5:     for $\forall\alpha_{u}\in\{\max(0,\beta_{u,v}-N_{v}),\cdots,\min(\beta_{u,v},N_{u})\}$ do
6:         $\alpha_{v}\leftarrow\beta_{u,v}-\alpha_{u}$
7:         Obtain $b_{u,n}$ and $b_{v,m}$ $\forall n\in\mathcal{N}_{u}$ and $\forall m\in\mathcal{N}_{v}$ according to [4, 5].
8:         Find $p_{\text{hit}}^{*}$ based on $\boldsymbol{\alpha}$.
9:         if $p_{\text{hit}}^{\star}<p_{\text{hit}}^{*}$ then $p_{\text{hit}}^{\star}=p_{\text{hit}}^{*}$, $\alpha^{*}_{u}\leftarrow\alpha_{u}$, $\alpha^{*}_{v}\leftarrow\alpha_{v}$
10:         end if
11:     end for
12: end for
Lemma 2.

The optimum vector $\boldsymbol{\alpha}^{\star}=(\alpha_{1}^{\star},\cdots,\alpha_{K}^{\star})^{T}$ satisfies $\sum_{i=1}^{K}\alpha_{i}^{\star}=M$.

Proof.

Assume that $\sum_{i=1}^{K}\alpha^{\star}_{i}<M$; then there exists $\delta>0$ such that $\sum_{i=1}^{K}\alpha^{\star}_{i}+\delta\leq M$ and $\alpha^{\star}_{k}+\delta<\min\{M,N_{k}\}$ for some $k$. Let $\boldsymbol{\alpha}^{\prime}=(\alpha^{\star}_{1},\cdots,\alpha_{k}^{\star}+\delta,\cdots,\alpha_{K}^{\star})^{T}$. According to Lemma 1, $g_{k}^{*}(\boldsymbol{\alpha},f_{k},\epsilon)$ is increasing with $\alpha_{k}$ for all $k\in\mathcal{K}$. Thus, $p_{\text{hit}}^{*}(\boldsymbol{\alpha}^{\prime})>p_{\text{hit}}^{*}(\boldsymbol{\alpha}^{\star})$, which is a contradiction. ∎

According to Lemma 2, the inequality constraint (13) can be converted into an equality constraint. The problem of (12)–(14) has $K$ optimization parameters, and the subproblem for finding the optimal $\alpha_{u}^{\star}$ and $\alpha_{v}^{\star}$ is formulated as follows:

\{\alpha^{\star}_{u},\alpha^{\star}_{v}\}=\underset{\alpha_{u},\alpha_{v}}{\arg\max}~\mathcal{M}_{(u,v)} (19)
\text{s.t.}~\alpha_{u}+\alpha_{v}=\beta_{u,v}=M-\sum_{i=1,i\neq u,v}^{K}\alpha_{i} (20)
0\leq\alpha_{i}\leq\min\{M,N_{i}\},~\forall i\in\mathcal{K}, (21)

where $\mathcal{M}_{(u,v)}=g^{*}_{u}(\boldsymbol{\alpha},f_{u},\epsilon)+g^{*}_{v}(\boldsymbol{\alpha},f_{v},\epsilon)$. Since $\{\alpha_{i}\}_{i\neq u,v}$ are fixed, $\alpha_{u}+\alpha_{v}$ also becomes a constant $\beta_{u,v}$.

A multivariable function $p_{\text{hit}}^{*}$ can be optimized by iteratively optimizing a subset of the variables if convergence is guaranteed. To find $\boldsymbol{\alpha}^{\star}=(\alpha_{1}^{\star},\cdots,\alpha_{K}^{\star})$, the subproblem of (19)–(21) is applied iteratively for all combinations of $u$ and $v$, with $u,v\in\mathcal{K}$ and $u\neq v$. The maximum of the two-variable problem of (19)–(21) is found in each iteration, and a sequence of updated values of $\mathcal{M}_{(u,v)}$ is generated. Since this sequence is non-decreasing and the cache hit probability has the trivial upper bound of 1, i.e., $p_{\text{hit}}^{\star}\leq 1$, the convergence of the iterative algorithm is guaranteed.

However, since $h_{k}(\alpha_{k})$ and $q_{k}(\boldsymbol{\alpha})$ are obtained by using the bisection method [4, 5], the objective function $\mathcal{M}_{(u,v)}$ is not in closed form and the problem of (19)–(21) must be handled numerically. Therefore, we consider integer values for the cache allocations $\boldsymbol{\alpha}$, and the greedy algorithm can solve the problem when $M$ is not very large. If caching of content partitions is not considered, i.e., only caching of whole contents is allowed, the assumption that $\alpha_{i}$ is an integer for all $i\in\mathcal{K}$ is reasonable. The details of the iterative algorithm that solves the problem of (12)–(14) are described in Algorithm 1.
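To make the pairwise greedy search concrete, the following is a minimal Python sketch of the loop structure of Algorithm 1. It is an illustration, not the paper's exact implementation: the number of nearby caching nodes is assumed Poisson with mean $\lambda\pi d^{2}$ (consistent with the PPP setup of Section V), the within-category placement is a simple uniform $b_{i,n}=\alpha_{i}/N_{i}$ rather than the cache-hit-optimal placement of [4, 5], and all parameter values (e.g., `gamma_out = 2.0`) are illustrative.

```python
import math
from itertools import permutations

def greedy_allocation(objective, N, M):
    """Pairwise greedy search over integer cache allocations (Algorithm 1 sketch).

    objective(alpha) -> float : quantity to maximize, e.g., p_hit of Eq. (11)
    N : list of category sizes N_i,  M : cache size per node
    """
    K = len(N)
    alpha = [min(M // K, N[i]) for i in range(K)]        # (roughly) equal initial split
    spare = M - sum(alpha)
    for i in range(K):                                   # hand out any leftover cache
        extra = min(spare, N[i] - alpha[i])
        alpha[i] += extra
        spare -= extra
    best = objective(alpha)
    improved = True
    while improved:                                      # sweep category pairs until no gain
        improved = False
        for u, v in permutations(range(K), 2):
            beta = alpha[u] + alpha[v]                   # budget shared by the pair (Eq. (20))
            for a_u in range(max(0, beta - N[v]), min(beta, N[u]) + 1):
                cand = list(alpha)
                cand[u], cand[v] = a_u, beta - a_u
                val = objective(cand)
                if val > best:
                    best, alpha, improved = val, cand, True
    return alpha, best

# --- toy objective: the approximate p_hit of Eq. (9) under the simplifications above ---
K, M = 5, 30
N = [20] * K
eps, gamma, gamma_out = 0.1, 1.0, 2.0                    # illustrative values
mean_nodes = 0.02 * math.pi * 10.0 ** 2                  # E[#nearby nodes] for lambda=0.02, d=10

w = [r ** -gamma for r in range(1, K + 1)]
f = [x / sum(w) for x in w]                              # global category popularity f_i
rk = [r ** -gamma_out for r in range(1, K + 1)]
p1 = rk[0] / sum(rk)                                     # popularity of the preferred category

def hit(alpha_i, N_i):
    """Hit prob. of one in-category request with uniform placement b = alpha_i / N_i."""
    return 1.0 - math.exp(-mean_nodes * alpha_i / N_i)

def p_hit(alpha):
    total = 0.0
    for k in range(K):
        h = hit(alpha[k], N[k])
        q = sum(N[i] * hit(alpha[i], N[i]) for i in range(K) if i != k) / (sum(N) - N[k])
        x = p1 * h + (1 - eps - p1) * q
        total += f[k] * (1 - eps) * eps * x / (1 - (1 - eps) * x)
    return total

alpha_star, best = greedy_allocation(p_hit, N, M)
print("cache allocation:", alpha_star, " p_hit:", round(best, 4))
```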

IV Maximization of Expected Number of Consecutive Content Requests

From the service provider’s perspective, it is advantageous for the user to consume as many contents as possible. As explained in Section II, the user does not request the next content with probability $\epsilon$. In addition, we assume that the user stops consuming contents when no caching node in its vicinity stores the desired content, even though the user requests the next one.

The probability of stopping content consumption is given by

p^{k}_{\text{stop}}=\epsilon+p_{1}(1-h_{k}(\alpha_{k}))+(1-\epsilon-p_{1})(1-q_{k}(\boldsymbol{\alpha})). (22)

In (22), the first term is the probability of not requesting the next content; the second and third terms are the probabilities that no caching node stores the requested content when the content belongs to $\mathcal{C}_{k}$ and when it does not, respectively. Then, the expected number of consecutive content consumptions is computed as

\mathbb{E}[L]=\sum_{l=1}^{\infty}l\cdot\mathrm{Pr}\{L=l\}=\sum_{k=1}^{K}f_{k}\sum_{l=1}^{\infty}l\cdot(1-p^{k}_{\text{stop}})^{l}p^{k}_{\text{stop}} (23)
=\sum_{k=1}^{K}f_{k}\frac{1-p^{k}_{\text{stop}}}{p^{k}_{\text{stop}}}. (24)
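Since $L$ given the preferred category is geometric with parameter $p^{k}_{\text{stop}}$, Eq. (24) follows from $\sum_{l\geq 1}l(1-p)^{l}p=(1-p)/p$. A minimal numerical confirmation, with illustrative values for $f_{k}$ and $p^{k}_{\text{stop}}$ (not taken from the paper):

```python
def expected_requests(f, p_stop):
    """E[L] of Eq. (24): sum_k f_k * (1 - p_stop_k) / p_stop_k."""
    return sum(fk * (1 - pk) / pk for fk, pk in zip(f, p_stop))

f = [0.5, 0.3, 0.2]                      # illustrative global category popularities
p_stop = [0.15, 0.20, 0.25]              # illustrative stopping probabilities from Eq. (22)
series = sum(fk * sum(l * (1 - pk) ** l * pk for l in range(1, 5000))
             for fk, pk in zip(f, p_stop))
print(expected_requests(f, p_stop), series)   # the two values coincide
```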

Then, the optimization problem of maximizing the expected number of consecutive content consumptions is as follows:

\boldsymbol{\alpha}^{\star}=\underset{\boldsymbol{\alpha}}{\arg\max}~\mathbb{E}[L] (25)
\text{s.t.}~\sum_{i=1}^{K}\alpha_{i}\leq M (26)
0\leq\alpha_{i}\leq\min\{M,N_{i}\},~\forall i\in\mathcal{K}. (27)

Similar to Lemmas 1 and 2, $p^{k}_{\text{stop}}$ can be proved to be decreasing with $\alpha_{k}$, and the inequality constraint (26) can be converted into the equality constraint $\sum_{i=1}^{K}\alpha_{i}=M$.

Again, the multivariable function $\mathbb{E}[L]$ can be maximized by iteratively optimizing the following two-variable subproblem:

\{\alpha^{\star}_{u},\alpha^{\star}_{v}\}=\underset{\alpha_{u},\alpha_{v}}{\arg\max}~\frac{f_{u}}{p^{u}_{\text{stop}}}+\frac{f_{v}}{p^{v}_{\text{stop}}} (28)
\text{s.t.}~\alpha_{u}+\alpha_{v}=\beta_{u,v}=M-\sum_{i=1,i\neq u,v}^{K}\alpha_{i} (29)
0\leq\alpha_{i}\leq\min\{M,N_{i}\},~\forall i\in\mathcal{K}. (30)

The sequence of updated objective values in (28) is non-decreasing, and $\mathbb{E}[L]\leq\frac{1}{\epsilon}-1$ because $p^{k}_{\text{stop}}\geq\epsilon$. Thus, the algorithm that solves the problem of (25)–(27) by iteratively optimizing the two-variable problem of (28)–(30) for all combinations of $u,v\in\mathcal{K}$ with $u\neq v$ is guaranteed to converge. The whole algorithm is the same as Algorithm 1, except that $p_{\text{hit}}^{*}$ is replaced by $\mathbb{E}[L]$ in lines 1, 8, and 9.
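Under the same illustrative assumptions as the greedy sketch given after Algorithm 1 (and reusing its `greedy_allocation`, `hit`, `K`, `N`, `M`, `eps`, `p1`, and `f` definitions), swapping the objective is a one-line change; the continuation below maximizes $\mathbb{E}[L]$ of Eq. (24) instead of $p^{*}_{\text{hit}}$.

```python
def expected_L(alpha):
    """E[L] of Eq. (24): sum_k f_k (1 - p_stop^k) / p_stop^k, with p_stop^k from Eq. (22)."""
    total = 0.0
    for k in range(K):
        h = hit(alpha[k], N[k])
        q = sum(N[i] * hit(alpha[i], N[i]) for i in range(K) if i != k) / (sum(N) - N[k])
        p_stop = eps + p1 * (1 - h) + (1 - eps - p1) * (1 - q)
        total += f[k] * (1 - p_stop) / p_stop
    return total

alpha_star, best = greedy_allocation(expected_L, N, M)   # same search, different objective
print("cache allocation:", alpha_star, " E[L]:", round(best, 3))
```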

V Numerical Results

In the subsequent simulations, $N=100$ contents and $K=5$ categories are considered. The global category popularity satisfies $f_{i}>f_{j}$ for $i<j$. Caching nodes are distributed according to a Poisson point process with intensity $\lambda$, and only caching nodes within distance $d=10$ from the user are considered. In addition, $M=30$, $\gamma=1$, $\gamma_{i}^{\text{in}}=2.4$, and $c^{\text{in}}=69$ are used for all $i\in\mathcal{K}$. We consider three different category structures as follows:

  • Case A: $N_{1}=N_{2}=N_{3}=N_{4}=N_{5}=20$

  • Case B: $N_{1}=35$, $N_{2}=25$, $N_{3}=20$, $N_{4}=15$, $N_{5}=5$

  • Case C: $N_{1}=5$, $N_{2}=15$, $N_{3}=20$, $N_{4}=25$, $N_{5}=35$.

Figure 2: Cache allocations depending on the skewness factor
Figure 3: Cache allocations depending on the number of contents in each category

In Figs. 2 and 3, plots of $\alpha_{k}$ for every category are shown with $\lambda=0.02$ and $\epsilon=0.1$. Case A is considered in Fig. 2. As the skewness factor $\gamma^{\text{out}}$ grows, the probability of requesting a content in the preferred category becomes much larger than that of requesting a content in other categories. Therefore, as $\gamma^{\text{out}}$ increases, more cache size is allocated to categories having relatively large global category popularities in Fig. 2.

In Fig. 3, all plots are obtained with $\gamma^{\text{out}}=5$. Since all categories in Case A have the same number of contents, the cache allocations of Case A depend only on the global category popularity. In Case B, the category having a larger global popularity consists of more contents, and therefore more cache size is allocated to it, i.e., $\alpha_{1}$ in Case B becomes larger than in Case A. Interestingly, $\alpha_{5}$ in Case B is also larger than in Case A. The reason is that $N_{5}$ is the smallest in Case B, i.e., the individual content popularity within $\mathcal{C}_{5}$ is the largest among all categories. Thus, even though $f_{5}$ is smaller than the other $f_{i}$ values, caching multiple contents of $\mathcal{C}_{5}$ is favorable for consecutive content requests. On the other hand, in Case C, $\alpha_{1}$ is smaller than $\alpha_{2}$, $\alpha_{3}$, and $\alpha_{4}$ because $N_{1}=5$ and $\alpha_{1}$ cannot exceed $N_{1}$. This does not mean that the importance of caching contents in $\mathcal{C}_{1}$ decreases. Rather, it becomes more important, because the portion of its contents stored at caching nodes, i.e., $\frac{\alpha_{1}}{N_{1}}$, is larger than in the other cases. By saving cache size on $\mathcal{C}_{1}$, a larger cache size can be allocated to other categories with low global popularities compared to Case A. Thus, Figs. 2 and 3 show that the skewness factor as well as the number of contents in each category have a strong impact on the proposed cache allocation rule.

Fig. 4 shows cache hit probabilities obtained from the problem of (12)–(14) versus $\lambda$. In Fig. 5, the expected numbers of consecutive content consumptions obtained from the problem of (25)–(27) are shown. We compare the proposed scheme with the conventional caching method optimized for one-shot content requests based on the popularity of individual contents in [4, 5]; this comparison scheme is denoted by ‘L1’ in the figures. Both figures show that the proposed scheme outperforms ‘L1’ for different values of $\gamma^{\text{out}}$ and different $N_{i}$ for each category. As $\lambda$ grows, i.e., as the number of caching nodes in the vicinity of the user grows, the performance improvement of the proposed scheme decreases, because the user becomes more likely to find caching nodes that can deliver the multiple requested contents even with ‘L1’. The performance gain of the proposed scheme over ‘L1’ is guaranteed when $\gamma^{\text{out}}$ is large. Especially in Fig. 5, when $\epsilon=0.1$, $\epsilon$ dominates the terms in (22) representing the probability of stopping content consumption; therefore, the advantage of the proposed scheme is not remarkable. As $\epsilon$ becomes smaller, however, the proposed algorithm is more advantageous for consecutive content consumption than ‘L1’. Thus, the service provider can create the opportunity for users to consume more contents and stay in the service longer by using the proposed scheme.

Figure 4: The expected cache hit probabilities
Figure 5: The expected number of consecutive content consumption

VI Concluding Remarks

This paper proposes two optimal cache allocation rules for the case in which users request a random number of contents consecutively. The key characteristic that users are likely to consecutively consume contents that are highly related to each other is captured in the proposed scheme by maximizing the cache hit probability for multiple content requests from the same category. Another cache allocation rule, which maximizes the expected number of consecutive content consumptions, is also proposed, as it relates to the benefits for the service provider. The impacts of categorized contents and consecutive content requests on the cache allocation rule are shown by numerical results.

Acknowledgment

This work was supported by NSF under projects NSF CCF-1423140 and NSF CNS-1816699, and Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No.2018-0-00170, Virtual Presence in Moving Objects through 5G).

References

  • [1] X. Cheng, J. Liu, and C. Dale, “Understanding the Characteristics of Internet Short Video Sharing: A YouTube-based Measurement Study,” IEEE Trans. on Multimedia, vol. 15, no. 5, pp. 1184–1194, August 2013.
  • [2] N. Golrezaei, K. Shanmugam, A. G. Dimakis, A. F. Molisch, and G. Caire, “FemtoCaching: Wireless Video Content Delivery through Distributed Caching Helpers,” in Proc. IEEE INFOCOM, Orlando, FL, USA, 2012.
  • [3] K. Shanmugam, N. Golrezaei, A. G. Dimakis, A. F. Molisch, and G. Caire, “FemtoCaching: Wireless Content Delivery Through Distributed Caching Helpers,” IEEE Trans. on Inf. Theory, vol. 59, no. 12, pp. 8402–8413, December 2013.
  • [4] B. Blaszczyszyn and A. Giovanidis, “Optimal Geographic Caching in Cellular Networks,” in Proc. IEEE Int’l Conf. on Communications (ICC), London, UK, 2015.
  • [5] Z. Chen, N. Pappas, and M. Kountouris, “Probabilistic Caching in Wireless D2D Networks: Cache Hit Optimal Versus Throughput Optimal,” IEEE Commun. Letters, vol. 21, no. 3, pp. 584–587, March 2017.
  • [6] M. Ji, G. Caire, and A. F. Molisch, “Wireless Device-to-Device Caching Networks: Basic Principles and System Performance,” IEEE J. Sel. Areas in Commun., vol. 34, no. 1, pp. 176–189, Jan. 2016.
  • [7] K. Li, C. Yang, Z. Chen, and M. Tao, “Optimization and Analysis of Probabilistic Caching in $N$-Tier Heterogeneous Networks,” IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 1283–1297, Feb. 2018.
  • [8] M. Choi, J. Kim, and J. Moon, “Wireless Video Caching and Dynamic Streaming Under Differentiated Quality Requirements,” IEEE J. Sel. Areas in Commun., vol. 36, no. 6, pp. 1245–1257, June 2018.
  • [9] D. Ko, B. Hong and W. Choi, “Probabilistic Caching Based on Maximum Distance Separable Code in a User-Centric Clustered Cache-Aided Wireless Network,” IEEE Trans. Wireless Commun., vol. 18, no. 3, pp. 1792-1804, March 2019.
  • [10] M. Cha, H. Kwak, P. Rodriguez, Y. Ahn and S. Moon, “Analyzing the Video Popularity Characteristics of Large-Scale User Generated Content Systems,” IEEE/ACM Trans. Network., vol. 17, no. 5, pp. 1357-1370, Oct. 2009.
  • [11] R. Zhou, S. Khemmarat, and L. Gao, “The impact of YouTube recommendation system on video views,” in Proc. ACM IMC, New York, NY, USA, pp. 404–410, 2010.
  • [12] M. Choi, D. Kim, D.-J. Han, J. Kim, and J. Moon, “Probabilistic Caching Policy for Categorized Contents and Consecutive User Demands,” in Proc. IEEE Int’l Conf. on Communications (ICC), May 2019.
  • [13] M. Lee, A. F. Molisch, N. Sastry and A. Raman, “Individual Preference Probability Modeling and Parameterization for Video Content in Wireless Caching Networks,” IEEE/ACM Trans. Network., 2019.