Ergodic inventory control with diffusion demand and general ordering costs
Abstract
In this work, we consider a continuous-time inventory system where the demand process follows an inventory-dependent diffusion process. The ordering cost of each order depends on the order quantity and is given by a general function, which need not be continuous or monotone. By applying a lower bound approach together with a comparison theorem, we show the global optimality of an (s,S) policy for this ergodic inventory control problem.
Keywords: stochastic inventory model, general ordering costs, diffusion process, (s,S) policy, impulse control.
1 Introduction
This paper is a sequel to [6], which investigates a continuous-time inventory system with a Brownian demand process and a quantity-dependent setup cost. In this setting, an (s,S) replenishment policy turns out to be optimal under the average cost criterion. In [6], the setup cost function is only required to be a nonnegative, bounded, and lower semicontinuous function of the order quantity. Such a general ordering cost structure is worth considering because, in practice, expenses arising from administration and transportation may not be continuous in the order quantity. Furthermore, general ordering cost structures were studied in [12, 13] for inventory models with deterministic demand and renewal demand, respectively.
In this work, we establish the global optimality of an (s,S) policy for ergodic inventory control with an inventory-dependent diffusion demand process under a general ordering cost structure. One may refer to [3, 2] for state-dependent inventory models and their applications. Ergodic inventory control with a diffusion demand process has been studied in two recent papers by Helmes et al. [7, 8]. More specifically, an (s,S) policy is proved to be optimal within a subset of admissible policies in [7], where the authors assume that the ordering cost is continuous with respect to the order quantity. In [8], the authors proposed a weak convergence approach, which allows them to further show the global optimality of an (s,S) policy among all admissible policies. Our work complements their papers by allowing for a more general ordering cost function that may have discontinuities.
The main results in this paper provide a rigorous justification for the following intuitive interpretation of the optimality of (s,S) policies for ergodic inventory control. If the demand process has almost surely continuous sample paths, the inventory administrator can replenish inventory at any level she chooses. Moreover, if the demand process is also Markovian, the distribution of future demand is determined by the current state (inventory level). In this case, an (s,S) policy is optimal for minimizing the average cost, even when a general ordering cost function is involved. Such a simple optimal policy stands in stark contrast with optimal ergodic control in discrete-time inventory models: there, the inventory administrator can replenish the inventory only at the start of each period, so the reorder level may differ from period to period. Thus, if the setup cost function is not constant, this dynamic optimization problem is generally difficult to tackle (see, e.g., [5, 4, 18]).
The remainder of this paper is organized as follows. The diffusion inventory model is introduced and the main results are presented in Section 2. In Section 3, an (s,S) policy is selected and is proven, by a lower bound theorem, to be the best one in a subset of admissible policies. In Section 4, a comparison theorem is provided to establish the global optimality of this (s,S) policy among all admissible policies. Finally, Section 5 concludes the study.
2 Problem Formulation and Main Results
2.1 Diffusion Inventory Model
Consider a single-item inventory model, where the inventory level process is governed by
(1) |
where denotes the initial inventory level, and represent the cumulative demand process and cumulative order quantity up to time , respectively. The inventory-dependent demand process is represented as
where denotes a standard Brownian motion on . We assume that the drift coefficient and the diffusion coefficient satisfy the following conditions.
Assumption 1.
-
is continuously differentiable, nondecreasing with and .
-
is continuous, and , where are two finite constants.
Without any replenishment, the inventory level process turns out to be a diffusion process given by
(2) |
For later use, we denote the scale function of by
where is an arbitrary real number, and the speed measure of by
We represent the ordering policy by a cumulative order process , which is called admissible if it satisfies the following three conditions: (i) is nonnegative for all ; (ii) the sample paths of are nondecreasing and right-continuous with left limits (RCLL); (iii) is adapted.
In this work, the ordering cost function is assumed to satisfy the following conditions.
Assumption 2.
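The function is subadditive (a function $f$ is subadditive if $f(x+y) \le f(x) + f(y)$ for all $x, y$ in its domain) and lower semicontinuous (a function $f$ is lower semicontinuous if $\liminf_{y \to x} f(y) \ge f(x)$ for each $x$ in its domain) with and .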
An ordering cost function satisfying the conditions above is very general: it need not be continuous (cf. [7, 9] for continuous ordering costs) or monotone. In particular, it includes the classical linear cost (cf. [16, 10]), the all-unit quantity discount cost (cf. [1]), the incremental quantity discount cost (cf. [14, 17]), and the quantity-dependent setup cost (cf. [5, 4]) as special cases.
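As a rough numerical illustration of the subadditivity requirement in Assumption 2, the following sketch checks the inequality on a finite grid for a few cost shapes. The function names, the parameter values, and the convention that an empty order costs nothing are illustrative assumptions, not taken from the paper; a grid check can only suggest subadditivity, it does not prove it.

```python
import itertools


def linear_with_setup(q, setup=10.0, unit=2.0):
    """Fixed setup cost plus linear cost; assumed to be zero for an empty order."""
    return 0.0 if q == 0 else setup + unit * q


def all_unit_discount(q, unit_hi=3.0, unit_lo=2.0, breakpoint=5.0):
    """All-unit discount: the lower price applies to the whole order once q >= breakpoint."""
    return (unit_lo if q >= breakpoint else unit_hi) * q


def quadratic(q):
    """A convex cost that violates subadditivity, included for contrast."""
    return q * q


def is_subadditive_on_grid(cost, grid, tol=1e-9):
    """Check cost(x + y) <= cost(x) + cost(y) for all pairs of grid points."""
    return all(
        cost(x + y) <= cost(x) + cost(y) + tol
        for x, y in itertools.product(grid, repeat=2)
    )


# Hypothetical grid of order quantities 0, 0.25, ..., 10.
grid = [0.25 * i for i in range(41)]

for name, cost in [
    ("setup + linear", linear_with_setup),
    ("all-unit discount", all_unit_discount),
    ("quadratic", quadratic),
]:
    print(name, is_subadditive_on_grid(cost, grid))
```

Note that the all-unit discount cost in this sketch jumps downward at the breakpoint and takes the lower value there, so it is discontinuous yet still lower semicontinuous.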
Since , we only need to consider impulse control policies, which can be specified by , where and denote the time and the amount of the th order, respectively. For convenience, we assume that and , i.e., no order is placed when . Then, an admissible policy can be denoted as , where . We define as the set of all such admissible policies.
In addition, let represent the holding and shortage cost rate for inventory level .
Assumption 3.
The function is polynomially bounded, convex, continuously differentiable except at with . Further, if , and if .
Remark 1.
We need to find an admissible policy to minimize the following long-run average cost:
(3) |
where .
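To make the long-run average cost criterion concrete, here is a minimal simulation sketch that estimates it along a single sample path under an (s,S) policy. It assumes an Euler-Maruyama discretization of the uncontrolled dynamics (inventory decreasing by drift and diffusion increments of the demand) and uses hypothetical coefficient, cost-rate, and ordering-cost functions chosen only for illustration; none of the numerical values come from the paper.

```python
import math
import random


def simulate_average_cost(s, S, mu, sigma, h, K, x0, T=200.0, dt=0.01, seed=0):
    """Estimate the long-run average cost of an (s,S) policy on one sample path.

    Assumed dynamics between orders (Euler-Maruyama step):
        Z <- Z - mu(Z) * dt - sigma(Z) * dB,  with dB ~ N(0, dt).
    Whenever Z <= s, an order of size S - Z is placed and charged K(S - Z).
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    z = x0
    total_cost = 0.0
    for _ in range(int(round(T / dt))):
        if z <= s:                      # reorder level reached: order up to S
            total_cost += K(S - z)
            z = S
        total_cost += h(z) * dt         # holding/shortage cost over this step
        z -= mu(z) * dt + sigma(z) * rng.gauss(0.0, sqrt_dt)
    return total_cost / T


# Hypothetical example data (illustrative only).
mu = lambda z: 1.0 + 0.1 * math.tanh(z)         # smooth, bounded, nondecreasing, positive
sigma = lambda z: 0.5                           # constant diffusion coefficient
h = lambda z: 2.0 * z if z >= 0 else -5.0 * z   # convex piecewise-linear cost rate
K = lambda q: 0.0 if q == 0 else 4.0 + 1.5 * q  # setup cost plus linear cost

estimate = simulate_average_cost(s=0.0, S=3.0, mu=mu, sigma=sigma, h=h, K=K, x0=3.0)
print(f"estimated average cost: {estimate:.3f}")
```

The estimate depends on the horizon `T`, the step size `dt`, and the seed; in the discretized path the inventory can overshoot the reorder level slightly, so order sizes are only approximately equal to the gap between the two levels.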
2.2 Main Results
Under an (s,S) policy, a cycle is defined as the duration from to . Then, the controlled process can be regarded as a regenerative process. By regenerative process theory, we have
where is the duration of one cycle. Under Assumptions 1 and 3, we have
(4) |
see Proposition 2.6 in [7]. Therefore,
(5) |
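The regenerative representation above (expected cost per cycle divided by expected cycle length) can be sketched by Monte Carlo as follows. This is an illustration under stated assumptions: an Euler-Maruyama discretization, an ordering cost charged on the nominal order size S - s, and hypothetical coefficients; the function names are not from the paper. For constant drift, the expected cycle length should be close to (S - s) divided by the drift, which gives a simple sanity check.

```python
import math
import random


def simulate_cycle(s, S, mu, sigma, h, dt, rng):
    """Simulate one cycle from level S until the level first drops to s or below.

    Returns (cycle length, holding/shortage cost accumulated over the cycle).
    """
    z, t, cost = S, 0.0, 0.0
    sqrt_dt = math.sqrt(dt)
    while z > s:
        cost += h(z) * dt
        z -= mu(z) * dt + sigma(z) * rng.gauss(0.0, sqrt_dt)
        t += dt
    return t, cost


def renewal_reward_average_cost(s, S, mu, sigma, h, order_cost,
                                n_cycles=400, dt=0.005, seed=1):
    """Ratio estimator: total cost over all cycles divided by total time.

    Each cycle is charged order_cost(S - s), the order size in continuous time.
    Returns (average cost estimate, mean cycle length).
    """
    rng = random.Random(seed)
    total_time = 0.0
    total_cost = 0.0
    for _ in range(n_cycles):
        tau, holding = simulate_cycle(s, S, mu, sigma, h, dt, rng)
        total_time += tau
        total_cost += holding + order_cost(S - s)
    return total_cost / total_time, total_time / n_cycles


# Hypothetical constant-coefficient example (Brownian demand).
avg_cost, mean_tau = renewal_reward_average_cost(
    s=0.0, S=2.0,
    mu=lambda z: 1.0, sigma=lambda z: 0.5,
    h=abs, order_cost=lambda q: 1.0 + 0.5 * q,
)
print(f"average cost ~ {avg_cost:.3f}, mean cycle length ~ {mean_tau:.3f}")
```

The time discretization introduces a small upward bias in the simulated cycle length, since the path is only observed at grid times.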
Under an (s,S) policy, for any initial state , level can be reached in finite expected time because the demand drift is strictly positive. In fact, is the average cost, which is independent of the initial state , i.e., for any . In the following lemma, we establish the existence of a best (s,S) policy, one that minimizes .
Our main results are as follows.
Theorem 1.
3 Optimality of the (s,S) Policy in a Subset
In this section, by a lower bound theorem, we show that the (s,S) policy is the best one in a subset of admissible policies. Specifically, in Proposition 1, we show that if some function with certain properties and a constant satisfy the lower bound conditions (7)-(9), then the cost under any policy in a subset is at least . We construct a function and, in Proposition 2, check that and satisfy all lower bound conditions. Thus, is a lower bound on the cost under any , i.e., the (s,S) policy is optimal in . Finally, in Proposition 3, we show that is large enough to include a class of admissible policies with order-up-to bounds.
Let . The following proposition provides a lower bound theorem. See Proposition 2 in [6] for a similar proof.
Proposition 1 (Lower Bound Theorem).
Suppose Assumption 3 holds. Let be a real-valued function with absolutely continuous , and let be a positive number. If
(7) |
with
(8) | |||
(9) |
then we have for each and each , where consists of those policies whose resulting inventory process satisfies
(10) | ||||
(11) | ||||
(12) |
We next construct a function, embodied by , which together with , satisfies all conditions in Proposition 1. Define
Note that and satisfy
(13) |
Now we are ready to construct the function as follows.
(16) |
Next, we show that and satisfy conditions (7)-(9); Proposition 1 then implies that for , i.e., the (s,S) policy is optimal in .
Proof of Proposition 2.
We show that satisfies all conditions of Proposition 1. First, defined in (16) is continuously differentiable on all of and exists except at ; thus is continuously differentiable with absolutely continuous .
Finally, we study how large the subset is. We define another subset of admissible policies as follows and then show that it is included in . For , let
i.e., under , the inventory level after ordering at any ordering time does not exceed level . Let
We will show that . To achieve that, we first provide some properties of which will be used in proving Proposition 3.
Lemma 3.
Proof of Proposition 3.
For any given , i.e., for some , we need to show that the controlled process under as well as function defined in (16) satisfy conditions (10)-(12).
Let ( is the initial level) and be the reflected process with lower barrier and any initial level , then it follows from Remark 3.3 in [7] that has a stationary distribution with density
(19) |
Note that the boundedness of and in Assumption 1 implies that
(20) |
Denote as the reflected process with lower barrier and an initial level given by a random variable with distribution (19). Then, for any , has the same distribution with density (19). We next show
(21) |
At time zero, it follows from that a.s. Also, at any ordering time of , and imply that a.s. Furthermore, between any two successive ordering times, the process cannot move above through diffusion on any sample path, since once and coincide at some time, they remain equal thereafter until the next ordering time. Thus, (21) holds.
We first prove (10). In fact, we have
It follows from (16) that the first two terms are finite. For the last term, we have
where the first inequality is from (17)-(18), the second inequality is from (21) and a.s., and the equality holds because has the same distribution with density (19) for any . Therefore, we have proven (10).
4 Proof of Theorem 1
In this section, we prove that the (s,S) policy is optimal among all admissible policies (i.e., Theorem 1) by a comparison theorem. Specifically, for any admissible policy , if we can find a sequence satisfying
(23) |
then the optimal policy in must be optimal in . From Propositions 2 and 3, we have proven that the (s,S) policy defined in (6) is the best one in . To establish the global optimality of the (s,S) policy in , it remains to construct a sequence of for each and prove (23).
For any given admissible policy (with as the controlled inventory process under policy ), the construction of the sequence of policies is the same as that in [6]. However, a more general argument is required to tackle the technical issues arising from the general diffusion demand process. Let denote the total order amount of policy in , and be the resulting inventory process under , i.e.,
(24) |
where . We define the jumps of as follows; see [6].
-
for satisfying and ;
-
for satisfying , , and ;
-
for satisfying , , and ;
-
for satisfying .
Proposition 4 (Comparison Theorem).
Proof of Proposition 4.
To prove (23), we need to compare the holding/shortage cost and ordering cost under and .
Consider the holding/shortage cost. From the construction of by -, it follows that on each sample path,
(25) |
By (25) and the properties of holding/shortage cost function in Assumption 3, we have that the holding/shortage cost incurred under is no greater than that under .
Consider the ordering cost. We first show some properties of function . Since is a subadditive function in , the limit must exist and (cf. Theorem 16.2.9 in [11]). Let
Then we have
(26) |
Thus, can be treated as the proportional cost and as the setup cost for an order with quantity , and the cumulative ordering cost up to time under can be rewritten as .
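The split of a subadditive ordering cost into a proportional part and a residual setup part can be illustrated numerically. The sketch below uses a hypothetical cost function (setup 10, unit price 2, plus a concave surcharge) whose limiting cost per unit is known in closed form to be 2; the function and variable names are illustrative assumptions. It shows the cost per unit decreasing toward that limit while the residual setup part, although unbounded here, becomes negligible relative to the order quantity.

```python
import math


def ordering_cost(q):
    """Hypothetical subadditive cost: setup 10, unit price 2, concave surcharge sqrt(q)."""
    return 0.0 if q == 0 else 10.0 + 2.0 * q + math.sqrt(q)


# Limit of ordering_cost(q) / q as q grows, known in closed form for this example.
k = 2.0


def setup_part(q):
    """Residual cost after removing the proportional part k * q."""
    return ordering_cost(q) - k * q


quantities = [10.0 ** p for p in range(7)]            # 1, 10, ..., 1e6
ratios = [ordering_cost(q) / q for q in quantities]   # cost per unit ordered
setup_ratios = [setup_part(q) / q for q in quantities]

for q, r, sr in zip(quantities, ratios, setup_ratios):
    print(f"q = {q:>9.0f}   cost/q = {r:.5f}   setup/q = {sr:.5f}")
```

For a general subadditive cost the limit is not available in closed form, and one would have to approximate it, for instance by evaluating the cost per unit at a large quantity.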
We next consider the proportional cost. The cumulative proportional costs up to time under and are and , respectively. We claim that for any ,
(27) |
Suppose (27) does not hold, i.e.,
(28) |
which implies that we can find a subsequence of ordering times satisfying
(29) |
For this subsequence, we have
(30) |
Thus, it follows from (28)-(30) that there must exist a such that
(31) |
Moreover, from (25) and the fact that is non-decreasing (see Assumption 1(a)), we have
(32) |
where . Furthermore, it follows from (1) and (24) that and , which, together with (31)-(32), imply that
It remains to consider the setup cost. For function , we can further claim
(33) |
In fact, it follows from the second part in (26) that for any , there is a such that for all . Further, there exists an such that for all . Therefore, for all ,
Since is arbitrary, (33) holds.
Now we consider the setup cost incurred by the orders under policy in . For the order in , and incur the same setup cost.
Consider the orders under policy in . Let and denote any two consecutive ordering times with . Let . Recall the definition of in (24), we have
which, together with , imply
where the first two equalities follow from . Let . It follows from the second part in (4) and Assumption 1 that
(34) |
Let be the number of orders in under up to time . Since , we have
Now consider the orders under in . Let and denote any two consecutive ordering times with . In this case, we claim that there must exist some satisfying . If , we must have and can then choose . If , assume that no such exists in ; then the cases in , , and cannot occur in . This implies , contradicting the fact . Let . Using the same derivations as in (34), we have
Let be the number of orders in under in . Since , we have
5 Concluding Remarks
In this paper, we used a two-step approach to prove the global optimality of an (s,S) policy in an ergodic inventory control problem with inventory-dependent diffusion demand and general ordering costs. Specifically, we first applied a lower bound theorem to show the optimality of the selected (s,S) policy in a subset of admissible policies, and then used a comparison theorem to establish its global optimality among all admissible policies.
Appendix A Proof of Lemma 1
Let
(35) |
It follows from Assumptions 1 and 3 (as well as Remark 1 (a)) that the conditions in Lemma 2.1 in [8] hold. Then, we have
(36) | ||||
which, together with the non-negativity of in Assumption 2, imply
Thus, we can find a finite positive number satisfying
(37) |
Let , then can be rewritten as
From (see Assumption 2), we have , thus we can find a finite positive number such that (37) becomes
Since is continuous in , there exists an such that
Further, since is lower semicontinuous and the other parts of are continuous in , by the extreme value theorem (see Theorem B.2 in [15]), there exists a such that
Let and , then we complete the proof. ∎
Appendix B Proof of Lemma 2
We show the existence of satisfying (14) and (15) as follows. First, in part (), we show that there exists an such that (14) holds for any ; then, in part (), we show that we can find an such that for any . Letting , both (14) and (15) then hold.
() First, by Assumption 1, we have
Similarly, we have
Therefore, by L’Hôpital’s rule, we have
which yields that we can find an with such that
(38) |
Also, by L’Hôpital’s rule, we have
(39) |
which yields that there exists an with such that
(40) |
In addition, it follows from (35) and (36) that
Then, there exists an with such that
(41) |
Now we can show that (14) holds for any . If , we have
where the inequality follows from (6). Next, we prove the case when in three subcases: , , and . If , we have
where the first inequality follows from the non-negativity of in Assumption 2, and the second inequality follows from (38) with . If , we have
where the last inequality is derived from (38) and (41) with . If , we have
where the last inequality holds due to (38), (40), and (41).
() To prove that we can find an such that for any , we show that
It follows from the convexity of in Assumption 3 that there exist and such that for all ,
(42) |
Then, for , we rewrite as
where the second equality holds because
and in the last equality,
If we can prove
(43) |
then it follows from the positiveness of and the boundedness of and (see Assumption 1) that
Thus, it remains to prove (43).
First, for , we rewrite as
Since (Remark 1 ()), there exists a such that for
(44) |
which, together with the polynomial boundedness of (Assumption 3), implies
Thus,
is a finite number. Furthermore, the boundedness of and in Assumption 3 implies
Therefore, we have
Second, (44) and the boundedness of and imply that for ,
Therefore,
where the equality follows from the polynomial boundedness of . Thus, we have
Appendix C Proof of Lemma 3
To prove (17), we only need to prove
(45) |
which yields , and then (17) holds. We next prove (45). First, we have
where the inequality follows from for (Assumption 3) and the boundedness of and in Assumption 1. This, together with (39) and the definition of in (16), implies that
Finally, (18) can be implied by the polynomial boundedness of . ∎
References
- [1] N. Altintas, F. Erhun, and S. Tayur. Quantity discounts under demand uncertainty. Management Science, 54(4):777–792, 2008.
- [2] Opher Baron, Oded Berman, and David Perry. Shelf space management when demand depends on the inventory level. Production and Operations Management, 20(5):714–726, 2011.
- [3] A. Cadenillas, P. Lakner, and M. Pinedo. Optimal control of a mean-reverting inventory. Operations Research, 58(6):1697–1710, 2010.
- [4] O. Caliskan-Demirag, Y. Chen, and Y. Yang. Ordering policies for periodic-review inventory systems with quantity-dependent fixed costs. Operations Research, 60(4):785–796, 2012.
- [5] X. Chao and P. Zipkin. Optimal policy for a periodic-review inventory system under a supply capacity contract. Operations Research, 56(6):887–896, 2008.
- [6] S. He, D. Yao, and H. Zhang. Optimal ordering policy for inventory systems with quantity-dependent setup costs. Mathematics of Operations Research, 42(4):979–1006, 2017.
- [7] K. L. Helmes, R. H. Stockbridge, and C. Zhu. Continuous inventory models of diffusion type: Long-term average cost criterion. The Annals of Applied Probability, 27(3):1831–1885, 2017.
- [8] K. L. Helmes, R. H. Stockbridge, and C. Zhu. A weak convergence approach to inventory control using a long-term average criterion. Advances in Applied Probability, 50(4):1032–1074, 2018.
- [9] K. L. Helmes, R. H. Stockbridge, and C. Zhu. A weak convergence approach to inventory control using a long-term average criterion. Advances in Applied Probability, 50(4):1032–1074, 2018.
- [10] D. L. Iglehart. Optimality of (s, S) policies in the infinite horizon dynamic inventory problem. Management Science, 9(2):259–267, 1963.
- [11] Marek Kuczma. An Introduction to the Theory of Functional Equations and Inequalities. Birkhäuser, Berlin, Second edition, 2009.
- [12] Sandun Perera, Ganesh Janakiraman, and Shun-Chen Niu. Optimality of (s, S) policies in EOQ models with general cost structures. International Journal of Production Economics, 187:216–228, 2017.
- [13] Sandun Perera, Ganesh Janakiraman, and Shun-Chen Niu. Optimality of (s, S) inventory policies under renewal demand and general cost structures. Production and Operations Management, 27(2):368–383, 2018.
- [14] E. Porteus. On the optimality of generalized (s, S) policies. Management Science, 17(7):411–426, 1971.
- [15] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, 1994.
- [16] H. Scarf. The optimality of (S, s) policies in the dynamic inventory problem. Mathematical Methods in the Social Sciences (P. Suppes, K. Arrow, and S. Karlin, eds.). Stanford University Press, Stanford, CA, USA, 1960.
- [17] D. Yao, X. Chao, and J. Wu. Optimal control policy for a Brownian inventory system with concave ordering cost. Journal of Applied Probability, 52(4):909–925, 2015.
- [18] Liqing Zhang and Sıla Çetinkaya. Stochastic dynamic inventory problem under explicit inbound transportation cost and capacity. Operations Research, 65(5):1267–1274, 2017.