This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Efficient counterfactual estimation in
semiparametric discrete choice models:
a note on Chiong, Hsieh, and Shum (2017)

Grigory Franguridi Department of Economics, University of Southern California. Email: [email protected]
Abstract

I suggest an enhancement of the procedure of Chiong, Hsieh, and Shum (2017) for calculating bounds on counterfactual demand in semiparametric discrete choice models. Their algorithm relies on a system of inequalities indexed by cycles of a large number MM of observed markets, and hence seems to require computationally infeasible enumeration of all such cycles. I show that such enumeration is unnecessary because solving the “fully efficient” inequality system exploiting cycles of all possible lengths K=1,,MK=1,\dots,M can be reduced to finding the length of the shortest path between every pair of vertices in a complete bidirected weighted graph on MM vertices. The latter problem can be solved using the Floyd–Warshall algorithm with computational complexity O(M3)O\left(M^{3}\right), which takes only seconds to run even for thousands of markets. Monte Carlo simulations illustrate the efficiency gain from using cycles of all lengths, which turns out to be positive, but small.

1 The Algorithm

The family of inequalities for the linear program in Chiong et al. (2017) takes the form

(𝜹l1𝜹M+1)𝐬M+1\displaystyle\left(\bm{\delta}^{l_{1}}-\bm{\delta}^{M+1}\right)^{\prime}\mathbf{s}^{M+1} 1(D1S1)1=k=1K1(𝜹lk𝜹lk+1)𝐬lk\displaystyle\leq-1^{\prime}(D_{1}\circ S_{1})1=\sum_{k=1}^{K-1}\left(\bm{\delta}^{l_{k}}-\bm{\delta}^{l_{k+1}}\right)^{\prime}\mathbf{s}^{l_{k}}
=k=1K2(𝜹lk𝜹lk+1)𝐬lk+(𝜹lK1𝜹M+1)𝐬lK1\displaystyle=\sum_{k=1}^{K-2}\left(\bm{\delta}^{l_{k}}-\bm{\delta}^{l_{k+1}}\right)^{\prime}\mathbf{s}^{l_{k}}+\left(\bm{\delta}^{l_{K-1}}-\bm{\delta}^{M+1}\right)^{\prime}\mathbf{s}^{l_{K-1}} (1)

for every cycle l1,,lK,l1l_{1},\dots,l_{K},l_{1} from ={1,,M}\mathcal{M}=\{1,\dots,M\} with lK=M+1l_{K}=M+1. Here 𝜹mJ\bm{\delta}^{m}\in\mathbb{R}^{J} and 𝐬mΔJ\mathbf{s}^{m}\in\Delta_{J} are the mean utility vector and the market share vector for market mm\in\mathcal{M}, 𝜹M+1\bm{\delta}^{M+1} is the mean utility vector for the counterfactual market M+1M+1 and 𝐬M+1\mathbf{s}^{M+1} is the choice variable – the counterfactual market share to be bounded. Denote the inequality (1) by ineq(l1,,lK1)\text{ineq}(l_{1},\dots,l_{K-1}).

The number of possible cycles in a complete graph is exponential in MM and is of order 102010^{20} already for M=22M=22111See sequence A119913 in the Online Encyclopedia of Integer Sequences. Therefore, simple enumeration is computationally infeasible and Chiong et al. (2017) resort to using a (tiny) subset of inequalities (1) corresponding to cycles of length K=2K=2. I will show that enumeration is not necessary for this problem and that the full system of inequalities is easy to construct using graph optimization.

Define GG to be a complete bidirected weighted graph on vertices \mathcal{M} with weight wij=(𝜹i𝜹j)𝐬iw_{ij}=\left(\bm{\delta}^{i}-\bm{\delta}^{j}\right)^{\prime}\mathbf{s}^{i} assigned to edge (i,j)×(i,j)\in\mathcal{M}\times\mathcal{M}. Since the mean utilities and shares for markets \mathcal{M} are identified via cyclic monotonicity, it holds that

k=1K(𝜹lk𝜹lk+1)𝐬lk0\displaystyle\sum_{k=1}^{K}\left(\bm{\delta}^{l_{k}}-\bm{\delta}^{l_{k+1}}\right)^{\prime}\mathbf{s}^{l_{k}}\geq 0

for every path l1,,lK,lK+1=l1l_{1},\dots,l_{K},l_{K+1}=l_{1} from \mathcal{M}, and hence GG has no negative cycles.

Note that the left-hand side of the inequality (1) depends on 𝜹l1\bm{\delta}^{l_{1}} and so the inequalities for different values of l1l_{1} are generally non-parallel. Hence, for each l1l_{1}, we can minimize the right-hand side over all possible paths l1,,lK1l_{1},\dots,l_{K-1} starting from l1l_{1} to get the sharpest inequality.

This minimization problem resembles the problem of finding the shortest path from vertex l1l_{1} in the graph GG, but is not equivalent to it because of the presence of the second term in (1), corresponding to the last edge in the path.

I suggest iterating through M(M1)M(M-1) ordered pairs (l1,lK1)(l_{1},l_{K-1}) of different vertices from \mathcal{M} and, for each such pair, finding the shortest path that connects l1l_{1} and lK1l_{K-1} in graph GG. This can be accomplished using the Floyd–Warshall algorithm, which has polynomial computational complexity O(M3)O\left(M^{3}\right) for graphs with no negative cycles such as GG. The length of the shortest path between l1l_{1} and lK1l_{K-1} corresponds to the sharpest of the family of inequalities

l1,lK1={ineq(l1,,lK1):l2,,lK2 s.t. l1,,lK1 is a path from l1 to lK1},\displaystyle\mathcal{F}_{l_{1},l_{K-1}}=\left\{\text{ineq}(l_{1},\dots,l_{K-1}):\,\,\,l_{2},\dots,l_{K-2}\in\mathcal{M}\text{ s.t. }l_{1},\dots,l_{K-1}\text{ is a path from }l_{1}\text{ to }l_{K-1}\right\},

from which we can then pick the sharpest one over lK1l_{K-1}.

I suggest the following algorithmic implementation:

  1. 1.

    Run the Floyd–Warshall algorithm on graph GG, obtaining a matrix DGD_{G} of shortest path lengths between every pair of vertices. Set the diagonal elements (DG)ii=0(D_{G})_{ii}=0 for all ii\in\mathcal{M}.

  2. 2.

    For every l1l_{1}\in\mathcal{M},

    • for every ll\in\mathcal{M}, set rhs(l1,l)=(DG)l1,l+(𝜹l𝜹M+1)𝐬l\text{rhs}(l_{1},l)=(D_{G})_{l_{1},l}+\left(\bm{\delta}^{l}-\bm{\delta}^{M+1}\right)^{\prime}\mathbf{s}^{l},

    • set l=argminlrhs(l1,l)l^{*}=\arg\min_{l\in\mathcal{M}}\text{rhs}(l_{1},l),

    • save the inequality ineql1\text{ineq}_{l_{1}}^{*} with the right hand side rhs(l1,l)\text{rhs}(l_{1},l^{*}).

  3. 3.

    The resulting system of MM inequalities ineq1,,ineqM\text{ineq}_{1}^{*},\dots,\text{ineq}_{M}^{*} is equivalent to the system (1) with all cycles exhausted.

2 Monte Carlo Simulation

I extend the Monte Carlo exercise of Chiong et al. (2017) and compare performance of their original procedure using only 2-cycles (denoted CHS) with the proposed procedure exhausting all possible cycles (denoted All cyc.).

I use the following two model specifications: one is multinomial logit as in Chiong et al. (2017) and the other is multinomial probit with the same DGP parameters as the logit and with the covariance matrix of the error term

Σ=(10.70.30.710.30.30.31).\Sigma=\begin{pmatrix}1&-0.7&0.3\\ -0.7&1&0.3\\ 0.3&0.3&1\end{pmatrix}.

A counterfactual of interest is the increase of price of good jj by 1% (denoted pjp_{j}\uparrow), for j=1,2,3.j=1,2,3. As can be seen from Tables 1 and 2, the fully efficient procedure always dominates the CHS procedure both in terms of widths of the bounds and their standard errors. Although the gain is minimal and is unlikely to be relevant in practice, I suggest using the fully efficient procedure since it bears no extra computational costs (average computational time is less than 20 seconds for M=1000M=1000 and is less than 3 seconds for M{200,500}M\in\{200,500\}).

M=200M=200 M=500M=500 M=1000M=1000
s1s_{1} s2s_{2} s3s_{3} s1s_{1} s2s_{2} s3s_{3} s1s_{1} s2s_{2} s3s_{3}
p1p_{1}\uparrow CHS 0.0478 0.0835 0.0932 0.0074 0.0373 0.0367 0.0070 0.0376 0.0376
(0.0136) (0.0267) (0.0290) (0.0022) (0.0129) (0.0125) (0.0030) (0.0117) (0.0118)
All cyc. 0.0454 0.0794 0.0878 0.0071 0.0351 0.0345 0.0066 0.0355 0.0355
(0.0126) (0.0243) (0.0255) (0.0021) (0.0112) (0.0108) (0.0026) (0.0103) (0.0104)
p2p_{2}\uparrow CHS 0.0890 0.0575 0.0978 0.0131 0.0245 0.0289 0.0101 0.0255 0.0289
0.0286 0.0217 0.0306 0.0046 0.0118 0.0105 0.0031 0.0075 0.0082
All cyc. 0.0837 0.0551 0.0915 0.0123 0.0230 0.0270 0.0096 0.0247 0.0278
(0.0256) (0.0197) (0.0271) (0.0040) (0.0098) (0.0090) (0.0028) (0.0069) (0.0075)
p3p_{3}\uparrow CHS 0.0844 0.0833 0.0626 0.0125 0.0314 0.0268 0.0097 0.0263 0.0225
(0.0306) (0.0279) (0.0254) (0.0041) (0.0085) (0.0073) (0.0032) (0.0092) (0.0086)
All cyc. 0.0799 0.0790 0.0597 0.0119 0.0299 0.0256 0.0092 0.0250 0.0214
(0.0274) (0.0256) (0.0226) (0.0036) (0.0077) (0.0066) (0.0029) (0.0083) (0.0077)
Table 1: Widths of bounds on counterfactual market shares and their standard errors. Bounds always cover true (logit) counterfactuals. Number of simulations is 100.
M=200M=200 M=500M=500 M=1000M=1000
s1s_{1} s2s_{2} s3s_{3} s1s_{1} s2s_{2} s3s_{3} s1s_{1} s2s_{2} s3s_{3}
p1p_{1}\uparrow CHS 0.0050 0.0051 0.0003 0.0167 0.0516 0.0523 0.0006 0.0060 0.0061
(0.0015) (0.0015) (0.0002) (0.0056) (0.0161) (0.0156) (0.0001) (0.0018) (0.0018)
All cyc. 0.0048 0.0049 0.0003 0.0160 0.0487 0.0496 0.0006 0.0057 0.0058
(0.0013) (0.0013) (0.0002) (0.0052) (0.0140) (0.0139) (0.0001) (0.0016) (0.0016)
p2p_{2}\uparrow CHS 0.0036 0.0035 0.0003 0.0271 0.0333 0.0434 0.0012 0.0048 0.0051
(0.0017) (0.0017) (0.0002) (0.0082) (0.0104) (0.0133) (0.0004) (0.0012) (0.0012)
All cyc. 0.0035 0.0033 0.0003 0.0256 0.0318 0.0412 0.0011 0.0046 0.0049
(0.0016) (0.0015) (0.0002) (0.0075) (0.0095) (0.0121) (0.0003) (0.0011) (0.0011)
p3p_{3}\uparrow CHS 0.0051 0.0051 0.0000 0.0265 0.0413 0.0332 0.0013 0.0063 0.0063
(0.0021) (0.0021) (0.0000) (0.0095) (0.0137) (0.0128) (0.0004) (0.0017) (0.0017)
All cyc. 0.0049 0.0049 0.0000 0.0248 0.0389 0.0315 0.0012 0.0059 0.0059
(0.0020) (0.0020) (0.0000) (0.0084) (0.0122) (0.0112) (0.0004) (0.0015) (0.0016)
Table 2: Widths of bounds on counterfactual market shares and their standard errors. Bounds always cover true (probit) counterfactuals. Number of simulations is 100.

References

  • Chiong et al. (2017) Chiong, K., Y.-W. Hsieh, and M. Shum (2017): “Counterfactual estimation in semiparametric discrete choice models,” Unpublished manuscript.