Verifiable identification condition for nonignorable nonresponse data with categorical instrumental variables
Abstract
We consider a model identification problem in which an outcome variable contains nonignorable missing values. Statistical inference requires a guarantee of the model identifiability to obtain estimators enjoying theoretically reasonable properties such as consistency and asymptotic normality. Recently, instrumental or shadow variables, combined with the completeness condition in the outcome model, have been highlighted to make a model identifiable. In this paper, we elucidate the relationship between the completeness condition and model identifiability when the instrumental variable is categorical. We first show that when both the outcome and instrumental variables are categorical, the two conditions are equivalent. However, when one of the outcome and instrumental variables is continuous, the completeness condition may not necessarily hold, even for simple models. Consequently, we provide a sufficient condition that guarantees the identifiability of models exhibiting a monotone-likelihood property, a condition particularly useful in instances where establishing the completeness condition poses significant challenges. Using observed data, we demonstrate that the proposed conditions are easy to check for many practical models and outline their usefulness in numerical experiments and real data analysis.
keywords:
missing not at random; nonignorable missingness; identification; instrumental variable; exponential family
1 Introduction
There has been a rapidly growing movement to utilize all the available data, which may explicitly or even implicitly contain missing values, in areas such as causal inference (Imbens and Rubin, 2015) and data integration (Yang and Kim, 2020; Hu et al., 2022). For such datasets, appropriate analysis of missing data is indispensable to correct the selection bias owing to the missingness. In recent years, the analysis of missing data under the missing at random (MAR) assumption (Little and Rubin, 2019) has gradually matured (Robins et al., 1994; Kim and Shao, 2021). Model identifiability is one of the most fundamental conditions in constructing the asymptotic theory; however, removing the MAR assumption makes statistical inference drastically more difficult, especially with respect to model identification (Miao et al., 2016). Estimation with unidentifiable models may yield multiple solutions that fit the data equally well. Several researchers have proposed sufficient conditions for model identification under missing not at random (MNAR).
The observed likelihood is constructed from two distributions: (R) the response mechanism and (O) the outcome distribution (Kim and Shao, 2021). Miao et al. (2016) considered identification conditions with Logistic, Probit, and Robit (the cumulative distribution function of the $t$-distribution) models for (R) and normal and normal mixture distributions for (O). Cui et al. (2017) assumed Logistic, Probit, and cLog-log models for (R) and generalized linear models for (O). These studies depend heavily on the model specification of both (R) and (O). Wang et al. (2014) introduced a covariate called an instrument or shadow variable and demonstrated that the use of the instrument could considerably relax the conditions on (R) and (O). For example, (O) requires only the monotone-likelihood property, which is satisfied by a variety of models, such as the generalized linear model. Tang et al. (2003) and Miao and Tchetgen (2018) derived conditions for model identifiability without postulating any assumptions on (R) with the help of the instrument. Miao et al. (2019) further relaxed the assumptions by imposing the completeness condition on (O) (D’Haultfœuille, 2010, 2011). For example, the generalized linear model with continuous covariates satisfies the completeness condition. To the best of our knowledge, this combination of an instrument on (R) and completeness on (O) is the most general condition for model identification and has been adopted in numerous studies (Zhao and Ma, 2022; Yang et al., 2019).
Generally, assumptions on (O) rely on the distribution of the complete data, which is untestable from the observed data. Recently, modeling (O’), the observed or respondents’ outcome model, instead of (O) has been used to relax this subjective assumption (Miao et al., 2019; Riddles et al., 2016). However, the observed likelihood with (R) and (O’) involves an integral that makes the identification problem intractable. Morikawa and Kim (2021) and Beppu et al. (2021) established that the integral can be computed explicitly with Logistic models for (R) and generalized linear models for (O’), and derived identification conditions. For general response mechanisms and respondents’ outcome distributions, model identification remains an open question. Furthermore, when the instrument is categorical, such as smoking history or sex, the completeness condition is not available. For example, Ibrahim et al. (2001) considered a study on the mental health of children in Connecticut and used the parents’ report of the psychopathology of the child as a binary instrument.
In this paper, we consider an identification problem with an instrument for (R) and an (O’) that satisfies the monotone-likelihood ratio property. Note that although our model setup is similar to that of Wang et al. (2014), we can check the validity of (O’) with observed data, for example, by using information criteria such as AIC and BIC. Furthermore, we can use semiparametric or nonparametric methods for modeling both (O’) and (R).
The rest of this paper is organized as follows. Section 2 introduces the notation and defines model identifiability. Section 3 derives the proposed identification condition. We demonstrate the effects of identifiability via a limited numerical study in Section 4. Moreover, application to real data is presented in Section 5. Finally, concluding remarks are summarized in Section 6. All the technical proofs are relegated to the Appendix.
2 Basic setup
2.1 Observed likelihood
Let $\{x_i, y_i, \delta_i\}_{i=1}^{n}$ be independent and identically distributed samples from a distribution of $(x, y, \delta)$, where $x$ is a fully observed covariate vector, $y$ is an outcome variable subject to missingness, and $\delta$ is a response indicator of $y$ taking the value $1$ ($0$) if $y$ is observed (missing). We use the generic notation $p(\cdot)$ and $p(\cdot \mid \cdot)$ for the marginal density and conditional density, respectively. For example, $p(x)$ is the marginal density of $x$, and $p(y \mid x)$ is the conditional density of $y$ given $x$. We model the MNAR response mechanism $P(\delta = 1 \mid x, y)$ and consider its identification. The observed likelihood is defined as
$$\prod_{i:\,\delta_i = 1} P(\delta_i = 1 \mid x_i, y_i)\, p(y_i \mid x_i) \prod_{i:\,\delta_i = 0} \int P(\delta_i = 0 \mid x_i, y)\, p(y \mid x_i)\, dy. \tag{1}$$
We say that this model is identifiable if the parameters in (1) are identified, which is equivalent to the parameters in $P(\delta = 1 \mid x, y)\, p(y \mid x)$ being identified. This identification condition is essential even for semiparametric models such as an estimator defined by moment conditions (Morikawa and Kim, 2021). However, even simple models can easily be unidentifiable. For example, Example 1 in Wang et al. (2014) presents an unidentifiable model in which the outcome model is normal and the response mechanism is a Logistic model.
There is an alternative way to express the relationship between and . A disadvantage of modeling is its subjective assumption on the distribution of complete data, not of observed data. In other words, if we made assumptions about and ensured its identifiability, we could not verify the assumptions using the observed data. By contrast, this issue can be overcome by modeling because is the outcome model for the observed data, and we can check its validity using ordinal information criteria such as AIC and BIC. Therefore, we model and consider the identification condition in Section 3. Hereafter, we assume two parametric models and , where and are parameters of the outcome and response models, respectively. Although our method requires two parametric models, the class of identifiable models is very large. For example, it can include semiparametric outcome models for and general response models other than Logistic models, as discussed in Example 3.7.
2.2 Estimation
We present a procedure for parameter estimation based on parametric models of and . Let be the maximum likelihood estimator of . The observed likelihood (1) yields the mean score equation for (Kim and Shao, 2021):
where . By using Bayes’ formula , the mean score can be written as
where
To compute the two integrations in , we can use the fractional imputation (Kim, 2011). As described in Riddles et al. (2016), the EM algorithm is also applicable.
3 Identifiability
3.1 Definition of identification
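Recall that the identification condition in (1) is for the parameters in $P(\delta = 1 \mid x, y)\, p(y \mid x)$. As seen in Section 2.2, the conditional density $p(y \mid x)$ is represented by $p(y \mid x, \delta = 1)$ and $P(\delta = 1 \mid x, y)$ through Bayes’ formula. Thus, using the formula, identification with these models reduces to identifying the parameters on the right-hand side of
$$P(\delta = 1 \mid x, y)\, p(y \mid x) = \frac{p(y \mid x, \delta = 1)}{\int \{ p(y \mid x, \delta = 1) / P(\delta = 1 \mid x, y) \}\, dy}. \tag{2}$$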
Strictly speaking, the identification condition is that equality of (2) under two parameter values with probability $1$ implies that the two parameter values coincide. Generally, the integral in the denominator of (2) does not have a closed form, which makes deriving a sufficient condition for identifiability quite challenging. Morikawa and Kim (2021) showed that the integral has a closed form for the combination of a Logistic response model and a normal outcome model, and derived a sufficient condition for model identifiability. Beppu et al. (2021) extended this result to the case where the outcome model belongs to the exponential family while the response model is still a Logistic model. However, when the response mechanism is general, even simple outcome models such as the normal distribution can be unidentifiable.
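As a minimal sketch of why the Logistic and normal combination is tractable, the following Python snippet evaluates the integral in the denominator of (2) numerically and compares it with the value given by the normal moment generating function. Covariates are omitted, and the function names and the parameter names `alpha`, `beta`, `mu`, and `sigma` are illustrative assumptions rather than the notation of the cited papers.

```python
import numpy as np
from scipy import integrate, stats


def inverse_response_integral(alpha, beta, mu, sigma):
    """Numerically integrate p(y | delta=1) / P(delta=1 | y) over y.

    Assumed (hypothetical) setup: the respondents' outcome is N(mu, sigma^2)
    and P(delta=1 | y) = 1 / (1 + exp(-(alpha + beta * y))).
    """
    def integrand(y):
        # 1 / P(delta=1 | y) = 1 + exp(-(alpha + beta * y)); work on the log
        # scale so that the far tails evaluate to 0 instead of 0 * inf.
        log_pdf = stats.norm.logpdf(y, loc=mu, scale=sigma)
        return np.exp(log_pdf) + np.exp(log_pdf - (alpha + beta * y))

    value, _ = integrate.quad(integrand, -np.inf, np.inf)
    return value


def closed_form(alpha, beta, mu, sigma):
    """Same integral via E[exp(-beta * y)] = exp(-beta * mu + beta^2 sigma^2 / 2)."""
    return 1.0 + np.exp(-alpha - beta * mu + 0.5 * beta**2 * sigma**2)


numeric = inverse_response_integral(alpha=0.5, beta=1.0, mu=0.0, sigma=1.0)
exact = closed_form(alpha=0.5, beta=1.0, mu=0.0, sigma=1.0)
```

With these assumed values the closed form equals 2, so the implied marginal response probability is 0.5. For other response functions the same quadrature can be used, but no closed form is available in general.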
Example 3.1.
Suppose that the respondents’ outcome model is , and the response model is , where is a known distribution function such that the integration in (2) exists; then, this model is unidentifiable. For example, different parametrization , yields the same value of the observed likelihood.
Recently, widely applicable sufficient conditions have been proposed. Assume that a covariate has two components, , such that
-
(C1)
and
The covariate is called an instrument (D’Haultfœuille, 2010) or a shadow variable (Miao and Tchetgen Tchetgen, 2016). Miao et al. (2019) derived sufficient conditions for model identifiability by combining the instrument and the completeness condition:
-
(C2)
For all square-integrable function , almost surely implies almost surely.
Lemma 3.2 (Identification condition by Miao et al. (2019)).
Under the conditions (C1) and (C2), the joint distribution is identifiable.
Although the completeness condition is useful and applicable to general models, even a simple model with a categorical instrument may fail to satisfy it.
Example 3.3 (Violating completeness with categorical instrument).
Suppose follows the normal distribution , and an instrument is binary taking or . This distribution does not satisfy the completeness condition because the conditional expectation when .
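The failure of completeness with a binary instrument can be checked numerically. The sketch below assumes, purely for illustration, that the outcome is normal with unit variance and mean 0 or 1 depending on the instrument level (the specific values are assumptions, not those of Example 3.3). Because a quadratic function has three free coefficients but only two conditional-mean constraints, a nonzero square-integrable function with zero conditional mean at both instrument levels always exists.

```python
import numpy as np
from scipy import integrate, stats

# Assumed (hypothetical) conditional means of y for the two instrument levels;
# the conditional variance is fixed at 1.
MEANS = {0: 0.0, 1: 1.0}


def moment_matrix(means):
    """Rows: instrument levels; columns: E[1], E[y], E[y^2] under N(mean, 1)."""
    return np.array([[1.0, m, 1.0 + m**2] for m in means.values()])


def null_coefficients(means):
    """Coefficients (a, b, c) of h(y) = a + b*y + c*y^2 with E[h(y) | z] = 0 for all z."""
    _, _, vt = np.linalg.svd(moment_matrix(means))
    return vt[-1]  # right singular vector spanning the null space of the 2 x 3 matrix


def conditional_mean_of_h(coef, mean):
    """E[h(y)] under N(mean, 1), computed by numerical quadrature."""
    a, b, c = coef
    value, _ = integrate.quad(
        lambda y: (a + b * y + c * y**2) * stats.norm.pdf(y, loc=mean),
        -np.inf,
        np.inf,
    )
    return value


coef = null_coefficients(MEANS)
checks = [conditional_mean_of_h(coef, m) for m in MEANS.values()]
```

Under these assumed means, `coef` is proportional to the coefficients of $y^2 - y - 1$, a nonzero function whose conditional expectation vanishes at both levels of the instrument.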
A vital implication of Example 3.3 is that the existence of an instrument no longer guarantees model identification when the instrument is categorical. Developing identification conditions for models with discrete instruments is important in applications (Ibrahim et al., 2001). We separately discuss two cases: (i) both the outcome and the instrument are categorical; (ii) the respondents’ outcome model has the monotone-likelihood ratio property.
When all variables, and , are categorical, the model can be fully nonparametric. Theorem 3.4 demonstrates that, under these conditions, the completeness and identifiability conditions are equivalent. See Appendix 2 in Riddles et al. (2016) for the estimation of such fully nonparametric models.
Theorem 3.4.
When both the outcome and the instrument are categorical, under condition (C1), the joint distribution is identifiable if and only if condition (C2) holds.
As evidenced in Lemma 3.2, condition (C2) is generally sufficient for model identifiability, but Theorem 3.4 also reveals that it is necessary when the outcome and the instrument are categorical.
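In the fully categorical case, completeness is a linear-algebra statement: a function of a categorical outcome is a vector, and its conditional expectations given the instrument are a matrix-vector product. The following sketch assumes a table `prob[k, j]` holding the conditional probability of outcome level `j` given instrument level `k`, with any further conditioning suppressed; the function name `is_complete` is hypothetical.

```python
import numpy as np


def is_complete(prob):
    """Check completeness for a categorical outcome and a categorical instrument.

    prob[k, j] is the conditional probability of outcome level j given
    instrument level k. Completeness requires that prob @ h = 0 forces h = 0,
    that is, the table must have full column rank.
    """
    prob = np.asarray(prob, dtype=float)
    return bool(np.linalg.matrix_rank(prob) == prob.shape[1])


# Binary instrument, three-level outcome: the rank is at most 2, so completeness fails.
binary_instrument = [[0.2, 0.3, 0.5],
                     [0.4, 0.4, 0.2]]

# Three-level instrument, binary outcome with distinct rows: full column rank.
three_level_instrument = [[0.2, 0.8],
                          [0.5, 0.5],
                          [0.7, 0.3]]

# Binary instrument that carries no information on the outcome: rank 1.
uninformative_instrument = [[0.3, 0.7],
                            [0.3, 0.7]]
```

The first table shows that an instrument with fewer levels than the outcome can never satisfy completeness, which is the categorical analogue of Example 3.3.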
Next, we consider the identification condition for the other case (ii). Let be the support of the random variable . We assume the following four conditions:
-
(C3)
The response mechanism is
(3) where , and are known continuous strictly monotone functions, and and are known injective functions of and , respectively.
-
(C4)
The density or mass function is identifiable, and its support does not depend on .
-
(C5)
For all , there exist and , such that , and is monotone.
-
(C6)
The condition (C3) means that the random variable plays a role of an instrument. The condition (C4) is the identifiability of , which is testable from the observed data. The condition (C5) assumes a monotone-likelihood property on the outcome model, which was also used in Wang et al. (2014) for the complete data. The condition (C6) is necessary for (1) to be well-defined. It is essentially the same condition as Theorem 3.1 (I1) of Morikawa and Kim (2021). This condition is always true when the support of is finite. However, it must be carefully verified when is continuous. See Proposition 3.8 below for useful sufficient conditions when the respondents’ outcome model is normal distribution.
Under conditions (C3)–(C6), we obtain the desired identification condition.
Theorem 3.5.
The parameter is identifiable if the conditions (C1) and (C3)–(C6) hold.
We provide an example of outcome models satisfying the condition (C5).
Example 3.6 (Model satisfying (C5)).
Let density functions in the exponential family be
where , , , and . Then the density ratio becomes
where and . Therefore, the density ratio is monotone.
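As a small numerical illustration of the monotone density ratio, the sketch below takes two normal densities with a common variance and different means, a special case of the exponential family. The means 1 and 0 and the unit standard deviation are assumed values chosen only for illustration.

```python
import numpy as np
from scipy import stats


def log_density_ratio(y, mean_num, mean_den, sd=1.0):
    """log of the N(mean_num, sd^2) density over the N(mean_den, sd^2) density at y."""
    return stats.norm.logpdf(y, mean_num, sd) - stats.norm.logpdf(y, mean_den, sd)


grid = np.linspace(-5.0, 5.0, 201)
log_ratio = log_density_ratio(grid, mean_num=1.0, mean_den=0.0)

# The log ratio is linear in y with slope (mean_num - mean_den) / sd^2,
# so the density ratio itself is strictly increasing in y.
slope = np.diff(log_ratio) / np.diff(grid)
```

A two-component normal mixture would not pass this check in general, since its log density ratio need not be monotone.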
Example 3.7 (Model satisfying (C6)).
In application, it is often reasonable to assume a normal distribution on the respondents’ outcome model. Focusing on the tail of the outcome model, we provide a sufficient condition to check (C6) for models with general response mechanisms.
Proposition 3.8.
Suppose that the observed distribution is normal distribution , the response mechanism is (3) with and , and the strictly monotone increasing function meets the following condition:
(4) |
Then, this model satisfies (C6).
The condition (4) is easy to check. For example, it holds for Logistic and Robit functions but not for the Probit function. According to Proposition 3.8, it is possible to estimate with observed data using splines and other nonparametric methods, which allows us to use very flexible models. Furthermore, we can also estimate the response mechanism using nonparametric methods because it does not impose any restrictions on the functional form of .
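The contrast between response functions can be seen by inspecting the left tail of the integrand in the denominator of (2). The sketch below assumes that (C6) amounts to finiteness of that integral, and it uses a standard normal respondents' outcome model with assumed values `alpha=0` and `beta=2`; these choices, and the function name `log_integrand`, are illustrative assumptions and not part of the original proposition.

```python
import numpy as np
from scipy import stats


def log_integrand(y, link, alpha=0.0, beta=2.0, mu=0.0, sigma=1.0):
    """log of p(y | delta=1) / F(alpha + beta * y), with p = N(mu, sigma^2) and F = link.cdf."""
    return stats.norm.logpdf(y, mu, sigma) - link.logcdf(alpha + beta * y)


# Candidate response functions F (assumed set, chosen to mirror the text).
LINKS = {"logistic": stats.logistic, "cauchy": stats.cauchy, "probit": stats.norm}

# Points moving further into the left tail of the outcome distribution.
tail = np.array([-10.0, -20.0, -50.0])
tail_values = {name: log_integrand(tail, link) for name, link in LINKS.items()}
```

For the Logistic and Cauchy functions the log integrand decreases rapidly in the tail, so the normal density dominates. For the Probit function with these assumed values the log integrand increases without bound, because the Gaussian decay of the response probability outpaces that of the outcome density, and the integral cannot be finite.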
4 Numerical experiment
We present the effects of identifiability in numerical experiments by comparing weakly and strongly identifiable models. We prepared four scenarios, S1–S4:
-
S1:
(Outcome: Normal, Response: Logistic)
, , , and , where and . -
S2:
(Outcome: Normal, Response: Cauchy)
, , , and , where , , and is the cumulative distribution function of the Cauchy distribution. -
S3:
(Outcome: Bernoulli, Response: Probit)
, , , and , where , , , and is the cumulative distribution function of the standard normal. -
S4:
(Outcome: Normal+nonlinear mean structure, Response: Cauchy or Logistic)
, , , and , where , , and is the cumulative distribution function of the Cauchy or Logistic distribution.
In S1 and S2, the strength of the identification can be adjusted by changing the parameter because indicates that the model is unidentifiable by Example 3.1. On the other hand, we can verify that the models in S3 and S4 are identifiable by Theorem 3.5. For example, in S4, conditions (C3) and (C4) are straightforward to check in this setting, while (C5) and (C6) follow from Example 3.6 and Proposition 3.8, respectively. The results for S3 and S4 confirm that inference succeeds even with a discrete outcome and a complex mean structure, respectively.
We generated 1,000 independent Monte Carlo samples and computed two estimators for and with two methods: fractional imputation (FI) and complete case (CC) estimators, which use only completely observed data. The estimator for is computed by the standard inverse probability weighting method with estimated response models (Riddles et al., 2016). We used correctly specified models for Scenarios S1–S3 but used nonparametric models for Scenario S4 because it is unrealistic to assume that the complicated mean structure is known. The R package ‘crs’ specialized in nonparametric spline regression on the mixture of categorical and continuous covariates (Nie and Racine, 2012) is used to estimate the respondents’ outcome model. Response models are estimated by using the method discussed in Section 2.2.
Bias, root mean squared error (RMSE), and coverage rate for 95% confidence intervals in S1–S4 are reported in Table 1. In all the Scenarios, CC estimators have a significant bias, and the coverage rates are far from 95%, while FI estimators work well when the model is surely identifiable. When is small in S1 and S2, the performance of variance estimation with FI is poor, as expected, although that of point estimates is acceptable. The results in S4 indicate that the model is identifiable even if we use a nonparametric mean structure, and the estimates are almost the same between the two response models.
Scenario | Parameter |  | Method | Bias | RMSE | CR |
---|---|---|---|---|---|---|
S1 |  | 1.0 | CC | 0.053 | 0.066 | 73.5 |
S1 |  | 1.0 | FI | 0.000 | 0.043 | 95.4 |
S1 |  | 0.5 | CC | 0.039 | 0.053 | 80.9 |
S1 |  | 0.5 | FI | -0.001 | 0.059 | 97.1 |
S1 |  | 0.1 | CC | 0.034 | 0.049 | 83.0 |
S1 |  | 0.1 | FI | 0.021 | 0.136 | 99.8 |
S1 |  | 1.0 | FI | 0.001 | 0.163 | 95.2 |
S1 |  | 0.5 | FI | 0.003 | 0.330 | 98.6 |
S1 |  | 0.1 | FI | -0.146 | 0.865 | 100 |
S2 |  | 1.0 | CC | 0.146 | 0.152 | 5.7 |
S2 |  | 1.0 | FI | -0.004 | 0.051 | 94.8 |
S2 |  | 0.5 | CC | 0.130 | 0.136 | 7.7 |
S2 |  | 0.5 | FI | -0.008 | 0.086 | 86.2 |
S2 |  | 0.1 | CC | 0.127 | 0.133 | 9.4 |
S2 |  | 0.1 | FI | -0.007 | 0.105 | 92.4 |
S2 |  | 1.0 | FI | 0.008 | 0.148 | 95.4 |
S2 |  | 0.5 | FI | 0.044 | 0.365 | 100 |
S2 |  | 0.1 | FI | 0.033 | 0.448 | 100 |
S3 |  | – | CC | 0.100 | 0.102 | 0.3 |
S3 |  | – | FI | 0.001 | 0.022 | 95.3 |
S3 |  | – | FI | -0.023 | 0.279 | 95.0 |
S4 |  | – | CC(Logistic) | 0.341 | 0.355 | 5.4 |
S4 |  | – | FI(Logistic) | 0.005 | 0.079 | 95.4 |
S4 |  | – | CC(Cauchy) | 0.296 | 0.312 | 10.7 |
S4 |  | – | FI(Cauchy) | 0.007 | 0.080 | 94.3 |
S4 |  | – | FI(Logistic) | 0.006 | 0.050 | 94.7 |
S4 |  | – | FI(Cauchy) | 0.011 | 0.063 | 93.8 |
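The bias of the complete case estimator reported above can be reproduced qualitatively with a toy simulation. The sketch below is not one of the scenarios S1–S4: it assumes a standard normal outcome without covariates, a Logistic MNAR response mechanism with assumed parameters `alpha=0.5` and `beta=1.0`, and it weights by the true response probabilities, whereas the experiments in this section use estimated response models.

```python
import numpy as np


def simulate_cc_and_ipw(n=100_000, alpha=0.5, beta=1.0, seed=0):
    """Simulate y ~ N(0, 1) with P(delta=1 | y) = 1 / (1 + exp(-(alpha + beta * y))).

    Returns the complete-case mean and an inverse-probability-weighted mean
    that uses the true response probabilities (an idealisation for illustration).
    """
    rng = np.random.default_rng(seed)
    y = rng.normal(size=n)
    prob = 1.0 / (1.0 + np.exp(-(alpha + beta * y)))
    delta = rng.random(n) < prob  # True where the outcome is observed

    cc_mean = y[delta].mean()
    weights = 1.0 / prob[delta]
    ipw_mean = np.sum(weights * y[delta]) / np.sum(weights)
    return cc_mean, ipw_mean


cc_mean, ipw_mean = simulate_cc_and_ipw()
```

Because larger outcomes are more likely to be observed under this assumed mechanism, the complete case mean overestimates the true mean of zero, while the weighted mean is close to it.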
5 Real data analysis
We analyzed a dataset of 2139 HIV-positive patients enrolled in AIDS Clinical Trials Group Study 175 (ACTG175; Hammer et al. (1996)). In this analysis, we specify 532 patients for analysis who received zidovudine (ZDV) monotherapy. Let each , , and be the CD4 cell count at weeks, at the baseline, and at weeks, be the CD8 cell count at the baseline, and be sex. The outcome was subject to missingness with a 60.34% observation rate, while all covariates were observed. To make estimation stable and easy, we standardized all the data. We expect that (sex) is a reasonable choice for an instrument variable because the information is a biological value, which affects the value of CD4, but has little effect on the response probability.
Patients with milder HIV infection tend to have higher CD4 cell counts; thus, one may conjecture that missingness of the outcome is related to a serious condition and that the missing outcomes tend to be lower CD4 cell counts than those of the respondents. We therefore considered five different MNAR response models:
where represents either the Logistic function or the distribution functions of the Cauchy or distribution with degrees of freedom . Theorem 3.5 and Proposition 3.8 ensure that all the models with these five response models are identifiable, even when the instrumental variable is discrete. From the above conjecture on missing values, the sign of is expected to be negative. We assumed that the respondent’s outcome is a normal distribution with a nonparametric mean structure and estimated by the ‘crs’ R package as considered in Scenario S4 in Section 4. The residual plots shown in Figure 1 and the computed -value signify the assumed distribution on the respondents’ outcome fit well. Table 2 reports the estimated parameters and their estimated standard errors calculated by 1,000 bootstrap samples. The results of the five response models were almost similar. This suggests that the response mechanism is robust to the choice of response models. Although we cannot determine whether it is MNAR or MAR because the estimated standard error for is large, the point estimate is negative, as we expected. This result is consistent with the result in Zhao et al. (2021).

Parameter | Model | Estimate | SE | Parameter | Model | Estimate | SE |
---|---|---|---|---|---|---|---|
Logistic | 0.464 | 0.104 | Logistic | 0.125 | 0.156 | ||
Cauchy | 0.417 | 0.260 | Cauchy | 0.108 | 0.139 | ||
0.341 | 0.081 | 0.091 | 0.113 | ||||
0.306 | 0.069 | 0.082 | 0.102 | ||||
0.295 | 0.066 | 0.080 | 0.099 | ||||
Logistic | 0.255 | 0.192 | Logistic | 0.093 | 0.107 | ||
Cauchy | 0.244 | 0.207 | Cauchy | 0.083 | 0.097 | ||
0.196 | 0.148 | 0.069 | 0.079 | ||||
0.169 | 0.126 | 0.062 | 0.070 | ||||
0.160 | 0.120 | 0.060 | 0.068 | ||||
Logistic | -0.032 | 0.314 | Logistic | 276.70 | 13.476 | ||
Cauchy | -0.030 | 0.387 | Cauchy | 276.51 | 14.107 | ||
-0.027 | 0.235 | 276.57 | 13.437 | ||||
-0.021 | 0.203 | 276.61 | 13.271 | ||||
-0.019 | 0.194 | 276.63 | 13.217 |
6 Conclusion
In this paper, we proposed a new identification condition for models based on the respondents’ outcome model and the response model. Although our method requires the specification of the two models, the models can be very general with the help of an instrument. As considered in Scenario S4 in Section 4, the mean function in the respondents’ outcome model can be nonparametric, and the response model can be any strictly monotone function, not only the Logistic model. Our condition guarantees model identifiability even when instruments are categorical, a case not covered by previous conditions. Another advantage of our method is that the identification condition is easy to verify with observed data.
However, our method has some limitations. First, the respondents’ outcome model needs to have the monotone-likelihood property required by condition (C5). For example, we cannot deal with mixture models in our framework. Second, the instruments must be specified in advance. Some methods for finding instruments have been proposed (Zhao et al., 2021), but there is still no gold-standard method.
Funding
Research by the second author was supported by MEXT Project for Seismology toward Research Innovation with Data of Earthquake (STAR-E) Grant Number JPJ010217.
References
- Beppu et al. (2021) Beppu, K., Morikawa, K., and Im, J. (2021), “Imputation with verifiable identification condition for nonignorable missing outcomes,” arXiv preprint arXiv:2204.10508.
- Cui et al. (2017) Cui, X., Guo, J., and Yang, G. (2017), “On the identifiability and estimation of generalized linear models with parametric nonignorable missing data mechanism,” Computational Statistics & Data Analysis, 107, 64–80.
- D’Haultfœuille (2010) D’Haultfœuille, X. (2010), “A new instrumental method for dealing with endogenous selection,” Journal of Econometrics, 154(1), 1–15.
- D’Haultfœuille (2011) D’Haultfœuille, X. (2011), “On the completeness condition in nonparametric instrumental problems,” Econometric Theory, 27(3), 460–471.
- Hammer et al. (1996) Hammer, S. M., Katzenstein, D. A., Hughes, M. D., Gundacker, H., Schooley, R. T., Haubrich, R. H., Henry, W. K., Lederman, M. M., Phair, J. P., Niu, M. et al. (1996), “A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter,” New England Journal of Medicine, 335(15), 1081–1090.
- Hu et al. (2022) Hu, W., Wang, R., Li, W., and Miao, W. (2022), “Paradoxes and resolutions for semiparametric fusion of individual and summary data,” arXiv preprint arXiv:2210.00200.
- Ibrahim et al. (2001) Ibrahim, J. G., Lipsitz, S. R., and Horton, N. (2001), “Using auxiliary data for parameter estimation with non-ignorably missing outcomes,” Journal of the Royal Statistical Society: Series C (Applied Statistics), 50(3), 361–373.
- Imbens and Rubin (2015) Imbens, G. W., and Rubin, D. B. (2015), Causal inference in statistics, social, and biomedical sciences, Cambridge University Press.
- Kim (2011) Kim, J. K. (2011), “Parametric fractional imputation for missing data analysis,” Biometrika, 98(1), 119–132.
- Kim and Shao (2021) Kim, J. K., and Shao, J. (2021), Statistical methods for handling incomplete data, CRC Press.
- Little and Rubin (2019) Little, R. J., and Rubin, D. B. (2019), Statistical analysis with missing data, Vol. 793, John Wiley & Sons.
- Miao et al. (2016) Miao, W., Ding, P., and Geng, Z. (2016), “Identifiability of normal and normal mixture models with nonignorable missing data,” Journal of the American Statistical Association, 111(516), 1673–1683.
- Miao et al. (2019) Miao, W., Liu, L., Tchetgen, E. T., and Geng, Z. (2019), “Identification, doubly robust estimation, and semiparametric efficiency theory of nonignorable missing data with a shadow variable,” arXiv preprint arXiv:1509.02556.
- Miao and Tchetgen (2018) Miao, W., and Tchetgen, E. T. (2018), “Identification and inference with nonignorable missing covariate data,” Statistica Sinica, 28(4), 2049.
- Miao and Tchetgen Tchetgen (2016) Miao, W., and Tchetgen Tchetgen, E. J. (2016), “On varieties of doubly robust estimators under missingness not at random with a shadow variable,” Biometrika, 103(2), 475–482.
- Morikawa and Kim (2021) Morikawa, K., and Kim, J. K. (2021), “Semiparametric optimal estimation with nonignorable nonresponse data,” The Annals of Statistics, 49(5), 2991–3014.
- Nie and Racine (2012) Nie, Z., and Racine, J. S. (2012), “The crs Package: Nonparametric Regression Splines for Continuous and Categorical Predictors,” R Journal, 4(2).
- Riddles et al. (2016) Riddles, M. K., Kim, J. K., and Im, J. (2016), “A propensity-score-adjustment method for nonignorable nonresponse,” Journal of Survey Statistics and Methodology, 4(2), 215–245.
- Robins et al. (1994) Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994), “Estimation of regression coefficients when some regressors are not always observed,” Journal of the American Statistical Association, 89(427), 846–866.
- Tang et al. (2003) Tang, G., Little, R. J., and Raghunathan, T. E. (2003), “Analysis of multivariate missing data with nonignorable nonresponse,” Biometrika, 90(4), 747–764.
- Wang et al. (2014) Wang, S., Shao, J., and Kim, J. K. (2014), “An instrumental variable approach for identification and estimation with nonignorable nonresponse,” Statistica Sinica, 24, 1097–1116.
- Yang and Kim (2020) Yang, S., and Kim, J. K. (2020), “Statistical data integration in survey sampling: A review,” Japanese Journal of Statistics and Data Science, 3, 625–650.
- Yang et al. (2019) Yang, S., Wang, L., and Ding, P. (2019), “Causal inference with confounders missing not at random,” Biometrika, 106(4), 875–888.
- Zhao and Ma (2022) Zhao, J., and Ma, Y. (2022), “A versatile estimation procedure without estimating the nonignorable missingness mechanism,” Journal of the American Statistical Association, 117(540), 1916–1930.
- Zhao et al. (2021) Zhao, P., Wang, L., and Shao, J. (2021), “Sufficient dimension reduction and instrument search for data with nonignorable nonresponse,” Bernoulli, 27, 930–945.
Appendix A Technical Proofs
We first provide a technical result to prove Theorem 3.4.
Lemma A.1.
Let , , and be any positive real numbers. Assume that and are positive real numbers satisfying
(5) |
Then, there exist such that
(6) |
and
(7) |
Proof of Lemma A.1.
By using a polar coordinate system, we transform into
where to ensure satisfy (6). It follows from (7) and double-angular formulas that we have
(8) | ||||
(9) | ||||
(10) |
where , and . Setting and equations (8) and (9) yield
Fixing reduces the above equations to the one common equation
(11) |
maintaining the condition . It remains to show that there exists satisfying (10) and (11). Solving the equation (11) with respect to , we have
(12) |
Substituting (12) into (10) leads to the following quadratic equation with respect to :
It follows from (5) that
which implies that there is at least one solution of to the equation in the open interval .
∎
Proof of Theorem 3.4.
Without loss of generality, we set the value of to a fixed vector because the following proof holds for each . Let the categorical variables and take values in and , respectively. We show that model identifiability implies the completeness condition (C2) by individually addressing three cases: (i) , (ii) , and (iii) because the “if” part has already been established by Lemma 3.2.
When , condition (C1) results in the rank of a matrix, composed of in its -th element (), being . Hence, identifiable models always satisfy the completeness condition (C2).
For cases where , we must show that the model becomes unidentifiable when the completeness condition is violated. The breach of the completeness condition indicates the existence of a non-zero vector such that for , we have
(13) |
The elements in do not all share the same sign, and multiplying this vector by any constant does not affect the above equation. Recall that the model’s unidentifiability implies that exists for some , satisfying . We now construct an unidentifiable model when the completeness condition is violated.
When , without loss of generality, we assume , , and satisfying the condition for all . Employing Lemma A.1 with , , , and , we derive:
where . Substituting , , and into shows that the model is unidentifiable.
Lastly, we consider the case of . Suppose satisfies (13). Within , we select three elements with signs as positive, positive, and negative, respectively, and define them as , , and where , and is set to be sufficiently large to ensure that
(14) |
For ease of notation, we denote . The remaining part of the proof is similar when the combination of the signs is negative, negative, and positive. With the selected , are determined to be sufficiently small to satisfy
(15) | |||
Furthermore, we define and as
(16) |
By determining the variables through these steps, it follows from (14), (15), and (16) that condition (5) with , , and is fulfilled:
Therefore, by applying Lemma A.1, we demonstrate that there exist , , and such that
The condition (13) suggests that the constructed satisfy for and, for any ,
Therefore, the model is unidentifiable.
∎
Proof of Theorem 3.5.
We consider when is continuous because when is discrete, we just need to change the integral to summation. To simplify the discussion, we consider the case where . Let be a fixed value. Because and are injective functions, it is sufficient to prove the case where and . Therefore, our goal is to prove
implies , and . Integrating both sides of the above equation with respect to yields the equality of the denominator. Thus, we have ; this implies by (C4).
Next, we consider the identification of . Taking and such that they satisfy (C5), we show that
(17) | ||||
(18) |
implies . It follows from (17) and (18) that
(19) |
where . It remains to show that (19) implies in the following two steps:
Step . We prove that the function has a single change of sign when . Assume that . The equation has only one solution satisfying because of the injectivity of the function and . This implies has a single change of sign.
Step . We prove that the equation (19) does not hold when . Without loss of generality, by Step , we consider a case where and , and is monotone increasing. Let be the upper bound of the density ratio
By a property on shown in (19), we have
where the inequality follows from the definition of . This results in the density ratio being a constant on , hence, on . This contradicts (C5); thus .
Finally, from the strict monotonicity of , it follows that the integration
is injective with respect to . Therefore, equation (17) implies that .
∎