Asymptotic properties of generalized closed-form maximum likelihood estimators
Summary
The maximum likelihood estimator (MLE) is pivotal in statistical inference, yet its application is often hindered by the absence of closed-form solutions for many models. This poses challenges in real-time computation scenarios, particularly within embedded systems technology, where numerical methods are impractical. This study introduces a generalized form of the MLE that yields closed-form estimators under certain conditions. We derive the asymptotic properties of the proposed estimator and demonstrate that our approach retains key properties such as invariance under one-to-one transformations, strong consistency, and an asymptotic normal distribution. The effectiveness of the generalized MLE is exemplified through its application to the Gamma, Nakagami, and Beta distributions, showcasing improvements over the traditional MLE. Additionally, we extend this methodology to a bivariate gamma distribution, successfully deriving closed-form estimators. This advancement presents significant implications for real-time statistical analysis across various applications.
Keywords: Closed-form estimators; maximum likelihood estimators; generalized maximum likelihood estimator; generalized estimator.
1 Introduction
Introduced by Ronald Fisher [1], the maximum likelihood method is one of the most well-known and widely used inferential procedures for estimating the unknown parameters of a given distribution. Alternative methods to the maximum likelihood estimator (MLE) have been proposed in the literature, such as those based on statistical moments [13], percentiles [14, 15], the product of spacings [7], or goodness-of-fit measures, to list a few. Although alternative inferential methods are popular nowadays, the MLEs are the most widely used due to their flexibility in incorporating additional complexity (such as random effects, covariates, and censoring, among others) and their properties: asymptotic efficiency, consistency, and invariance under one-to-one transformations. These properties are achieved when the MLEs satisfy some regularity conditions [5, 16, 20].
It is now well established from various studies that the MLEs do not return closed-form expressions for many common distributions. In these cases, numerical methods, such as Newton–Raphson or its variants, are usually employed to find the values that maximize the likelihood function. Important variants of the maximum likelihood estimator, such as profile [18], pseudo [11], conditional [2], penalized [3, 10], and marginal likelihoods [8], have been presented to eliminate nuisance parameters and decrease the computational cost. Another important procedure to obtain the MLEs is the expectation-maximization (EM) algorithm [9], which involves unobserved latent variables jointly with unknown parameters. The expectation and maximization steps also involve, in most cases, the use of numerical methods that may have a high computational cost. However, in many situations there is a need for closed-form estimators of the unknown parameters. For instance, in embedded technology, small components need to compute the estimates without resorting to maximization procedures, and in real-time applications, an immediate answer is necessary.
In this study, we present a generalized approach to the maximum likelihood method, enabling the derivation of closed-form expressions for estimating distribution parameters in numerous scenarios. Our primary objective is to establish the asymptotic normality and strong consistency of our proposed estimator. Furthermore, we demonstrate that these conditions are significantly simplified within a broad family of generalized maximum likelihood equations. The practical implications of our findings are substantial, offering efficient and rapid computational methods for obtaining estimates. Most importantly, our results facilitate the construction of confidence intervals and hypothesis tests, thus broadening their applicability in various fields.
The proposed method is illustrated with the Gamma, Beta, and Nakagami distributions and a bivariate gamma model. In these cases, the standard MLE does not have a closed-form expression, and numerical methods or approximations are necessary to find the solutions. Hence, our approach does not require iterative numerical methods, and the computational work required by our estimators is less than that required by the ML estimators. The remainder of this paper is organized as follows. Section 2 presents the new generalized maximum likelihood estimator and its properties. Section 3 considers the application to the Gamma, Nakagami, and Beta distributions and to a bivariate gamma model. Finally, Section 4 summarizes the study.
2 Generalized Maximum Likelihood Estimator
The method we propose here can be applied to obtain closed-form expressions for distributions with a given density . In order to formulate the method, let represent the sample space, let represent the space of the data , where is equipped with a measure , which can be either discrete or continuous, let be an open set containing the true parameter to be estimated, and for each let , be an open set, possibly depending on , containing a fixed parameter , representing additional parameters that will be used during the procedure to obtain the estimators.
Now, suppose , , , are independent and identically distributed (iid) random variables, which can be either discrete or continuous, with a strictly positive density function . Then, given a function defined for , and we define the generalized maximum likelihood equations for over the coordinates at to be the set of equations
(2)
as long as these partial derivatives exist and the expected values above are finite.
To see how the generalized likelihood equations generalize the maximum likelihood equations, note that, in case the equation can be differentiated under the integral sign, we obtain for all , in which case, letting , it follows that the generalized maximum likelihood equations for over the coordinates are given by the equations
which coincide with the maximum likelihood equations. This differentiation under the integral sign condition is in fact a natural condition to impose, since it is universally used in order to prove the consistency and asymptotic normality of the maximum likelihood estimator.
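For concreteness, the differentiation-under-the-integral-sign step is the familiar zero-mean score identity; writing $g(x \mid \theta, \alpha)$ for the generalized density (our notation for the symbols lost in extraction), the step reads:

```latex
% Zero-mean score identity via differentiation under the integral sign;
% g(x | theta, alpha) is our notation for the generalized density.
\int_{\mathcal{X}} g(x \mid \theta, \alpha)\, d\mu(x) = 1
\quad\Longrightarrow\quad
0 = \frac{\partial}{\partial \alpha_k} \int_{\mathcal{X}} g \, d\mu
  = \int_{\mathcal{X}} \frac{\partial \log g}{\partial \alpha_k}\, g \, d\mu
  = \mathbb{E}\!\left[\frac{\partial}{\partial \alpha_k} \log g(X \mid \theta, \alpha)\right].
```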
From now on, our goal shall be that of giving conditions guaranteeing the existence of solutions for the generalized maximum likelihood equations, as well as conditions under which an obtained solution of the generalized maximum likelihood equations is a consistent estimator for the true parameter and is asymptotically normal. In order to formulate the result, given a fixed we denote
(3)
for all , and , where . Moreover, we let and , be defined by
(4)
for and . These matrices shall play the role that the Fisher information matrix plays in the classical maximum likelihood method.
In the following, we say an estimator satisfies the modified likelihood equations (10) with probability converging to one strongly if, letting denote the subset of in which satisfies (10), we have
(5)
More generally, we say a sequence of events , , happens with probability converging to one strongly if (5) is valid.
In the following we prove a result regarding the existence and strong consistency of solutions of the modified likelihood equations (10) for an arbitrary probability density function .
Theorem 2.1.
Denote , where , , are iid with density and suppose:
- (A) and , as defined in (12), exist and is invertible.
- (B) is measurable in and exists and is continuous in , for all and , where is given in (3).
- (C) There exist measurable functions and an open set containing the true parameter such that and for all and we have
for all and .
Then, with probability converging to one, the generalized maximum likelihood equations have a solution. Specifically, there exist measurable in such that:
- I) satisfies the modified likelihood equations (10) with probability converging to one strongly.
- II) is a strongly consistent estimator for .
- III) .
Proof.
The proof is available in the Appendix. ∎
Note that if in the above result, then condition corresponds to requiring the Fisher information matrix to be invertible, since in such case .
As a corollary of the above result we have, in particular, the following theorem, which simplifies the above conditions to , when is contained in a certain family of measurable functions.
Theorem 2.2.
Denote , where , , are iid with density , let be defined as
where and are for all , is measurable and positive, is measurable in , the partial derivatives exist for all and , and suppose:
- (A) and , as defined in (12), exist and is invertible.
- (B) and are finite, for all , and .
Then, with probability converging to one as , the generalized maximum likelihood equations have a solution. Specifically, there exist measurable in such that:
- I) satisfies the modified likelihood equations (10) with probability converging to one strongly.
- II) is a strongly consistent estimator for .
- III) .
Proof.
The proof is available in the Appendix. ∎
Note that the family of measurable functions imposed above for is more general than the exponential family of distributions and, besides, is not even required to be a probability density function. Additionally, note that no restrictions are made on besides being a probability density function. Thus, we consider this result to be important since it provides an infinite number of possible estimators, due to the infinitely many possible choices for , and, besides, provides easy-to-verify conditions for the obtained estimators to be strongly consistent and asymptotically normal.
Now, in order to define the generalized likelihood equations under a change of variables, given a diffeomorphism and letting for all , and , we let the generalized maximum likelihood equations for at be defined by the set of equations
(6)
as long as these partial derivatives exist and the expected values are finite.
Proposition 2.3 (One-to-one invariance).
Suppose , where , and and are open sets, suppose can be written as
where and are diffeomorphisms, suppose that for some , with probability one on , is a solution for the generalized maximum likelihood equations for at . Then, with probability one in , is a solution to the generalized likelihood equations for at .
Proof.
Since does not depend on and does not depend on , it follows that for all and thus, letting for all , and letting for all , from the chain rule it follows that
(7)
for all and . Moreover, by hypothesis, with probability one in , satisfy
(8)
for all and . Thus, denoting it follows combining (7) and (8) that
with probability one on . That is, satisfy with probability one on the first equation in (6). Additionally since by hypothesis does not depend on the variable for given it follows that
for all and , from which it follows using the hypothesis, just as before, that
for , with probability one on , which concludes the proof. ∎
In general, the generalized maximum likelihood estimators will not necessarily be functions of sufficient statistics. Additionally, in our applications, we shall use as a generalized version of the distribution in order to obtain the generalized maximum likelihood estimators. As we shall see, due to the high number of new distributions introduced in the past decades, it is not difficult to find such functions generalizing . In the next section, we present applications of the proposed method.
3 Examples
We illustrate the proposed method by applying it to the Gamma, Nakagami-m, and Beta distributions. The examples are presented for well-known distributions, so we shall not present their backgrounds. The standard MLEs for the cited distributions are widely discussed in statistical books, which shows that no closed-form expression can be achieved using the MLE method.
The Gamma and Nakagami-m distributions are particular cases of the generalized Gamma distribution, while the Beta distribution is a special case of the generalized Beta distribution. Therefore, we will consider these generalized distributions to obtain the generalized maximum likelihood equations used to derive the closed-form estimators.
As we shall see, in all examples presented here we shall have
(9)
for all . This should be expected in these examples due to differentiation under the integral sign of the equation , since in these examples is a special case of , and is a probability distribution. In particular, in these examples the generalized maximum likelihood equations shall be given by
(10)
and moreover and shall be given by
(11)
for all and , where is as in (3). Additionally, since we shall use only well-known distributions , whose Fisher information matrix can be computed either by
(12)
where it follows that, in these examples, and are submatrices of .
Example 1: Let us consider that , , , are iid random variables (RV) following a gamma distribution with probability density function (PDF) given by:
(13)
where is the shape parameter, is the scale parameter and is the gamma function.
We can apply the generalized maximum likelihood approach for this distribution by considering the density function representing the generalized gamma distribution, where , and , given by
(14)
In order to formulate the generalized maximum likelihood equations for this distribution we first note that
(15)
which, combined with implies that
that is, (9) is satisfied. Thus, the generalized likelihood equations for over the coordinates at are given by
Following [17], as long as the equality does not hold, we have , in which case a direct computation shows that the generalized likelihood equations above have as their only solution
(16)
On the other hand, the MLE for and would be obtained by solving the non-linear system of equations
(17)
where is the digamma function.
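In contrast with (17), the closed-form route needs only simple sample statistics. Below is a minimal sketch, assuming the estimators take the form studied in [17] (sample means of $x_i$, $\ln x_i$, and $x_i \ln x_i$); the function name is ours, and (16) above should agree with this form up to algebra:

```python
import numpy as np

def gamma_closed_form(x):
    """Closed-form gamma shape/scale estimators of the type studied in [17]
    (an assumption on the exact form of (16)); no iterative maximization."""
    x = np.asarray(x, dtype=float)
    lx = np.log(x)
    # scale: mean(x * log x) - mean(x) * mean(log x)
    beta_hat = np.mean(x * lx) - np.mean(x) * np.mean(lx)
    # shape: mean(x) / scale
    alpha_hat = np.mean(x) / beta_hat
    return alpha_hat, beta_hat

# Quick simulated check: estimates should be near the true (2.0, 1.5).
rng = np.random.default_rng(0)
sample = rng.gamma(shape=2.0, scale=1.5, size=5000)
print(gamma_closed_form(sample))
```

Consistency of this form follows from the strong law of large numbers, since for a Gamma sample $E[X\ln X]-E[X]E[\ln X]$ equals the scale parameter.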
We now apply Theorem 2.2 to prove that the obtained estimators are consistent and asymptotically normal.
Proposition 3.1.
and are strongly consistent estimators for the true parameters and , and asymptotically normal with and .
Proof.
To check condition of Theorem 2.2 note that, for and , using the reparametrization of the Fisher information matrix of the GG distribution available in [12], it follows that the Fisher information matrix under our parametrization satisfies
(18)
for , where
Therefore since, as discussed earlier, and can be computed as submatrices of , we have
(19)
and thus, since , it follows that is invertible for all and with
that is, condition is verified. Additionally, after some algebraic computations, one can verify that
(20)
Item is straightforward to check from (2.2). Thus conditions and of Theorem 2.2 are valid and therefore, from Theorem 2.2, we conclude there exist measurable in satisfying items to of Theorem 2.2.
Now, since the equation has probability zero of occurring for , it follows that as given in (16) is, with probability one, the only solution of the generalized maximum likelihood equations for . This fact combined with item of Theorem 2.2 implies that . Thus the proposition follows from items and of Theorem 2.2 combined with (20). ∎
Note that the MLE of differs from the one obtained using our approach, which leads to a closed-form expression. Figure 1 presents the bias and root mean square error (RMSE) obtained from replications assuming and and . We present only the results related to , since the estimator of is the same under both approaches. It can be seen from the obtained results that both estimators yield similar (although not identical) results.
[Figure 1: Bias and RMSE of the closed-form estimator and the MLE for the gamma distribution.]
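A comparison of this kind can be reproduced along the following lines; this is a sketch under our own assumed settings (sample sizes, replication count, and true parameter values are illustrative, not the paper's), reusing the hypothetical `gamma_closed_form` above and SciPy's numerical MLE:

```python
import numpy as np
from scipy.stats import gamma as gamma_dist

def mc_bias_rmse(estimator, sampler, true_value, n, replications=1000, seed=1):
    """Monte Carlo bias and RMSE of a scalar estimator at sample size n."""
    rng = np.random.default_rng(seed)
    est = np.array([estimator(sampler(rng, n)) for _ in range(replications)])
    return est.mean() - true_value, np.sqrt(np.mean((est - true_value) ** 2))

alpha0, beta0 = 2.0, 1.5
sampler = lambda rng, n: rng.gamma(shape=alpha0, scale=beta0, size=n)

closed_form_alpha = lambda x: gamma_closed_form(x)[0]   # closed form, instant
mle_alpha = lambda x: gamma_dist.fit(x, floc=0)[0]      # numerical MLE baseline

for n in (50, 200, 1000):
    print(n, mc_bias_rmse(closed_form_alpha, sampler, alpha0, n),
             mc_bias_rmse(mle_alpha, sampler, alpha0, n))
```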
Example 2: Let , , , be iid random variables following a Nakagami-m distribution with PDF given by
for all , where and .
Once again letting be as in (14), just as in Example 1, following [19], as long as does not hold, it follows that , in which case the generalized maximum likelihood equations for over the coordinates at have as their only solution
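Since the square of a Nakagami-m variable follows a gamma law ($X^2 \sim \mathrm{Gamma}(m, \Omega/m)$), a hedged sketch of closed-form estimators of this type is simply the gamma recipe applied to $y_i = x_i^2$; the displayed solution above (lost in extraction) should agree with this up to algebra, cf. [19]:

```python
import numpy as np

def nakagami_closed_form(x):
    """Closed-form Nakagami-m estimators via the gamma reduction y = x^2
    (a sketch; compare [19]). Returns (m_hat, omega_hat)."""
    y = np.asarray(x, dtype=float) ** 2
    ly = np.log(y)
    # Same statistics as in the gamma case, computed on y = x^2.
    scale_hat = np.mean(y * ly) - np.mean(y) * np.mean(ly)
    m_hat = np.mean(y) / scale_hat   # shape of the gamma reduction
    omega_hat = np.mean(y)           # E[X^2] = Omega
    return m_hat, omega_hat
```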
The estimator has an expression similar to that of the closed-form estimators of the Gamma distribution. Once again, we note these estimators are strongly consistent and asymptotically normal:
Proposition 3.2.
and are strongly consistent estimators for the true parameters and , and asymptotically normal with and .
Proof.
The arguments and computations involved are completely analogous to those of Proposition 3.1. ∎
Here, we also compare the proposed estimators with the standard MLE. In Figure 2 we present the bias and RMSE obtained from replications assuming and and . We again present only the results related to . It can be seen from the obtained results that both estimators returned very close estimates.
[Figure 2: Bias and RMSE of the closed-form estimator and the MLE for the Nakagami-m distribution.]
Note that the approach given above can be considered for other particular cases. For instance, the Wilson–Hilferty distribution is obtained when . Hence, we can obtain closed-form estimators for the cited distribution as well. It is essential to mention that, in the above examples, we do not claim that the GG distribution is the unique distribution that can be used to obtain closed-form estimators for the Gamma and Nakagami distributions. Different choices for may lead to different closed-form estimators.
Now we apply the proposed approach to a generalized version of the beta distribution that will return closed-form estimators for both parameters.
Example 3: Let us assume that the chosen beta distribution has the PDF given by
(21)
where is the beta function, , .
We can apply the generalized maximum likelihood approach for this distribution by considering the function representing the generalized beta distribution, where , given by:
Once again, in order to formulate the generalized maximum likelihood equations for over the coordinates at , we note that
(22)
from which it follows that
that is, (9) is satisfied. Thus the generalized likelihood equations for over the coordinates at are given by
Note that, from the harmonic-arithmetic mean inequality, as long as the equality does not hold, we have and , in which case, after some algebraic manipulations, it is seen that the only solutions to the above system of linear equations are given by
(23)
(24)
In the following, we apply Theorem 2.2 to prove that these estimators are consistent and asymptotically normal.
Proposition 3.3.
and are strongly consistent estimators for the true parameters and , and asymptotically normal with and , where
Proof.
In order to apply Theorem 2.2 we note can be written as
for all , , and , with representing the restriction , where
In order to check condition of Theorem 2.2 note that for and , following the computation of the Fisher information matrix for given in [4], we have
(25)
Thus, since , it is easy to see that is invertible with
Therefore we conclude condition is satisfied, and after some algebraic computations one may find that
(26)
where is as in the proposition and is a rational function of and .
Figure 3 provides the bias and RMSE obtained from replications assuming and and . Here we considered the proposed estimator and compared it with the standard MLE, which does not have a closed-form expression.
[Figure 3: Bias and RMSE of the closed-form estimators and the MLE for the beta distribution.]
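The same Monte Carlo harness sketched in Example 1 applies here once a sampler and an implementation of (23)-(24) are plugged in; `beta_closed_form` below is a hypothetical name for such an implementation:

```python
# Hypothetical: beta_closed_form implements the closed-form estimators (23)-(24);
# mc_bias_rmse is the harness from the gamma sketch above.
sampler = lambda rng, n: rng.beta(2.0, 3.0, size=n)
print(mc_bias_rmse(lambda x: beta_closed_form(x)[0], sampler, true_value=2.0, n=200))
```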
Unlike for the Gamma and Nakagami distributions, we observed that the closed-form estimators carry an additional bias. Although they are obtained from a different distribution, they returned similar results for many parameter values. A major drawback of the estimators (23) and (24) is that the properties ensuring consistency and asymptotic normality do not hold when the values of and are smaller than .
Example 4: Let us consider that , , , are iid random variables (RV) following a bivariate gamma distribution with probability density function (PDF) given by:
(27)
where , and are positive, and .
We can apply the generalized maximum likelihood approach for this distribution by considering the density function representing the generalized gamma distribution given by
where , , are positive, , and , where represents the inequality .
In order to formulate the generalized maximum likelihood equations for this distribution at , let , , , , and denote the means of , , , , and , respectively, where and , with defined analogously for . From [21] we have
(28)
from which it follows that
that is, (9) is satisfied. Thus, the generalized likelihood equations for over the coordinates at are given by
Multiplying the first equation above by , we obtain a linear system of equations in , and , which, by Cramer's rule, has the unique solution
as long as , where
for .
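The Cramer-rule step is purely linear-algebraic; the following generic sketch solves a 3×3 system $Az = b$ by Cramer's rule (the specific coefficients of the paper's system, which depend on the sample means above, are not reproduced here):

```python
import numpy as np

def cramer_3x3(A, b):
    """Solve A z = b by Cramer's rule: z_k = det(A_k) / det(A), where A_k
    replaces the k-th column of A by b. Requires det(A) != 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("singular system: no closed-form solution")
    solution = []
    for k in range(3):
        A_k = A.copy()
        A_k[:, k] = b          # substitute the right-hand side into column k
        solution.append(np.linalg.det(A_k) / det_A)
    return np.array(solution)
```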
We now apply Theorem 2.2 to prove that the obtained estimators are strongly consistent and asymptotically normal.
Proposition 3.4.
, and are strongly consistent estimators for the true parameters and , and are asymptotically normal, as long as and
Proof.
In order to apply Theorem 2.2 we note can be rewritten as
for all , positive , , , and , where and satisfy the conditions of Theorem 2.2.
Thus it follows that
that is, condition is verified.
Item of (2.2) is straightforward to check from the relations (28). Thus conditions and of Theorem 2.2 are valid and therefore, from Theorem 2.2, we conclude there exist measurable in satisfying items to of Theorem 2.2.
Now, from the strong law of large numbers, as we have
and thus it follows from the continuous mapping theorem that
In particular, due to the alternative characterization of strong convergence, it follows that, with probability converging to one strongly, we have , in which case the modified likelihood equations have as their unique solution. This fact combined with item of Theorem 2.2 implies that . Thus the proposition follows, once again, from items and of Theorem 2.2. ∎
4 Final Remarks
We have shown that the proposed generalized version of the maximum likelihood estimators provides a valuable alternative for achieving closed-form expressions when the standard MLE approach fails. The proposed approach can also be used with discrete distributions; the results remain valid, and the obtained estimators are still strongly consistent, invariant, and asymptotically normally distributed. Due to the likelihood function's flexibility, additional complexity can be included in the distribution and the inferential procedure, such as censoring, long-term survival, covariates, and random effects.
The method introduced in this study particularly benefits from the utilization of generalized versions of the baseline distribution. This aspect not only adds significant impetus to the application of the various new distributions that have emerged over recent decades but also underscores their practical relevance. Moreover, given that the estimators derived from these generalized distributions are not unique, it prompts an insightful comparison among them. Such comparative analyses are instrumental in identifying the most effective estimator, especially when evaluated against specific performance metrics. On a different note, our findings demonstrate that the generalized form is not confined to being a distribution. This realization broadens our investigative scope beyond generalized density functions, allowing for a more expansive and inclusive exploration of potential solutions.
As shown in Examples 1 and 2, the estimators' behaviors in terms of bias and RMSE are similar to those obtained under the MLE for the Gamma and Nakagami distributions. Therefore, bias-correction approaches can also be used to remove the bias of the generalized estimators. For the Beta distribution, the comparison showed different behavior for the proposed estimators. We observed that, for specific small values of the parameters, the results might not be consistent. This example illustrates what happens in situations where, for some parameter values, the Fisher information of the generalized distribution has singularity problems. Finally, we discussed an approach to obtain closed-form estimators for a bivariate model, which provides some insights that can be used in other multivariate models.
This observation lays the groundwork for further exploration, especially in the realm of real-time statistical estimation. It underscores the need for new estimators for distributions with intricate parameter spaces, tailoring them for rapid computation. This aspect is particularly vital for integration with machine learning methodologies, such as tree-based algorithms, where swift and efficient computational techniques are essential. Our study adds a new dimension to the ongoing discourse in statistical estimation, pivoting towards solutions that are not only theoretically sound but also practically viable in dealing with complex data sets. In an era where data complexity and volume are escalating, our approach heralds a promising direction for developing more agile and adaptable statistical tools, crucial for real-time analysis and decision-making in dynamic environments.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Acknowledgements
Eduardo Ramos acknowledges financial support from São Paulo State Research Foundation (FAPESP Proc. 2019/27636-9). Francisco Rodrigues acknowledges financial support from CNPq (grant number 309266/2019-0). Francisco Louzada is supported by the Brazilian agencies CNPq (grant number 301976/2017-1) and FAPESP (grant number 2013/07375-0).
Appendix
In order to prove Theorem 2.1 we shall need the technical lemma that follows.
In the following, given we let and given a matrix we let denote the usual spectral norm defined by . Moreover, given a differentiable function , for open, we denote , and we denote by the Jacobian of at , that is, for all .
Lemma .1.
Let be open, let be , let be invertible, denote and , and suppose that:
Then there exist such that .
Proof.
The proof shall follow from a simple application of the Brouwer Fixed Point Theorem.
Letting be defined by for all we shall prove that . Indeed, from the chain rule it follows that is differentiable in with
Thus, for all we have
and thus from the mean value inequality we have
(30)
Moreover, note that
(31)
Thus, given from inequalities (30) and (31) and the triangle inequality we have
that is, for all , which proves that . Thus, since is continuous, from the Brouwer Fixed Point Theorem we conclude that has at least one fixed point in , and thus
which concludes the proof. ∎
Additionally, we shall need the following lemma regarding elementary properties of the spectral norm:
Lemma .2.
Given , the following items hold
- i) , where for .
- ii) If is invertible and , then is invertible as well.
Proof.
To prove item , applying the Cauchy–Schwarz inequality we have, for all , that
which proves that by definition of the spectral norm, and thus the result follows directly from the inequality .
To prove item , note that, under the hypothesis, letting it follows that
which implies that is invertible and thus must be invertible as well, since it is a product of invertible square matrices. ∎
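Elementary spectral-norm facts of this kind are easy to check numerically. The following sketch illustrates two standard ones, submultiplicativity and perturbation invertibility, as illustrative stand-ins for the lemma's displayed statements (which were lost in extraction):

```python
import numpy as np

rng = np.random.default_rng(42)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))

spec = lambda M: np.linalg.norm(M, 2)   # spectral norm: largest singular value

# Submultiplicativity: ||A B|| <= ||A|| ||B||.
assert spec(A @ B) <= spec(A) * spec(B) + 1e-12

# Perturbation invertibility: if A is invertible and ||A - B|| < 1/||A^{-1}||,
# then B is invertible as well (Neumann-series argument, as in the proof above).
A_inv = np.linalg.inv(A)
B = A + (0.5 / spec(A_inv)) * np.eye(3)  # ||A - B|| = 0.5/||A^{-1}|| < 1/||A^{-1}||
np.linalg.inv(B)                          # succeeds: B is invertible
```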
Using the above results we are now ready to prove Theorem 2.1.
Proof.
Existence of solutions:
Letting be as in (3), that is
for all , where and letting be defined by where
for all , , and . Note, due to the strong law of large numbers and from for all , that
and thus, from the alternative definition of strong convergence it follows that
(32)
for all . Now, letting be defined by , where
Condition (B) says that
(33)
for all and . In particular, from (33) and from the dominated convergence theorem it follows that is continuous at . Moreover, denoting for all , from (33) and the uniform strong law of large numbers it follows that
for all , and thus, once again due to the alternative definition of strong convergence we have
(34)
for all and . Now, given such that and , where , combining (32) and (34), it follows there exist and a set of probability , such that
(35)
for all , and . Combining the second inequality of (35) with item of Lemma .2 it follows that:
Now, since is continuous at , there exists an open set such that
(36)
Combining the above inequalities with the triangle inequality we conclude that
(37)
for all and . Thus, from Lemma .1, it follows that for each there exists such that
(38)
which, in particular, proves that the generalized maximum likelihood equations have at least one solution with probability converging to one as .
Construction of a measurable estimator:
We shall construct the estimator . Note that if (37) and (38) are valid for some , then they are valid for any as well. Thus, without loss of generality we can suppose . Now, given we define
On the other hand, to define for , let be the only integer for which is satisfied. Since it follows that is well defined and as . Now, note that is continuous in for all in and measurable in for all . Thus, is a Carathéodory function for all . Therefore, letting be the multivalued map defined by
(39)
since is Carathéodory and is compact, it follows from the theory of measurable maps that is a measurable map (see [6], Corollary 18.8 p. 596). Now construct a second multivalued map defined by:
From the measurability of it is clear that is measurable as well, and since is always non-empty, we can apply the measurable selection theorem (see [6], Theorem 18.7 p. 603) to obtain a measurable function satisfying
which concludes the construction of our estimator .
By the construction, it follows that satisfies at every point at which the equation has at least one solution in . Thus, satisfies at every point at which has at least one solution in , and since , from what we proved earlier it follows this happens with probability greater than or equal to .
Thus, since as , it follows that, with probability converging to one as , satisfies for all , which proves item .
Now, by construction, for all and , and since as , it follows that as for all , which, in particular, proves item .
Asymptotic normality:
From the mean value theorem, for each fixed , and there must exist a contained in the segment connecting to such that
(40)
On the other hand, letting
for all it follows by hypothesis that is continuous in and measurable in , and thus is a Carathéodory function, from which it follows, once again due to the theory of measurable maps and the measurable selection theorem, that such can be chosen to be measurable in .
Now, letting be defined by where for all , and , since by construction for all and since is contained in the segment connecting to , it follows that for all as well. Thus, once again combining the second inequality from (37) with item of Lemma .2 it follows that
(41)
Thus, in particular, from Lemma .2 it follows that is invertible for all and .
On the other hand, since by construction for all it follows from (40) that:
(42)
Since as , it follows from (41) that , and thus, from the invertibility of the matrices involved, it follows for that
(44)
as well. Additionally, from the central limit theorem we know that
(45)
which, combined with (43), (44), and Slutsky's theorem, implies that
which concludes the proof, since we already proved that . ∎
Proof.
Now, from the hypothesis we see that
From these relations and the hypothesis, it is easy to see that is measurable in and is well defined and continuous in , for all and , that is, item of Theorem 2.1 is also satisfied.
References
- Aldrich [1997] Aldrich, J. (1997). R. A. Fisher and the making of maximum likelihood 1912–1922. Statistical Science 12(3), 162–176.
- Andersen [1970] Andersen, E. B. (1970). Asymptotic properties of conditional maximum-likelihood estimators. Journal of the Royal Statistical Society: Series B (Methodological) 32(2), 283–301.
- Anderson and Blair [1982] Anderson, J. and V. Blair (1982). Penalized maximum likelihood estimation in logistic regression and discrimination. Biometrika 69(1), 123–136.
- Aryal and Nadarajah [2004] Aryal, G. and S. Nadarajah (2004). Information matrix for beta distributions. Serdica Mathematical Journal 30(4), 513p–526p.
- Bierens [2004] Bierens, H. J. (2004). Introduction to the mathematical and statistical foundations of econometrics. Cambridge University Press.
- Aliprantis and Border [2006] Aliprantis, C. D. and K. C. Border (2006). Infinite Dimensional Analysis: A Hitchhiker's Guide (3rd ed.). Springer.
- Cheng and Amin [1983] Cheng, R. and N. Amin (1983). Estimating parameters in continuous univariate distributions with a shifted origin. Journal of the Royal Statistical Society. Series B (Methodological), 394–403.
- Cox [1975] Cox, D. R. (1975). Partial likelihood. Biometrika 62(2), 269–276.
- Dempster et al. [1977] Dempster, A. P., N. M. Laird, and D. B. Rubin (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological) 39(1), 1–22.
- Firth [1993] Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika 80(1), 27–38.
- Gourieroux et al. [1984] Gourieroux, C., A. Monfort, and A. Trognon (1984). Pseudo maximum likelihood methods: Theory. Econometrica: Journal of the Econometric Society, 681–700.
- Hager and Bain [1970] Hager, H. W. and L. J. Bain (1970). Inferential procedures for the generalized gamma distribution. Journal of the American Statistical Association 65(332), 1601–1609.
- Hosking [1990] Hosking, J. R. (1990). L-moments: Analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society: Series B (Methodological) 52(1), 105–124.
- Kao [1958] Kao, J. H. (1958). Computer methods for estimating Weibull parameters in reliability studies. IRE Transactions on Reliability and Quality Control, 15–22.
- Kao [1959] Kao, J. H. (1959). A graphical estimation of mixed Weibull parameters in life-testing of electron tubes. Technometrics 1(4), 389–407.
- Lehmann and Casella [2006] Lehmann, E. L. and G. Casella (2006). Theory of point estimation. Springer Science & Business Media.
- Louzada et al. [2019] Louzada, F., P. L. Ramos, and E. Ramos (2019). A note on bias of closed-form estimators for the gamma distribution derived from likelihood equations. The American Statistician 73(2), 195–199.
- Murphy and Van der Vaart [2000] Murphy, S. A. and A. W. Van der Vaart (2000). On profile likelihood. Journal of the American Statistical Association 95(450), 449–465.
- Ramos et al. [2020] Ramos, P. L., F. Louzada, and E. Ramos (2020). Bias reduction in the closed-form maximum likelihood estimator for the Nakagami-m fading parameter. IEEE Wireless Communications Letters 9(10), 1692–1695.
- Redner [1981] Redner, R. (1981). Note on the consistency of the maximum likelihood estimate for nonidentifiable distributions. Annals of Statistics 9(1), 225–228.
- Zhao et al. [2022] Zhao, J., Y.-H. Jang, and H.-M. Kim (2022). Closed-form and bias-corrected estimators for the bivariate gamma distribution. Journal of Multivariate Analysis 191, 105009.