A Mathematical Approximation of the Left-sided Truncated Normal Distribution Using the Cadwell Approximation Model

When the life distribution of new devices follows the normal distribution, the life distribution of used devices of the same brand follows the left-sided truncated normal distribution. Although many mathematical models are available to approximate the normal distribution functions, little work is available on modeling or approximating the functions of the left-sided truncated normal distribution. This article introduces a high-accuracy mathematical model to approximate the cumulative distribution function of the left-sided truncated standard normal distribution, defined on the range from the truncation point Z_L to ∞. The introduced model is derived from the Cadwell approximation of the normal cumulative distribution function. The change of the accuracy level with the Z score is discussed in detail. The maximum deviation of the model results from the real results, over the whole range (Z_L: ∞) for any -∞ < Z_L < -2, is 0.006877.

Keywords: truncated normal distribution; normal distribution; mathematical approximation


I. INTRODUCTION
If X is a continuous random variable having a normal distribution with mean μ and standard deviation σ, the normal curve can be graphed using (1).

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}, \quad -\infty < x < \infty \tag{1}$$

The normal distribution can take any mean and any positive standard deviation. These two parameters determine the shape of the normal curve. The mean gives the location of the middle of the distribution, and the standard deviation defines how far the data are spread from the mean.
The normal distribution is used to model many sets of measurements in industry, business, and nature. For instance, the human blood sugar level, the lifetime of radio sets, and even automobile costs are commonly modeled as normally distributed. A normal distribution has many characteristic properties: 1) the mean, median, and mode are equal, 2) the distribution curve is bell shaped and symmetric about the mean, 3) the total area under the normal curve equals 1 (all probability distributions have this property), 4) the normal curve approaches zero as X goes to -∞ or ∞, and 5) around the center the curve bends downward (concave), while on the right and left sides it bends upward. The inflection points, located at μ - σ and μ + σ, are the points at which the curve changes from bending downward to bending upward. The normal distribution with μ equal to 0 and σ equal to 1 is called the standard normal distribution. The standard normal distribution function is shown in (2).

$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \quad -\infty < z < \infty \tag{2}$$
The horizontal scale of the standard normal distribution is known as the Z score. Any non-standard normal distribution can be transformed to the Z scale using the following transformation formula:

$$z = \frac{x-\mu}{\sigma}$$

There are many situations where the interest is in computing the probability that the observed value of X is less than or equal to some real number x. Therefore, the cumulative distribution function is defined as F(x) = P(X ≤ x). F(x) is defined for every real number x, as shown in (3).

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \tag{3}$$
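The transformation to the Z scale and the cumulative probability P(X ≤ x) can be sketched in a few lines of Python. This is only an illustration: the function names `z_score` and `normal_cdf` and the numerical values (mean 540, standard deviation 40, x = 620) are assumptions chosen for the example, and the cumulative distribution is evaluated through the standard library error function rather than a table.

```python
import math


def z_score(x, mu, sigma):
    """Map an observation x from N(mu, sigma^2) onto the standard normal scale."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Illustrative (assumed) values: mean 540, standard deviation 40, x = 620.
z = z_score(620, 540, 40)
p = normal_cdf(z)
print(f"z = {z:.3f}, P(X <= 620) = {p:.5f}")
```

Working on the Z scale means that a single function of z covers every choice of mean and standard deviation.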
The cumulative distribution function of the standard normal distribution is denoted by Φ(z), as addressed in (4).

$$\Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^{2}/2}\,dt \tag{4}$$

The integral in (4) has no closed-form solution and cannot be evaluated manually. Therefore, statisticians prepared tables of Φ(z). [...]

For example, if GRE scores are normally distributed and a graduate school has a minimum acceptable score, then the distribution of the admitted students' scores is a left-sided truncated normal distribution. This is because the left tail of the normal distribution (i.e., the scores lower than the minimum acceptable GRE score) is truncated. As another example, if the life, in years, of a television set is normally distributed, then the life distribution of a two-year-old used television set of the same brand is a left-sided truncated normal distribution. The tail of the distribution at P(X < 2) is truncated. Figure 1 shows an example of the left-sided truncated normal distribution.

The left-sided truncated standard normal distribution is defined on the region (z_L: ∞). Since the area under the curve of the truncated distribution must equal 1, the curve is stretched upward to compensate for the truncated area over the region (-∞: z_L). The probability density function of the left-sided truncated normal distribution is addressed in (5).

$$\varphi_T(z) = \frac{\varphi(z)}{1-\Phi(z_L)}, \quad z \ge z_L \tag{5}$$

The cumulative distribution function of the left-sided truncated normal distribution is the area under the original density function from z_L to z divided by the area under the original density function over the region (z_L: ∞). Equation (6) represents the cumulative distribution function of the left-sided truncated standard normal distribution.

$$\Phi_T(z) = \frac{\int_{z_L}^{z}\varphi(t)\,dt}{\int_{z_L}^{\infty}\varphi(t)\,dt} = \frac{\Phi(z)-\Phi(z_L)}{1-\Phi(z_L)}, \quad z \ge z_L \tag{6}$$
Equation (6) involves an integral that has no closed form and needs a numerical solution. Therefore, engineers usually use sophisticated computer programs or specialized software packages to handle this kind of equation. However, if φ(z) is approximated with a function that can be differentiated and integrated, then the truncated normal distribution can be estimated with a closed-form function.
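For reference, the exact left-sided truncated density and cumulative distribution can be evaluated numerically as sketched below. The sketch assumes the usual definitions: the truncated density is the standard normal density divided by the area to the right of the truncation point, and the truncated cumulative distribution is the area between the truncation point and z divided by the same tail area. The function names are illustrative, and the tail area is computed with `math.erfc`.

```python
import math

SQRT2 = math.sqrt(2.0)


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / SQRT2)


def upper_tail(z):
    """Area under the standard normal density to the right of z."""
    return 0.5 * math.erfc(z / SQRT2)


def truncated_pdf(z, z_l):
    """Density of the standard normal truncated on the left at z_l."""
    if z < z_l:
        return 0.0
    return normal_pdf(z) / upper_tail(z_l)


def truncated_cdf(z, z_l):
    """CDF of the standard normal truncated on the left at z_l."""
    if z <= z_l:
        return 0.0
    return (normal_cdf(z) - normal_cdf(z_l)) / upper_tail(z_l)


# Truncation at z_L = -2, evaluated at z = 2.
print(f"{truncated_cdf(2.0, -2.0):.5f}")
```

A closed-form approximation of Φ(z) can be substituted for `normal_cdf` in the same ratio, which is the idea the approximation model builds on.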

II. REVIEW OF PRIOR RESEARCH
Much research on the mathematical approximation of normal distribution functions is available in the literature. At least 15 highly accurate models have been introduced. Cadwell (1949) [1][2][3] introduced an accurate approximation model, as shown in (7). The following 6 models are mentioned in [1,2].

Many authors have discussed various aspects of the truncated normal distribution in detail. For example, Ke et al. [7] used the double-sided truncated normal distribution to study the action reliability of the ammunition swing device of a large caliber gun. Pender [8] provided an exact expression for the moments of the truncated normal distribution using Stein's lemma. His moment expressions provide insight into the steady-state skewness and kurtosis dynamics of single-server queues with impatient customers. Sun et al. [9] worked on finite fault modelling for the Wenchuan earthquake using a hybrid slip model with truncated normally distributed source parameters. Beucher and Renard [10] described a stochastic model called truncated Gaussian simulations (TGS), which distributes a collection of facies or lithotypes over an area of interest. This method is based on facies proportions, spatial distribution, and relationships, which can be easily tuned to produce numerous different textures. Mukerjee and Ong [11] worked on variance and covariance inequalities for the truncated joint normal distribution via monotone likelihood ratio and log-concavity. Horrace [12] developed probability statements and ranking and selection rules for independent truncated normal populations. Sharples and Pezzey [13] presented results pertaining to truncated multivariate normal distributions, some of which already appear in the mathematical literature. They aimed to make these types of results more accessible to the environmental science community, and to this end they included a conceptually simple alternative derivation of an important result. Furthermore, they illustrated how the theory of truncated multivariate normal distributions is employed in the environmental sciences.
We can conclude from the above discussion that there is a lot of work on the mathematical aspects and applications of the normal cumulative distribution function. Even so, mathematical approximations of the truncated normal distribution are far less common. In this paper, a mathematical approximation model of the left-sided truncated normal cumulative distribution function is introduced.

Because (7) is separated into two parts according to the value of z, the model is also divided into two parts. We formulate the model for the left part (z < 0) first. The left part of (7) is recalled, extracted and simplified, in (8).
The left-sided truncated standard normal distribution can be obtained through integration, as explained in (6). Recall that φ(z) = dΦ(z)/dz. Equation (9) can be used to obtain the first part of our model, for z < 0. Solving this equation gives the model for z < 0, written in its simplest form.

If z > 0, approximating Φ_T(z) requires extracting the positive-z part of (7) and substituting it into (6), exactly as was done in (9). The result is addressed in (11).

The interest of this question is to find the 6-year reliability of a 6-year-old used television of the same brand. In other words, what is the chance that a 6-year-old television will survive for another 6 years?
Solution: Reliability is defined as the probability that a device, part, or system will perform its function for a given period of time when operated under stated conditions [14]. The requirement can be written in statistical notation as P(X > 6+6). Since this distribution is not standard, the transformation formula z = (x-µ)/σ is used to find z_L and z, which correspond to x = 6 and x = 12, respectively. The probability P(X > 12) corresponds to P(Z > -1.249), or 1 - Φ_T(-1.249). By using (11), the result is 0.905713. The model result is very close to the actual result, which is 0.904229. The deviation of the model result from the actual result (i.e., the error) is only 0.00148.

Example 2: Assume that the scores of the GRE analytical test, out of 800, are normally distributed with a mean of 540 points and a standard deviation of 40 points, and assume that the minimum acceptable score for some graduate school is 460. If an admitted student is selected randomly, what is the probability that his score is below 620?

Solution: Since the probability density at any X greater than 800 (more than 6.5 standard deviations above the mean) is very close to zero, the distribution is not treated as truncated from the right side at 800. The distribution of the admitted students' scores is a left-sided truncated normal distribution, truncated at the acceptable level (i.e., 460). Using the transformation formula z = (x-µ)/σ, z_L = -2 corresponds to x_L = 460, and z = 2 corresponds to x = 620. In statistical notation, the requirement is P(X < 620), which corresponds to P(Z < 2), or Φ_T(2). By using (11), the result is 0.9826. The model result is very close to the actual result for this example. The actual result is 0.97672, and the deviation of the model result from the actual result is only 0.00588.
It is noticeable from both examples that the error level is very low. Indeed, this level of error is negligible for most applications in reliability engineering and in other fields of probability.
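The calculation in Example 2 can be sketched in Python as follows. The equations of the model are not reproduced in this text, so the sketch rests on an assumption: the Cadwell approximation is taken in the commonly quoted form Φ(z) ≈ 0.5[1 ± √(1 − exp(−2z²/π − 2(π−3)z⁴/(3π²)))], with the plus sign for z ≥ 0 and the minus sign for z < 0, and it is substituted into the truncated ratio (Φ(z) − Φ(z_L)) / (1 − Φ(z_L)). All function names are illustrative. Under this assumption the sketch gives values matching those reported for Example 2.

```python
import math

# Coefficient of z^4 in the assumed Cadwell exponent: 2(pi - 3) / (3 pi^2).
_C4 = 2.0 * (math.pi - 3.0) / (3.0 * math.pi ** 2)


def cadwell_g(z):
    """sqrt(1 - exp(-2 z^2 / pi - 2 (pi - 3) z^4 / (3 pi^2))); an even function of z."""
    z2 = z * z
    # -expm1(x) equals 1 - exp(x) and stays accurate when x is close to zero.
    return math.sqrt(-math.expm1(-2.0 * z2 / math.pi - _C4 * z2 * z2))


def cadwell_cdf(z):
    """Assumed Cadwell approximation of the standard normal CDF."""
    g = cadwell_g(z)
    return 0.5 * (1.0 + g) if z >= 0 else 0.5 * (1.0 - g)


def cadwell_truncated_cdf(z, z_l):
    """Approximate left-sided truncated CDF, using cadwell_cdf in the truncated ratio."""
    if z <= z_l:
        return 0.0
    base = cadwell_cdf(z_l)
    return (cadwell_cdf(z) - base) / (1.0 - base)


def exact_truncated_cdf(z, z_l):
    """Reference value computed with the error function."""
    if z <= z_l:
        return 0.0
    cdf = lambda t: 0.5 * math.erfc(-t / math.sqrt(2.0))
    return (cdf(z) - cdf(z_l)) / (1.0 - cdf(z_l))


# Example 2: mean 540, standard deviation 40, truncation at 460, query at 620.
z_l = (460 - 540) / 40   # -2.0
z = (620 - 540) / 40     #  2.0
model = cadwell_truncated_cdf(z, z_l)
exact = exact_truncated_cdf(z, z_l)
print(f"model = {model:.4f}, exact = {exact:.5f}, error = {model - exact:.5f}")
```

For a negative truncation point, the ratio simplifies to (g(z_L) − g(z)) / (1 + g(z_L)) when z < 0 and to (g(z_L) + g(z)) / (1 + g(z_L)) when z ≥ 0, where g is `cadwell_g`, so the model needs only one exponential and one square root per evaluated point.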

V. ACCURACY ANALYSIS
In this section, the accuracy of the model in approximating the left-sided truncated normal distribution is presented. The focus is on how the deviation of the model results from the real results (i.e., the error) changes with Z and Z_L. To make sure that the introduced model performs well at all values of Z and Z_L, we consider the maximum error over the whole range of definition (Z_L: ∞) for any Z_L in (-∞: -1.5). Figure 2 presents the error versus Z at Z_L = -4, -3.5, -3, -2.5, -2, and -1.5. We can clearly note that there are two peaks (i.e., points of maximum deviation or error) for every curve. For all curves, the first peak is located between Z = -1.6 and Z = -1.7, and the second peak is located between Z = 1.6 and Z = 1.7. The maximum absolute error for each curve is about 0.006452 for Z_L = -4, 0.006456 for Z_L = -3.5, 0.0065 for Z_L = -3, 0.00663 for Z_L = -2.5, 0.006887 for Z_L = -2, and 0.0071251 for Z_L = -1.5. Usually, the literature does not focus on any truncation point higher than Z_L = -2, as it is rarely needed in most applications. In this study, we took Z_L = -1.5 as the final truncation point.
In Figure 3, the error versus Z is plotted for the original approximation of the standard normal distribution on which our model is based (i.e., the Cadwell approximation). We can see that the maximum error is 0.006466, at Z = -1.655 and at Z = 1.655. The accuracy of the Cadwell model is close to the accuracy of the truncated Cadwell model at any considered Z_L. The current model is more accurate than the logistic function-based truncated normal distribution model introduced in [1]. The current model leads to a maximum absolute error of about 0.0071, while the maximum absolute error of the logistic function-based model is close to 0.02 over the range -∞ < Z_L < -1.5. Furthermore, the current model is easy to implement using a simple calculator.

Figure 4 compares the model results and the real results of the left-sided truncated normal cumulative distribution function for Z_L = -2. We can notice that they are very close to each other. The two curves cannot be distinguished at the normal paper scale.
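An error scan of this kind can be sketched with a simple grid search. As before, this is a sketch under assumptions rather than the procedure used to produce the figures: the Cadwell approximation is taken in the commonly quoted form with exponent −2z²/π − 2(π−3)z⁴/(3π²), the reference values come from the error function, and the grid step of 0.001 and the upper limit Z = 5 are arbitrary choices. All function names are illustrative.

```python
import math

_C4 = 2.0 * (math.pi - 3.0) / (3.0 * math.pi ** 2)


def normal_cdf(z):
    """Reference standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def cadwell_cdf(z):
    """Assumed Cadwell approximation of the standard normal CDF."""
    z2 = z * z
    g = math.sqrt(-math.expm1(-2.0 * z2 / math.pi - _C4 * z2 * z2))
    return 0.5 * (1.0 + g) if z >= 0 else 0.5 * (1.0 - g)


def left_truncate(cdf, z_l):
    """Turn a CDF into the CDF of its left-sided truncation at z_l."""
    base = cdf(z_l)
    return lambda z: (cdf(z) - base) / (1.0 - base) if z > z_l else 0.0


def max_abs_error(model, exact, lo, hi, step=0.001):
    """Grid search for the largest |model(z) - exact(z)| on [lo, hi]."""
    n = int(round((hi - lo) / step))
    worst_err, worst_z = 0.0, lo
    for i in range(n + 1):
        z = lo + i * step
        err = abs(model(z) - exact(z))
        if err > worst_err:
            worst_err, worst_z = err, z
    return worst_err, worst_z


# Untruncated approximation over [-5, 5].
err_plain, z_plain = max_abs_error(cadwell_cdf, normal_cdf, -5.0, 5.0)

# Truncated model for Z_L = -2 over [-2, 5].
err_trunc, z_trunc = max_abs_error(
    left_truncate(cadwell_cdf, -2.0), left_truncate(normal_cdf, -2.0), -2.0, 5.0
)

print(f"untruncated: max |error| = {err_plain:.6f} at z = {z_plain:.3f}")
print(f"Z_L = -2:    max |error| = {err_trunc:.6f} at z = {z_trunc:.3f}")
```

Repeating the second call for other truncation points gives one maximum per error curve, which is the quantity compared across the values of Z_L.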

VI. CONCLUSIONS
In this article, an approximation of the left-sided truncated normal cumulative distribution function is introduced. The model can be used by reliability and quality engineers to avoid sophisticated computer programs or software when such distributions arise in their work. The model was built on the Cadwell approximation of the normal distribution. This study is distinguished by providing a model with higher accuracy than similar studies in the literature. It provides a maximum error of 0.006877 over the range (-2: ∞), while the logistic function-based approximation [1], for example, provides a maximum error of 0.017 over the same range of Z.