4.2.3 Normal (Gaussian) Distribution

The normal distribution is by far the most important probability distribution. One of the main reasons for that is the Central Limit Theorem (CLT), which we will discuss later in the book. To give you an idea, the CLT states that if you add a large number of random variables, the distribution of the sum will be approximately normal under certain conditions. The importance of this result comes from the fact that many random variables in real life can be expressed as the sum of a large number of random variables and, by the CLT, we can argue that the distribution of the sum should be approximately normal. Here, we will introduce normal random variables.

We first define the standard normal random variable. We will then see that we can obtain other normal random variables by scaling and shifting a standard normal random variable.

A continuous random variable $Z$ is said to be a standard normal (standard Gaussian) random variable, shown as $Z \sim N(0,1)$, if its PDF is given by $$f_Z(z) = \frac{1}{\sqrt{2 \pi}} \exp\left\{-\frac{z^2}{2}\right\}, \hspace{20pt} \textrm{for all } z \in \mathbb{R}.$$

The factor $\frac{1}{\sqrt{2 \pi}}$ is there to make sure that the area under the PDF is equal to one. We will verify that this holds in the solved problems section. Figure 4.6 shows the PDF of the standard normal random variable.
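The normalization can also be checked numerically. The following is a minimal sketch, not a proof: the helper names `std_normal_pdf` and `trapezoid_area` are made up for this illustration, and the interval $[-10, 10]$ is an assumed stand-in for the whole real line, since the tails beyond it are negligible.

```python
import math


def std_normal_pdf(z: float) -> float:
    """PDF of the standard normal distribution evaluated at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)


def trapezoid_area(f, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Assumption: [-10, 10] is wide enough that the neglected tails do not matter.
area = trapezoid_area(std_normal_pdf, -10.0, 10.0)
print(f"f_Z(0) = {std_normal_pdf(0.0):.4f}")
print(f"area under the PDF on [-10, 10] = {area:.6f}")
```

The printed area should be very close to one, which is what the factor $\frac{1}{\sqrt{2 \pi}}$ is designed to guarantee.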

Let us find the mean and variance of the standard normal distribution. To do that, we will use a simple, useful fact. Consider a function $g(u):\mathbb{R}\rightarrow\mathbb{R}$. If $g(u)$ is an odd function, i.e., $g(-u)=-g(u)$, and $|\int_{0}^{\infty} g(u) du| < \infty$, then $$\int_{-\infty}^{\infty} g(u) du=0.$$ For our purpose, let $$g(u)= u^{2k+1}\exp\left\{-\frac{u^2}{2}\right\},$$ where $k=0,1,2,\dots$. Then $g(u)$ is an odd function. Also, $|\int_{0}^{\infty} g(u) du| < \infty$. One way to see this is to note that, for large $u$, $g(u)$ decays faster than the function $\exp\left\{-u\right\}$, and since $|\int_{0}^{\infty} \exp\left\{-u\right\} du| < \infty$, we conclude that $|\int_{0}^{\infty} g(u) du| < \infty$.

Now, let $Z$ be a standard normal random variable. Then, we have $$EZ^{2k+1} = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} u^{2k+1}\exp\left\{-\frac{u^2}{2}\right\} du=0,$$ for all $k \in \{0,1,2,\dots\}$. Thus, we have shown that for a standard normal random variable $Z$, we have $$EZ=EZ^3=EZ^5=\cdots=0.$$ In particular, the standard normal distribution has zero mean. This is not surprising, as we can see from Figure 4.6 that the PDF is symmetric around the origin, so we expect that $EZ=0$. Next, let's find $EZ^2$.

$$\begin{aligned}
EZ^2 &= \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} u^2\exp\left\{-\frac{u^2}{2}\right\} du \\
&= \frac{1}{\sqrt{2 \pi}}\bigg[ -u\exp\left\{-\frac{u^2}{2}\right\}\bigg]_{-\infty}^{\infty} + \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} \exp\left\{-\frac{u^2}{2}\right\} du \hspace{20pt} (\textrm{integration by parts}) \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi}} \exp\left\{-\frac{u^2}{2}\right\} du \\
&= 1.
\end{aligned}$$

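The boundary term vanishes because $u\exp\left\{-\frac{u^2}{2}\right\} \rightarrow 0$ as $u \rightarrow \pm\infty$, and the last equality holds because we are integrating the standard normal PDF from $-\infty$ to $\infty$, so $EZ^2=1$. Since $EZ=0$, we have $\textrm{Var}(Z)=EZ^2-(EZ)^2=EZ^2$. Thus, we conclude that for a standard normal random variable $Z$, we have $$\textrm{Var}(Z)=1.$$ So far we have shown the following: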

If $Z \sim N(0,1)$, then $EZ=0$ and $\textrm{Var}(Z)=1$.
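As a sanity check on these moments, one can integrate $u^n f_Z(u)$ numerically. This is only a rough sketch: the function name `moment`, the truncation interval $[-12, 12]$, and the step count are assumptions chosen for illustration.

```python
import math


def std_normal_pdf(u: float) -> float:
    """PDF of the standard normal distribution evaluated at u."""
    return math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)


def moment(n: int, limit: float = 12.0, steps: int = 48_000) -> float:
    """Approximate E[Z^n] by trapezoidal integration over [-limit, limit]."""
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps + 1):
        u = -limit + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # endpoints count half
        total += weight * (u ** n) * std_normal_pdf(u)
    return total * h


for n in (1, 2, 3, 5):
    print(f"E[Z^{n}] is approximately {moment(n):.6f}")
```

The odd moments should come out as essentially zero and the second moment as essentially one, in line with $EZ=0$ and $\textrm{Var}(Z)=1$.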

CDF of the standard normal

To find the CDF of the standard normal distribution, we need to integrate the PDF. In particular, we have $$F_Z(z)=\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{z}\exp\left\{-\frac{u^2}{2}\right\} du.$$ This integral does not have a closed form solution. Nevertheless, because of the importance of the normal distribution, the values of $F_Z(z)$ have been tabulated and many calculators and software packages have this function. We usually denote the standard normal CDF by $\Phi$.

The CDF of the standard normal distribution is denoted by the $\Phi$ function: $$\Phi(x)=P(Z \leq x)= \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{x}\exp\left\{-\frac{u^2}{2}\right\} du.$$

As we will see in a moment, the CDF of any normal random variable can be written in terms of the $\Phi$ function, so the $\Phi$ function is widely used in probability. Figure 4.7 shows the $\Phi$ function.

Here are some properties of the $\Phi$ function that can be shown from its definition.

  1. $\lim \limits_{x\rightarrow \infty} \Phi(x)=1, \hspace{5pt} \lim \limits_{x\rightarrow -\infty} \Phi(x)=0$;
  2. $\Phi(0)=\frac{1}{2}$;
  3. $\Phi(-x)=1-\Phi(x)$, for all $x \in \mathbb{R}$.
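These properties are easy to spot-check numerically. The sketch below relies on the standard identity $\Phi(x)=\frac{1}{2}\left[1+\textrm{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, which is not derived in this section; the function name `phi` and the test points are illustrative choices.

```python
import math


def phi(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Property 1: the limits at plus and minus infinity (approximated by +/- 10).
print(phi(10.0), phi(-10.0))

# Property 2: Phi(0) = 1/2.
print(phi(0.0))

# Property 3: Phi(-x) = 1 - Phi(x), checked at a few arbitrary points.
for x in (0.3, 1.0, 2.5):
    print(x, phi(-x), 1.0 - phi(x))
```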

Also, since the $\Phi$ function does not have a closed form, it is sometimes useful to use upper or lower bounds. In particular, we can state the following bounds (see Problem 7 in the Solved Problems section). For all $x > 0$, $$\hspace{50pt} \frac{1}{\sqrt{2\pi}} \frac{x}{x^2+1} \exp\left\{-\frac{x^2}{2}\right\} \leq 1-\Phi(x) \leq \frac{1}{\sqrt{2\pi}} \frac{1}{x} \exp\left\{-\frac{x^2}{2}\right\} \hspace{50pt} (4.7)$$
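To get a feel for how tight the bounds in (4.7) are, the following sketch compares them with the tail probability $1-\Phi(x)$ at a few positive values of $x$. It assumes the identity $1-\Phi(x)=\frac{1}{2}\,\textrm{erfc}\left(\frac{x}{\sqrt{2}}\right)$; the names `upper_tail` and `tail_bounds` are hypothetical helpers for this example.

```python
import math


def upper_tail(x: float) -> float:
    """1 - Phi(x), computed with erfc to stay accurate for large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bounds(x: float) -> tuple[float, float]:
    """Lower and upper bounds on 1 - Phi(x) from inequality (4.7), for x > 0."""
    common = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return common * x / (x * x + 1.0), common / x


for x in (0.5, 1.0, 2.0, 4.0):
    lower, upper = tail_bounds(x)
    print(f"x={x}: {lower:.3e} <= {upper_tail(x):.3e} <= {upper:.3e}")
```

Both bounds become relatively tighter as $x$ grows, which is why they are mostly used for tail estimates.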

As we mentioned earlier, because of the importance of the normal distribution, the values of the $\Phi$ function have been tabulated and many calculators and software packages have this function. For example, you can use the $\texttt{normcdf}$ command in MATLAB to compute $\Phi(x)$ for a given number $x$. More specifically, $\texttt{normcdf}(x)$ returns $\Phi(x)$. Also, the function $\texttt{norminv}$ returns the inverse function $\Phi^{-1}$. That is, if you run $x=\texttt{norminv}(y)$ for some $0 < y < 1$, then $x$ will be the real number for which $\Phi(x) = y$.
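Other environments offer the same pair of functions. As one possible illustration (a sketch, assuming Python 3.8 or later), the standard library class `statistics.NormalDist` provides `cdf` and `inv_cdf` methods that play the roles of $\Phi$ and $\Phi^{-1}$; the particular input values below are arbitrary.

```python
from statistics import NormalDist

# NormalDist() with no arguments is the standard normal: mu = 0, sigma = 1.
Z = NormalDist()

p = Z.cdf(1.96)       # Phi(1.96)
x = Z.inv_cdf(0.975)  # the x for which Phi(x) = 0.975

print(f"Phi(1.96)      = {p:.4f}")
print(f"Phi^-1(0.975)  = {x:.4f}")

# Applying cdf and then inv_cdf should return the starting point.
print(Z.inv_cdf(Z.cdf(1.5)))
```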

Normal random variables

Now that we have seen the standard normal random variable, we can obtain any normal random variable by shifting and scaling a standard normal random variable. In particular, define $$X=\sigma Z+\mu, \hspace{20pt} \textrm{where }\sigma > 0.$$ Then $$EX=\sigma EZ+\mu=\mu,$$ $$\textrm{Var}(X)=\sigma^2 \textrm{Var}(Z)=\sigma^2.$$ We say that $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$. We write $X \sim N(\mu, \sigma^2)$.

If $Z$ is a standard normal random variable and $X=\sigma Z+\mu$, then $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$, i.e., $$X \sim N(\mu, \sigma^2).$$
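The shift-and-scale construction can be illustrated by simulation. The sketch below draws standard normal samples, applies $X=\sigma Z+\mu$, and compares the sample mean and variance with $\mu$ and $\sigma^2$. The values $\mu=3$, $\sigma=2$, the sample size, and the seed are arbitrary assumptions, and `sample_normal` is a made-up helper name.

```python
import random


def sample_normal(mu: float, sigma: float, n: int, seed: int = 0) -> list[float]:
    """Draw n samples of X = sigma * Z + mu, where Z is standard normal."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sigma * rng.gauss(0.0, 1.0) + mu for _ in range(n)]


samples = sample_normal(mu=3.0, sigma=2.0, n=100_000)

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)

print(f"sample mean     = {sample_mean:.3f}  (mu = 3)")
print(f"sample variance = {sample_var:.3f}  (sigma^2 = 4)")
```

The sample statistics will not match the parameters exactly, but they should be close for a sample of this size.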

Conversely, if $X \sim N(\mu, \sigma^2)$, the random variable defined by $Z=\frac{X-\mu}{\sigma}$ is a standard normal random variable, i.e., $Z \sim N(0,1)$. To find the CDF of $X \sim N(\mu, \sigma^2)$, we can write

$$\begin{aligned}
F_X(x) &= P(X \leq x) \\
&= P( \sigma Z+\mu \leq x) \hspace{20pt} \big(\textrm{where }Z \sim N(0,1)\big) \\
&= P\left(Z \leq \frac{x-\mu}{\sigma}\right) \\
&= \Phi\left(\frac{x-\mu}{\sigma}\right).
\end{aligned}$$

To find the PDF, we can take the derivative of $F_X$,

$$\begin{aligned}
f_X(x) &= \frac{d}{dx} F_X(x) \\
&= \frac{d}{dx} \Phi\left(\frac{x-\mu}{\sigma}\right) \\
&= \frac{1}{\sigma} \Phi'\left(\frac{x-\mu}{\sigma}\right) \hspace{20pt} \textrm{(chain rule for derivative)} \\
&= \frac{1}{\sigma} f_Z\left(\frac{x-\mu}{\sigma}\right) \\
&= \frac{1}{\sigma\sqrt{2 \pi} } \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}.
\end{aligned}$$

If $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$, i.e., $X \sim N(\mu, \sigma^2)$, then $$f_X(x)=\frac{1}{ \sigma\sqrt{2 \pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\},$$ $$F_X(x)=P(X \leq x)=\Phi\left(\frac{x-\mu}{\sigma}\right),$$ $$P(a < X \leq b)= \Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right).$$
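The three formulas translate directly into code. The sketch below also checks, by a central finite difference, that the derivative of $F_X$ agrees with $f_X$ at one point. The function names, the parameters $\mu=1$, $\sigma=3$, the evaluation point, and the step size are illustrative assumptions; $\Phi$ is computed through the error function.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """f_X(x) for X ~ N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-z * z / 2.0) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F_X(x) = Phi((x - mu) / sigma)."""
    return phi((x - mu) / sigma)


def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X <= b) as a difference of two CDF values."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


mu, sigma = 1.0, 3.0  # illustrative parameters
x, h = 2.5, 1e-5      # evaluation point and finite-difference step

# Central difference approximation of d/dx F_X(x); it should match f_X(x).
slope = (normal_cdf(x + h, mu, sigma) - normal_cdf(x - h, mu, sigma)) / (2.0 * h)
print(slope, normal_pdf(x, mu, sigma))

# Probability of landing within one standard deviation of the mean.
within_one_sigma = prob_between(mu - sigma, mu + sigma, mu, sigma)
print(within_one_sigma)
```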

Figure 4.8 shows the PDF of the normal distribution for several values of $\mu$ and $\sigma$.

Let $X \sim N(-5,4)$.

  1. Find $P(X < 0)$.
  2. Find $P(-7 < X < -3)$.
  3. Find $P(X > -3 | X >-5)$.
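  • Solution
    • $X$ is a normal random variable with $\mu=-5$ and $\sigma=\sqrt{4}=2$. Thus, we have:

      1. Find $P(X < 0)$:
        $$\begin{aligned}
        P(X < 0) &= F_X(0) \\
        &= \Phi\left(\frac{0-(-5)}{2}\right) \\
        &= \Phi(2.5) \approx 0.99
        \end{aligned}$$

      2. Find $P(-7 < X < -3)$:
        $$\begin{aligned}
        P(-7 < X < -3) &= F_X(-3)-F_X(-7) \\
        &= \Phi\left(\frac{(-3)-(-5)}{2}\right)-\Phi\left(\frac{(-7)-(-5)}{2}\right) \\
        &= \Phi(1)-\Phi(-1) \\
        &= 2\Phi(1)-1 \hspace{20pt} \big(\textrm{since }\Phi(-x)=1-\Phi(x)\big) \\
        &\approx 0.68
        \end{aligned}$$

      3. Find $P(X > -3 | X > -5)$:
        $$\begin{aligned}
        P(X > -3 | X > -5) &= \frac{P(X > -3,X > -5)}{P(X > -5)} \\
        &= \frac{P(X > -3)}{P(X > -5)} \\
        &= \frac{1-\Phi(1)}{1-\Phi(0)} \\
        &\approx \frac{0.1587}{0.5} \approx 0.32
        \end{aligned}$$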
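The three answers can be double-checked in software. The following sketch assumes Python 3.8 or later and uses the standard library class `statistics.NormalDist`, which takes the standard deviation (here $\sigma=2$), not the variance, as its second parameter.

```python
from statistics import NormalDist

# X ~ N(-5, 4): mean -5, variance 4, so standard deviation 2.
X = NormalDist(mu=-5, sigma=2)

p1 = X.cdf(0)                             # P(X < 0)
p2 = X.cdf(-3) - X.cdf(-7)                # P(-7 < X < -3)
p3 = (1 - X.cdf(-3)) / (1 - X.cdf(-5))    # P(X > -3 | X > -5)

print(f"P(X < 0)            = {p1:.4f}")
print(f"P(-7 < X < -3)      = {p2:.4f}")
print(f"P(X > -3 | X > -5)  = {p3:.4f}")
```

Because $X$ is continuous, strict and non-strict inequalities give the same probabilities, so the CDF can be used directly in all three parts.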

An important and useful property of the normal distribution is that a linear transformation of a normal random variable is itself a normal random variable. In particular, we have the following theorem:

If $X \sim N(\mu_X, \sigma_X^2)$, and $Y=aX+b$, where $a,b \in \mathbb{R}$, then $Y \sim N(\mu_Y, \sigma_Y^2)$ where $$\mu_Y=a\mu_X+b, \hspace{10pt} \sigma^2_Y=a^2 \sigma_X^2.$$


To see this, we can write $$X =\sigma_X Z+ \mu_X \hspace{20pt} \textrm{where } Z \sim N(0,1).$$ Thus,

$$\begin{aligned}
Y &= aX+b \\
&= a(\sigma_X Z+ \mu_X)+b \\
&= (a \sigma_X) Z+ (a\mu_X+b).
\end{aligned}$$

Therefore, $$Y \sim N(a\mu_X+b, a^2 \sigma^2_X).$$
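The theorem can be illustrated with the standard library class `statistics.NormalDist` (Python 3.8 or later), which supports multiplying by and adding a constant. This is a sketch with arbitrarily chosen values $a=3$, $b=1$, and $X \sim N(-5, 4)$, not part of the proof.

```python
from statistics import NormalDist

X = NormalDist(mu=-5, sigma=2)  # X ~ N(-5, 4)
a, b = 3, 1                     # illustrative constants

# Scaling and shifting a NormalDist yields another NormalDist.
Y = a * X + b

print(Y.mean, Y.variance)  # compare with a*mu_X + b and a^2 * sigma_X^2

# Cross-check through the CDF: for a > 0, P(Y <= y) = P(X <= (y - b) / a).
y = -10.0
print(Y.cdf(y), X.cdf((y - b) / a))
```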




