Probability Distribution 2

Random Variable

A random variable maps each outcome in the sample space to a real number.

Example: How many heads when we toss 3 coins?

$X$ could be 0, 1, 2 or 3 at random, and each value might have a different probability. $X$ = “The number of Heads” is the Random Variable.

In this case, there could be 0 Heads (if all the coins land Tails up), 1 Head, 2 Heads or 3 Heads. So the Sample Space = $\{0, 1, 2, 3\}$. But this time the outcomes are NOT all equally likely. The three coins can land in eight possible ways:

HHH, HHT, HTH, THH, HTT, THT, TTH, TTT

Looking at these outcomes we see just 1 case of Three Heads, but 3 cases of Two Heads, 3 cases of One Head, and 1 case of Zero Heads. So:

$P(X = 3) = 1/8$, $P(X = 2) = 3/8$, $P(X = 1) = 3/8$, $P(X = 0) = 1/8$

This assignment of a probability to each possible value of the random variable is the probability distribution.
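The counting argument above can be checked by brute force. The following is a minimal sketch, assuming fair coins so that all eight outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely ways three coins can land.
outcomes = list(product("HT", repeat=3))

# The random variable maps each outcome to its number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = k) = (number of outcomes with k heads) / (total number of outcomes).
distribution = {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}

for k, p in distribution.items():
    print(f"P(X = {k}) = {p}")
```

Exact fractions are used instead of floats so the probabilities can be compared with 1/8 and 3/8 directly.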

Probability Distribution vs Frequency Distribution

Figure 1: Probability Distribution vs Frequency Distribution

A frequency distribution comes from actually doing the experiment $n$ times. As $n \to \infty$, its shape comes closer and closer to the probability distribution.
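A small simulation can illustrate this convergence for the three-coin example. This is a sketch under stated assumptions: fair coins, a fixed random seed chosen arbitrarily for repeatability, and a hypothetical helper named `relative_frequencies`.

```python
import random
from collections import Counter


def relative_frequencies(n_trials, rng):
    """Toss three fair coins n_trials times; return the relative frequency of each head count."""
    counts = Counter(
        sum(rng.random() < 0.5 for _ in range(3)) for _ in range(n_trials)
    )
    return {k: counts[k] / n_trials for k in range(4)}


rng = random.Random(0)  # arbitrary fixed seed so runs are repeatable
theoretical = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

for n in (10, 1_000, 100_000):
    freqs = relative_frequencies(n, rng)
    # Largest difference between observed frequency and theoretical probability.
    max_gap = max(abs(freqs[k] - theoretical[k]) for k in range(4))
    print(f"n = {n:>7}: largest gap from theory = {max_gap:.4f}")
```

The gap is random for any single run, but it tends to shrink as `n` grows.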

A probability distribution can be of 2 types, discrete or continuous. The coin example above is discrete.

Depending on the type of problem, one chooses the corresponding distribution and uses it to find the probability of a given value of the random variable.

Types of Distribution

Some common types of probability distribution are as follows:

Figure 2: Types of Probability Distribution

Normal (Gaussian) Distribution

The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena. For example, heights, blood pressure, measurement error, and IQ scores approximately follow the normal distribution.

Despite the different shapes, all forms of the normal distribution have the following characteristic properties.

| Mean +/- standard deviations | Percentage of data contained |
| --- | --- |
| 1 | 68% |
| 2 | 95% |
| 3 | 99.7% |
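The percentages in the table can be reproduced from the normal distribution itself. The sketch below relies on the standard identity that the probability of a normal variable falling within k standard deviations of its mean equals erf(k / sqrt(2)); the function name `prob_within` is illustrative.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.2%}")
```

The computed values are roughly 68.27%, 95.45% and 99.73%, which the table rounds.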

Measures to understand a distribution:

Three kinds of measures are needed to understand a distribution: measures of central tendency, measures of dispersion, and measures that describe its shape.

Measure of Central Tendency

Measures of central tendency help you describe a population through a single metric. For example, to compare the saving habits of people across various nations, you would compare the average savings rate in each of these nations. Following are the measures of central tendency:

Figure 3: Measure of Central Tendency. Mean is typically affected the most by outliers, followed by the median and mode.
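The effect of an outlier on each measure can be seen with a tiny example. The numbers below are hypothetical savings rates made up for illustration, not data from the article.

```python
import statistics

savings_rates = [4, 5, 5, 6, 7]      # hypothetical data, in percent
with_outlier = savings_rates + [60]  # the same data plus one extreme value

for label, data in (("original", savings_rates), ("with outlier", with_outlier)):
    print(
        f"{label:>12}: mean = {statistics.mean(data)}, "
        f"median = {statistics.median(data)}, mode = {statistics.mode(data)}"
    )
```

In this data the single outlier moves the mean from 5.4 to 14.5, the median only from 5 to 5.5, and leaves the mode at 5.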

Measure of Dispersion

Measures of dispersion reveal how the population is spread around the measures of central tendency.

Figure 4: Two distributions with different standard deviations
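Two data sets can share the same mean and still be spread very differently. The sketch below uses two made-up lists and the population standard deviation from Python's `statistics` module.

```python
import statistics

narrow = [48, 49, 50, 51, 52]  # hypothetical values clustered near 50
wide = [30, 40, 50, 60, 70]    # hypothetical values spread far from 50

for label, data in (("narrow", narrow), ("wide", wide)):
    print(
        f"{label:>6}: mean = {statistics.mean(data)}, "
        f"range = {max(data) - min(data)}, "
        f"standard deviation = {statistics.pstdev(data):.2f}"
    )
```

Both lists have mean 50, but the second has a standard deviation ten times larger.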

Measure to describe shape of distribution

Box Plots

Box plots are one of the easiest and most intuitive ways to understand distributions. They show the median, quartiles and outliers (and, in some variants, the mean) on a single plot.
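The numbers a box plot draws can be computed directly. This is a minimal sketch, assuming the common convention that points more than 1.5 times the interquartile range beyond the quartiles count as outliers; plotting libraries may compute quartiles slightly differently, and the function name `box_plot_summary` and the data are illustrative.

```python
import statistics


def box_plot_summary(data):
    """Return the quartiles, whisker ends and outliers a box plot would display."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: fences sit 1.5 * IQR beyond the quartiles.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data with one extreme value
summary = box_plot_summary(data)
print(summary)
```

For this data only the value 30 falls outside the fences, so it would be drawn as a separate point beyond the upper whisker.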


Unbiased Estimator

An unbiased estimator is a statistic used to approximate a population parameter whose expected value equals that parameter. “Accurate” in this sense means that, on average over repeated samples, it neither overestimates nor underestimates. If it systematically overestimates or underestimates, the difference between the estimator's expected value and the true parameter is called the “bias.” For example, the sample mean is an unbiased estimator of the population mean: any single sample mean may miss, but the average of the sample means over many samples equals the population mean.
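Bias can be made visible by simulation. The sketch below compares two estimators of variance, one dividing by n and one dividing by n - 1, on many small samples. The population (normal with variance 4), the sample size, the seed and the helper name `variance_estimate` are all assumptions chosen for illustration.

```python
import random


def variance_estimate(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample) / (len(sample) - ddof)


rng = random.Random(42)  # arbitrary fixed seed so runs are repeatable
true_variance = 4.0      # assumed population: normal with standard deviation 2
sample_size = 5
n_repeats = 20_000

divide_by_n, divide_by_n_minus_1 = [], []
for _ in range(n_repeats):
    sample = [rng.gauss(0, 2) for _ in range(sample_size)]
    divide_by_n.append(variance_estimate(sample, ddof=0))
    divide_by_n_minus_1.append(variance_estimate(sample, ddof=1))

mean_biased = sum(divide_by_n) / n_repeats
mean_unbiased = sum(divide_by_n_minus_1) / n_repeats
print(f"true variance:            {true_variance}")
print(f"average, divide by n:     {mean_biased:.3f}")
print(f"average, divide by n - 1: {mean_unbiased:.3f}")
```

Averaged over many samples, the n - 1 version lands near the true variance, while the divide-by-n version lands near (n - 1)/n times it, which is 3.2 here.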

Maximum Likelihood Estimation (MLE)

Say you have some data. Say you’re willing to assume that the data comes from some distribution -- perhaps Gaussian. There are an infinite number of different Gaussians that the data could have come from (which correspond to the combination of the infinite number of means and variances that a Gaussian distribution can have). MLE will pick the Gaussian (i.e., the mean and variance) that is “most consistent” with your data (the precise meaning of consistent is explained below).

So, say you’ve got a data set of $y = \{-1, 3, 7\}$. The most consistent Gaussian from which that data could have come has a mean of 3 and a variance of $32/3 \approx 10.67$ (the maximum likelihood variance divides by $n$; dividing by $n - 1$ instead gives the unbiased sample variance, 16). It could have been sampled from some other Gaussian. But this one is most consistent with the data in the following sense: the likelihood (the joint probability density) of the particular $y$ values you observed is greater with this choice of mean and variance than it is with any other choice.
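For a Gaussian, the maximum likelihood estimates have a closed form: the sample mean, and the average squared deviation from it (dividing by n, not n - 1). The sketch below computes them for the data set {-1, 3, 7} and compares the log-likelihood against a few alternative parameter choices. The function name and the list of alternatives are illustrative.

```python
import math


def gaussian_log_likelihood(data, mean, variance):
    """Log of the joint Gaussian density of independent observations."""
    n = len(data)
    squared_error = sum((y - mean) ** 2 for y in data)
    return -0.5 * n * math.log(2 * math.pi * variance) - squared_error / (2 * variance)


y = [-1, 3, 7]

# Closed-form Gaussian maximum likelihood estimates.
mle_mean = sum(y) / len(y)
mle_variance = sum((v - mle_mean) ** 2 for v in y) / len(y)  # divides by n

best = gaussian_log_likelihood(y, mle_mean, mle_variance)
print(f"MLE: mean = {mle_mean}, variance = {mle_variance:.3f}, log-likelihood = {best:.3f}")

# Other (mean, variance) choices picked for comparison; each should score lower.
alternatives = [(3.0, 16.0), (2.0, mle_variance), (3.0, 8.0), (4.0, 12.0)]
for mean, variance in alternatives:
    score = gaussian_log_likelihood(y, mean, variance)
    print(f"mean = {mean}, variance = {variance:.3f}: log-likelihood = {score:.3f}")
```

Working with the log of the likelihood does not change which parameters win, because the logarithm is increasing, and it avoids multiplying many small density values together.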

Maximum Likelihood Estimation can be applied to both regression and classification problems.


Questions