Statistical Distributions


Statistics is a core tool for practicing data science. Broadly, statistics is the branch of mathematics concerned with collecting, analyzing, and interpreting data and drawing conclusions from it.

With statistics we can work with data in a more informative and targeted way than basic visualization charts allow.

The math behind statistics lets us draw concrete conclusions about our data rather than just guessing.

In short, statistics is the study of data: it gives us deeper knowledge, surfaces more insights, and shows how the data is structured.

Statistics mainly involves descriptive statistics (summary statistics that quantitatively describe or summarize the features of a collected data set) and inferential statistics (techniques that use probability to draw conclusions from data, support decisions, and make predictions).

Statistical Quantities

The five basic statistical quantities are the mean, median, mode, variance, and standard deviation.

The mean, median, and mode are measures of central tendency. A central tendency (or measure of central tendency) is a central or typical value for a probability distribution.

It may also be called a center or location of the distribution.

A central tendency can be calculated for either a finite set of values or for a theoretical distribution, such as the normal distribution.

Mean: The arithmetic mean (or simply, the mean) is the sum of all measurements divided by the number of observations in the data set.

Median: The middle value that separates the higher half from the lower half of the data set. It is also known as the 50th percentile. The median and the mode are the only measures of central tendency that can be used for ordinal data, in which values are ranked relative to each other but are not measured absolutely.

Mode: The mode of a set of data values is the value that appears most often. It is the value x at which the probability mass function takes its maximum value.

Variance: It measures how far a set of numbers is spread out from its mean. It is calculated by taking the difference between each number in the set and the mean, squaring the differences, and dividing the sum of the squares by the number of values in the set.

Standard Deviation: It tells you how much the data deviates from the mean. It is the square root of the variance.

A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.

A useful property of the standard deviation is that, unlike the variance, it is expressed in the same units as the data.
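
As a quick illustration, the five quantities can be computed with Python's standard `statistics` module. This is only a sketch: the sample data is made up for the example, and the population versions of variance and standard deviation are used because the definition above divides by the number of values in the set.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # sum of the values / number of values
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
variance = statistics.pvariance(data)   # mean squared deviation (divides by n)
std_dev = statistics.pstdev(data)       # square root of the variance

print(mean, median, mode, variance, std_dev)
```

If the data is a sample rather than the whole population, `statistics.variance` and `statistics.stdev` (which divide by n − 1) may be the better choice.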

Discrete Vs Continuous

Discrete variables take values that can be counted in a finite amount of time. For example, the total number of students in a class or the amount you deposited in a bank account is discrete, since both are countable.

Continuous data can, in principle, take an infinite number of values within a range. For example, a person’s height could be any value (within the range of human heights), not just certain fixed heights.


Fitting the Right Distribution

Once you know which data needs to be characterized by a distribution, it is best to start with the raw data and try to fit the right distribution to it.

Three basic questions help with this characterization. The first is whether the data is discrete or continuous.

The second concerns symmetry and, if the data is asymmetric, the direction of the asymmetry. In other words, are positive and negative outliers equally likely, or is one more likely than the other?

The third concerns the upper and lower limits of the data and the likelihood of observing extreme values in the distribution; in some data, extreme values occur very infrequently, whereas in others they occur more often.
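
One possible way to check the second question numerically is the sample skewness, which is roughly zero for symmetric data and positive when the right tail is longer. The article does not prescribe this measure, so treat the sketch below as one option; the helper name `sample_skewness` and both data sets are hypothetical.

```python
def sample_skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(sample_skewness(symmetric))     # close to zero for symmetric data
print(sample_skewness(right_skewed))  # positive when large outliers dominate
```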

The Poisson Distribution

History: The Poisson distribution is named after the French mathematician Siméon Denis Poisson.

Description: The Poisson distribution gives the probability of a given number of events occurring in a fixed interval of continuous time.

The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area, or volume.

For instance, it can model how many emails arrive in a given period; the key parameter required is the average number of events in the given interval (λ).

The resulting distribution looks like the binomial, with the skewness being positive but decreasing with λ.

Probability of events for a Poisson distribution:

$P(k; \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}$

λ      (lambda) is the average number of events per interval

e      is Euler's number, approximately 2.71828

k      is the number of events; it takes values 0, 1, 2, …

k!     = k × (k − 1) × (k − 2) × … × 2 × 1 is the factorial of k

This equation is the probability mass function (PMF) for a Poisson distribution.

Poisson Distribution Example

The Acme Realty company sells an average of 2 homes per day. What is the probability that exactly 3 homes will be sold tomorrow?

This is a Poisson experiment in which we know the following:

 μ = 2; since 2 homes are sold per day, on average.

x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.

e = 2.71828; since e is a constant equal to approximately 2.71828.


$P(x; \mu) = \frac{e^{-\mu} \mu^{x}}{x!}$

$P(3; 2) = \frac{2.71828^{-2} \cdot 2^{3}}{3!}$

$P(3; 2) = \frac{0.13534 \cdot 8}{6}$

$P(3; 2) = 0.180$

Thus, the probability of selling 3 homes tomorrow is 0.180.
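
The Acme Realty calculation can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the function name `poisson_pmf` is a hypothetical helper, not something from the article.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the average per interval is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Average of 2 homes per day, exactly 3 homes sold tomorrow.
p = poisson_pmf(3, 2.0)
print(round(p, 3))  # 0.18
```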

Binomial Distribution

History: The Swiss mathematician Jakob Bernoulli determined the probability of obtaining k successes in a series of such trials.

Description: The binomial distribution gives the probabilities of the number of successes over a given number of trials, with a specified probability of success on each trial. A binomial experiment is a statistical experiment that consists of n repeated trials.

Each trial can result in just two possible outcomes: one is called a success and the other a failure.

The probability of success, denoted by P, is the same on every trial. A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution.

The binomial distribution is the basis for the popular binomial test of statistical significance.

Binomial Formula

Suppose a binomial experiment consists of n trials and results in x successes. If the probability of success on an individual trial is P, then the binomial probability is:

$b(x; n, P) = {}^{n}C_{x} \cdot P^{x} \cdot (1 - P)^{n - x}$


$b(x; n, P) = \frac{n!}{x! \, (n - x)!} \cdot P^{x} \cdot (1 - P)^{n - x}$


x        Number of successes that result from the binomial experiment

n        Number of trials in the binomial experiment

P        Probability of success on an individual trial

Q        Probability of failure on an individual trial, equal to 1 – P

n!       The factorial of n

b(x; n, P)  Binomial probability

nCx      Number of combinations of n trials taken x at a time

Binomial Distribution Example:

Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?

This is a binomial experiment in which the number of trials is equal to 5, the number of successes is equal to 2, and the probability of success on a single trial is 1/6 or about 0.167. Therefore, the binomial probability is:

$b(2; 5, 0.167) = {}^{5}C_{2} \cdot (0.167)^{2} \cdot (0.833)^{3}$

$b(2; 5, 0.167) = 0.161$
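
The die example can be checked with a short Python sketch of the binomial formula. The helper name `binomial_pmf` is hypothetical, and `math.comb` (Python 3.8+) stands in for nCx.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Exactly 2 fours in 5 tosses of a fair die.
prob = binomial_pmf(2, 5, 1 / 6)
print(round(prob, 3))  # 0.161
```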

Negative Binomial Distribution: Suppose instead that the number of successes is fixed at a given number, and we estimate the number of tries needed to reach that number of successes.

The resulting distribution is called the negative binomial and it very closely resembles the Poisson.

In fact, the negative binomial distribution converges on the Poisson distribution, but will be more skewed to the right (positive values) than the Poisson distribution with similar parameters.
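
To make the idea concrete, here is a hedged sketch of one common form of the negative binomial PMF: the probability that the r-th success occurs on trial n. The article does not give this formula, so the parameterization and the helper name `neg_binomial_pmf` are assumptions made for illustration.

```python
import math

def neg_binomial_pmf(n: int, r: int, p: float) -> float:
    """Probability that the r-th success occurs exactly on trial n.

    Assumed form: C(n - 1, r - 1) * p**r * (1 - p)**(n - r).
    """
    if n < r:
        return 0.0  # r successes need at least r trials
    return math.comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)

# Probability that the second four appears on the fifth toss of a fair die.
print(neg_binomial_pmf(5, 2, 1 / 6))
```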

Uniform Distribution

Description:  A uniform distribution, sometimes also known as a rectangular distribution, is a distribution that has constant probability.

On rolling a fair die, the outcomes are 1 to 6. The probabilities of getting these outcomes are equally likely and that is the basis of a uniform distribution.

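Unlike the Bernoulli distribution, in which the two outcomes need not be equally likely, all n possible outcomes of a uniform distribution are equally likely.

A random variable X is uniformly distributed on the interval (a, b) if its density function is:

$f(x) = \frac{1}{b - a}$ for a ≤ x ≤ b, and $f(x) = 0$ otherwise

Where,

a, b    Parameters (the lower and upper limits of the interval)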


This distribution has two types.

The most common type you’ll find in elementary statistics is the continuous uniform distribution. The second is the discrete uniform distribution; its plot also resembles a rectangle, but instead of a line, a series of dots represents the finite number of outcomes.

Example: Rolling a single die is an example of a discrete uniform distribution. It produces six possible outcomes: 1, 2, 3, 4, 5, or 6. There is a 1/6 probability of each number being rolled.

Uniform Distribution Example:

Suppose metro trains on a certain line run every half hour between midnight and six in the morning.

What is the probability that a man entering the station at a random time during this period will have to wait at least twenty minutes?

Let X denote the waiting time (in minutes) for the next train, under the assumption that the man arrives at the station at a random time.

X is distributed uniformly on (0, 30) with probability density function

$f(x) = \frac{1}{30}$ for 0 < x < 30, and $f(x) = 0$ otherwise

The probability that he has to wait at least 20 minutes is

$P(X \geq 20) = \int_{20}^{30} \frac{1}{30} \, dx$

$= \frac{1}{30}(30 - 20)$

$= \frac{1}{3}$
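
The same waiting-time probability can be sketched in Python. The function name `uniform_tail_prob` is a hypothetical helper; it simply evaluates the area of the rectangle to the right of t.

```python
def uniform_tail_prob(t: float, a: float, b: float) -> float:
    """P(X >= t) for X uniformly distributed on (a, b)."""
    if t <= a:
        return 1.0
    if t >= b:
        return 0.0
    return (b - t) / (b - a)

# Waiting at least 20 minutes when trains run every 30 minutes.
print(uniform_tail_prob(20, 0, 30))  # about 0.333
```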

Normal Distribution

History: De Moivre developed the normal distribution as an approximation to the binomial distribution, and it was subsequently used by Laplace in 1783 to study measurement errors and by Gauss in 1809 in the analysis of astronomical data.

Description: Normal distributions are common continuous probability distributions. They are highly important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. The normal distribution, also known as the Gaussian distribution, is symmetric about the mean: data near the mean occur more frequently than data far from the mean.

The Normal Equation.

$Y = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^{2} / (2\sigma^{2})}$

Where,

x        normal random variable

μ        mean

σ        standard deviation

π        approximately 3.14159

e        approximately 2.71828

Normal Distribution Example.

Suppose that 95% of the values in a normally distributed data set lie between 1.2m and 1.8m. The mean and standard deviation can then be calculated. The mean is halfway between 1.2m and 1.8m:

Mean = (1.2m + 1.8m) / 2 = 1.5m

95% of the data spans 4 standard deviations (2 standard deviations on either side of the mean), so:

One standard deviation = (1.8m – 1.2m) / 4

= 0.6m / 4

= 0.15m

The mean and standard deviation of the normally distributed data are therefore 1.5m and 0.15m.
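
As a sanity check, the standard library's `statistics.NormalDist` can confirm that roughly 95% of a normal distribution with mean 1.5 and standard deviation 0.15 falls between 1.2 and 1.8. This is a sketch only; the variable names are illustrative.

```python
from statistics import NormalDist

# Normal distribution with the mean and standard deviation found above.
dist = NormalDist(mu=1.5, sigma=0.15)

# Share of values within 2 standard deviations of the mean (1.2m to 1.8m).
share = dist.cdf(1.8) - dist.cdf(1.2)
print(round(share, 4))  # 0.9545

# Height of the density curve at the mean, i.e. 1 / (sigma * sqrt(2 * pi)).
print(dist.pdf(1.5))
```

The exact share is slightly above 95%, which is why "2 standard deviations on either side" is treated as a rule of thumb.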



Statistical distributions are widely used in many sectors, such as computer science, the sciences, finance, insurance, engineering, medicine, and the stock market, as well as in day-to-day life.

The key to good data analytics is fitting the right distribution to the data and obtaining the best estimate.

The distributions above are observed and used in day-to-day life, and they can be related to, compared with, and analyzed alongside other distributions.

Author: Sujithra S
Sujithra is a Data Science professional, who is passionate about Data Science and Machine learning. She has experience in building Machine Learning models in Python and Conversational AI chatbot Development.