How to incorporate statistical analysis into Data Science — Part-2 (Conditional Probability, Bayes Theorem, and Likelihood Ratios)

Gowtam Singulur
4 min read · May 24, 2021

The probability of a particular event in the real world is non-trivial and may change given the outcomes of events it depends on. For example, the probability of an accident on an empty road is pretty low. However, it’s much higher if it’s raining cats and dogs and visibility is low. In this blog, we will discuss conditional probability and related topics.

Before discussing Conditional probability, we need to know about Dependent and Independent Events.

Two events A and B are independent if P(A∩B) = P(A)*P(B); if not, A and B are dependent events.

What is Conditional Probability?

Let B be an event with P(B) > 0; P(B) must be greater than zero because there is no sense in conditioning on an event that can’t happen. The probability of event A given event B is represented as P(A|B).

P(A|B) = P(A∩B)/P(B)

So, what if events A and B are independent?

P(A|B) = P(A∩B)/P(B) = (P(A)* P(B))/P(B) = P(A)
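
To make this concrete, here is a minimal Python sketch over a toy sample space I’ve chosen for illustration (one fair coin flip paired with one fair die roll). It shows that conditioning on an independent event leaves the probability unchanged:

```python
from fractions import Fraction

# Sample space: all (coin, die) pairs for one fair coin flip and one fair die roll.
omega = [(c, d) for c in ("H", "T") for d in range(1, 7)]

A = {o for o in omega if o[0] == "H"}    # event A: coin shows heads
B = {o for o in omega if o[1] % 2 == 0}  # event B: die shows an even number

def p(event):
    """Probability of an event on an equally likely finite sample space."""
    return Fraction(len(event), len(omega))

p_a_given_b = p(A & B) / p(B)  # P(A|B) = P(A ∩ B) / P(B)
print(p_a_given_b)             # 1/2
print(p_a_given_b == p(A))     # True: conditioning on B didn't change P(A)
```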

Let’s test it with 2 examples.

Example 1: We rolled an unbiased die. Event A — rolling a “2” and Event B — rolling an even number (2, 4, 6). So, what is P(A|B)?

Venn diagram of events A and B in Example-1

From this Venn diagram, we can conclude that P(A∩B) = P(A).

So, P(A|B) = P(A)/P(B) = (1/6)/(3/6) = 1/3.
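
As a quick sanity check, a short Monte Carlo simulation (a sketch, with an arbitrary seed for reproducibility) should land close to 1/3:

```python
import random

# Monte Carlo check of Example 1: estimate P(rolling a 2 | roll is even).
random.seed(0)  # arbitrary seed, for reproducibility
trials = 1_000_000
even_rolls = 0
twos = 0
for _ in range(trials):
    roll = random.randint(1, 6)
    if roll % 2 == 0:
        even_rolls += 1
        if roll == 2:
            twos += 1

print(twos / even_rolls)  # ≈ 0.333, matching the exact answer of 1/3
```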

Example 2: We rolled 2 unbiased dice. Event A — rolling an odd number with die 1 and Event B — rolling an even number with die 2. So, what is P(A|B)?

Venn diagram of events A and B in Example-2

As events A and B are independent, P(A∩B) = P(A)*P(B), which the Venn diagram reflects.

So, P(A|B) = (P(A)*P(B))/P(B) = P(A) = 1/2.
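
The same exhaustive-enumeration idea (a sketch over the 36 equally likely outcomes of the two dice) confirms both the independence and the answer:

```python
from itertools import product
from fractions import Fraction

# Exhaustive check of Example 2 over all 36 equally likely (die 1, die 2) outcomes.
omega = list(product(range(1, 7), repeat=2))

A = {o for o in omega if o[0] % 2 == 1}  # event A: die 1 shows an odd number
B = {o for o in omega if o[1] % 2 == 0}  # event B: die 2 shows an even number

def p(event):
    return Fraction(len(event), len(omega))

print(p(A & B) == p(A) * p(B))  # True: A and B are independent
print(p(A & B) / p(B))          # 1/2, which is exactly P(A)
```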

What is Bayes Theorem?

Bayes Theorem is one of the most prominent applications of conditional probability.

P(A|B) = P(B|A)P(A)/P(B)

There are two ways to explain Bayes Theorem. Let’s start with the practical one.

Bayes Theorem lets you calculate the probability of a particular event more accurately based on the prior knowledge of conditions related to the event.

Let’s understand the above statement through an example. If we want to calculate the probability of a person suffering from cancer, we would initially output the overall percentage of people suffering from cancer (the base rate). But if the person is an avid smoker, we need to update our previous calculation because the probability will be much higher in this case. So, Bayes Theorem utilizes prior knowledge to refine our probability calculations.

Theoretical definition of Bayes theorem:

Bayes Theorem allows us to swap the event we condition on with the event whose probability we want to calculate.

Let’s say we want to calculate P(A|B). Bayes theorem says we can calculate it using P(B|A), P(A), and P(B), as shown below.

P(A|B) = P(B|A)P(A)/P(B)

Let’s define the terms in the above formula using the practical example stated above: “Calculate the probability of a patient having cancer given he is a smoker.”

P(A|B): This is called the posterior, and it’s what we are calculating. From the example, it represents the “probability of a patient having cancer given he is a smoker.”

P(B|A): This is called the likelihood, and it’s the probability of observing the condition given the hypothesis. From the example, it represents the “probability of a patient being a smoker given he is suffering from cancer.”

P(A): This is called the prior, and it’s the probability of the hypothesis alone, without any additional information. From the example, it represents the “probability of a patient suffering from cancer.”

P(B): This is called the marginal likelihood, and it’s the probability of the condition on its own. From the example, it represents the “probability of a patient being a smoker.”
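
Putting the four terms together, here is a minimal Python sketch of the cancer/smoker calculation. The three input numbers are made-up assumptions chosen only for illustration, not real medical statistics:

```python
# A minimal sketch of Bayes Theorem for the cancer/smoker example.
# NOTE: all three numbers below are made-up assumptions for illustration,
# not real cancer or smoking statistics.
p_cancer = 0.01               # prior P(A): probability of cancer
p_smoker_given_cancer = 0.30  # likelihood P(B|A): smoker given cancer
p_smoker = 0.15               # marginal likelihood P(B): probability of being a smoker

# Posterior P(A|B) = P(B|A) * P(A) / P(B)
p_cancer_given_smoker = p_smoker_given_cancer * p_cancer / p_smoker
print(p_cancer_given_smoker)  # ≈ 0.02: the prior of 0.01 doubles once we know the patient smokes
```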

Likelihood ratios

Before talking about likelihood ratios, we need to define “odds.” The odds of an event are defined as the probability that the event will occur divided by the probability that the event will not occur.

For example, when we roll two unbiased dice, what are the odds of rolling snake eyes (a pair of ones)?

P(snake eyes) = (1/6)(1/6) = 1/36.

odds of rolling snake eyes = P(snake eyes) / (1-P(snake eyes)) = (1/36)/(1-(1/36)) = (1/36)/(35/36)= 1/35.

This means that the odds of rolling snake eyes are 1 to 35: one roll of snake eyes for every 35 rolls that aren’t.
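
The same arithmetic in a few lines of Python, using exact fractions:

```python
from fractions import Fraction

def odds(p):
    """Odds of an event: P(event) / P(not event)."""
    return p / (1 - p)

p_snake_eyes = Fraction(1, 6) * Fraction(1, 6)  # two independent fair dice
print(p_snake_eyes)        # 1/36
print(odds(p_snake_eyes))  # 1/35
```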

Let’s define the likelihood ratios using our cancer example. Before that, let’s define a few events.

D: Patient has cancer

D′: Patient doesn’t have cancer

S: Patient has a smoking habit

Let’s calculate the odds of a patient having cancer given that they smoke. Applying Bayes Theorem to both the numerator and the denominator (the P(S) terms cancel):

P(D|S)/P(D′|S) = [P(S|D)P(D)/P(S)] / [P(S|D′)P(D′)/P(S)] = [P(S|D)/P(S|D′)] * [P(D)/P(D′)]

In the above equation, there are quite a few interesting observations:

P(D|S)/P(D′|S) is the odds of a patient having cancer after knowing they smoke.

P(D)/P(D′) is the odds of a patient having cancer without any prior knowledge.

P(S|D)/P(S|D′) is the diagnostic likelihood ratio (DLR-S) of a patient being a smoker.

Post-smoke odds of D = DLR-S * Pre-smoke odds of D

To summarise, for a person who smokes, the hypothesis that they have cancer is supported by the data DLR-S times more strongly than the hypothesis that they do not have cancer.
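
As a concrete sketch, the likelihood-ratio update can be computed with the same purely illustrative numbers used in the Bayes example above; the value of P(S|D′) is an assumption picked so that the overall P(S) stays consistent at roughly 0.15:

```python
# A sketch of the likelihood-ratio update, reusing the made-up numbers from
# the Bayes example above (not real medical statistics). P(S|D') is chosen
# so that the overall P(S) works out to roughly 0.15.
p_d = 0.01                # P(D): prior probability of cancer
p_s_given_d = 0.30        # P(S|D): probability of smoking given cancer
p_s_given_not_d = 0.1485  # P(S|D'): probability of smoking given no cancer

pre_test_odds = p_d / (1 - p_d)         # P(D) / P(D')
dlr_s = p_s_given_d / p_s_given_not_d   # diagnostic likelihood ratio DLR-S
post_test_odds = dlr_s * pre_test_odds  # P(D|S) / P(D'|S)

print(f"DLR-S = {dlr_s:.2f}")                       # ≈ 2.02
print(f"pre-smoking odds  = {pre_test_odds:.4f}")   # ≈ 0.0101
print(f"post-smoking odds = {post_test_odds:.4f}")  # ≈ 0.0204
```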

This is the end of part-2. In the next part, I will talk about Expected values, Distributions, Asymptotics, and Confidence Intervals.


Drink coffee and keep on learning
