Hypothesis Testing — a kutty(short) story

Paddy
Published in Analytics Vidhya
5 min read · Mar 6, 2021

A quick refresher on the Central Limit Theorem (CLT) will help in understanding hypothesis testing.

What is Hypothesis Testing?

We usually do not have data for the entire population, so we collect a sample and draw conclusions from it. How do we make sure our hypothesis or conclusion is correct? We need to test it, and that is why we perform hypothesis testing.

The first step in hypothesis testing is to state 2 hypotheses:

  1. Null hypothesis (H0): This is the status quo.
  2. Alternate hypothesis (H1): The challenge to the status quo.

In the CLT example we were calculating the commute time of employees. In that case

  1. The null hypothesis is that the average commute time of all employees is 35 minutes. H0: μ = 35
  2. The alternate hypothesis is that the average commute time of all employees is not 35 minutes. H1: μ ≠ 35

We arrive at one of 2 conclusions from a hypothesis test.

  1. We reject the null hypothesis when the sample gives enough evidence against H0: μ = 35.
  2. We fail to reject the null hypothesis when the sample does not give enough evidence against H0: μ = 35. (Remember that we do not say we accept a hypothesis here, because that needs more evaluation.)

There are 3 types of test, decided by the sign used in H₁:

≠ in H₁ → Two-tailed test → Rejection region on both sides of distribution

< in H₁ → Lower-tailed test → Rejection region on left side of distribution

> in H₁ → Upper-tailed test → Rejection region on right side of distribution

The above example of H1: μ ≠ 35 is a two-tailed test.

H1: μ > 35 would be an upper-tailed test.

H1: μ < 35 would be a lower-tailed test.

We can carry out a hypothesis test with 2 popular methods:

a) Critical value

Let us see this with an example. A medical device manufacturer claims that the lifetime of its device is 120 months on average. An agency conducted a test against this claim. They took 50 devices and found a mean lifetime of 125 months with a standard deviation of 10 months. If the significance level is not provided, we can assume it is 5%.

H0: μ = 120

H1: μ ≠ 120

The formula for the critical values is μ ± Zc × (σ/√N)

Step 1: Calculate the critical score Zc. It is a two-tailed test, so for a 5% significance level the area to look up is

1 - (0.05/2) = 0.975

Side note: if this were an upper- or lower-tailed test, we would have used 1 - 0.05 = 0.95 instead.

The Z-score for 0.975 is 1.96 (ref. Z table)
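Instead of reading the Z table, the critical score can also be computed. The snippet below is a minimal sketch using only Python's standard library; the helper name `critical_z` is illustrative and not part of the original walkthrough.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive critical z-score for significance level alpha."""
    # A two-tailed test splits alpha across both tails;
    # a one-tailed test puts all of alpha in a single tail.
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf is the inverse of the standard normal cumulative distribution.
    return NormalDist().inv_cdf(1 - tail_area)


print(round(critical_z(0.05), 2))                    # two-tailed: 1.96
print(round(critical_z(0.05, two_tailed=False), 3))  # one-tailed: 1.645
```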

Step 2: Calculate the standard error = σ/√N

σ = 10

N = 50

√N = 7.07, hence standard error = 10/7.07 = 1.41

Zc × (σ/√N) = 1.96 × 1.41 = 2.77

Step 3: Mean ± Error

UCV or upper critical value: 120 + 2.77 = 122.77

LCV or lower critical value: 120 - 2.77 = 117.23

If H0 is true, 95% of sample means are expected to fall between 117.23 and 122.77. The sample mean we got is 125, which is above the UCV and so falls in the rejection region. Hence we reject the null hypothesis H0: μ = 120.

To summarise what we have done: we calculate the critical range around the hypothesised mean and check whether the sample mean falls within it.
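The critical value method can be sketched in Python as follows. This is only an illustration under the numbers of the device example (claimed mean 120, sample mean 125, standard deviation 10, N = 50, 5% significance); the function name `critical_value_test` and its parameters are made up for this sketch. Note that the standard error is σ/√N = 10/√50 ≈ 1.41.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_test(mu0, sample_mean, sigma, n, alpha=0.05):
    """Two-tailed critical value test. Returns (lcv, ucv, reject_h0)."""
    z_c = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    std_error = sigma / sqrt(n)                # sigma divided by sqrt(N)
    margin = z_c * std_error
    lcv, ucv = mu0 - margin, mu0 + margin
    # Reject H0 when the sample mean falls outside [LCV, UCV].
    reject_h0 = not (lcv <= sample_mean <= ucv)
    return lcv, ucv, reject_h0


lcv, ucv, reject_h0 = critical_value_test(mu0=120, sample_mean=125, sigma=10, n=50)
print(f"LCV = {lcv:.2f}, UCV = {ucv:.2f}, reject H0: {reject_h0}")
# LCV = 117.23, UCV = 122.77, reject H0: True
```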

b) P-value method:

This is another way of testing our assumption. Let's take the same problem and test it with the p-value method.

The p-value is obtained from the test statistic Z = (x̄ - μ) / (σ/√N).

The same problem statement:

A medical device manufacturer claims that the lifetime of its device is 120 months on average. An agency conducted a test against this claim. They took 50 samples and found a mean lifetime of 125 months with a standard deviation of 10 months. If the significance level is not provided, we can assume it is 5%.

Step 1: (x̄ - μ) = 125 - 120 = 5

Step 2: σ = 10, √N = √50 = 7.07

σ/√N = 10/7.07 = 1.41

Z = (x̄ - μ) / (σ/√N) = 5/1.41 = 3.54

Step 3: Look up the cumulative probability for Z = 3.54 in the Z table = 0.9998

Since it is a two-tailed test, p-value = 2 × (1 - 0.9998) = 0.0004

Side note: If it was a one-sided test we would not have multiplied by 2, i.e., p-value = 1 - 0.9998 = 0.0002

As 0.0004 < 0.05 (significance level 5%), we reject the null hypothesis. An easy way to remember this is "when p is low, the null must go". In this case p is 0.0004, which is lower than 0.05.
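The p-value calculation can also be checked with a few lines of Python. This is a minimal sketch with the standard library only; the function name `z_test_p_value` and the `tail` argument are illustrative, and the inputs are the numbers from the device example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(mu0, sample_mean, sigma, n, tail="two"):
    """Return (z, p_value) for a one-sample z-test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if tail == "two":
        # Area in both tails beyond |z|.
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "upper":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "lower":
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, p_value


z, p_value = z_test_p_value(mu0=120, sample_mean=125, sigma=10, n=50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

With z = 5 / (10/√50), about 3.54, the two-tailed p-value is far below 0.05.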

Both methods arrive at the same conclusion about the manufacturer's claim.

Bonus Read:

We may not always be correct with our Hypothesis testing. We may make wrong decisions. We may end up with 2 types of error:

Type 1 Error: We reject the null hypothesis when the null hypothesis is true. Its probability is denoted by alpha (α). Say

H0: the accused is not guilty

H1: the accused is guilty

When a person who is not guilty is declared guilty, that is a Type 1 error. In this specific case the error is expected to be minimal.

Type 2 Error:

We fail to reject the null hypothesis when the null hypothesis is false. Its probability is denoted by beta (β).

Example: When a person who is guilty is declared not guilty, that is a Type 2 error. In this specific case the error is expected to be moderate.
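To see what alpha means in practice, here is a small simulation sketch. It assumes the null hypothesis is actually true (population mean 120, standard deviation 10, normally distributed, with σ treated as known) and counts how often a two-tailed 5% z-test rejects it anyway. The function name, the number of trials and the seed are arbitrary choices for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_type1_rate(mu0=120, sigma=10, n=50, alpha=0.05, trials=2000, seed=0):
    """Fraction of tests that wrongly reject H0 when H0 is true."""
    rng = random.Random(seed)
    z_c = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / std_error
        if abs(z) > z_c:
            rejections += 1  # a Type 1 error
    return rejections / trials


rate = simulate_type1_rate()
print(f"Observed Type 1 error rate: {rate:.3f}")
```

The observed rate should land close to the chosen significance level of 0.05, which is exactly what alpha controls.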

Thanks for reading till the end. Happy learning!
