# How to do hypothesis testing in statistics

Hypothesis testing is a statistical procedure used to evaluate the validity of a hypothesis. It is a tool that allows researchers to make decisions about a population based on sample data.

The process of hypothesis testing involves formulating a hypothesis about a population and collecting a sample of data from that population. The hypothesis is then tested by asking how consistent the sample data are with it. If the observed data would be very unlikely were the hypothesis true, the hypothesis is rejected. Otherwise it is retained, which means only that the data did not provide enough evidence against it, not that it has been proven.

There are two types of hypotheses in hypothesis testing: the null hypothesis and the alternative hypothesis. The null hypothesis is a statement that there is no relationship between two variables or that the observed difference between two groups is due to chance. The alternative hypothesis is a statement that there is a relationship between two variables or that the observed difference between two groups is not due to chance.

In hypothesis testing, the null hypothesis is assumed to be true unless the data provide sufficient evidence against it. The test works by collecting sample data and using a statistical test to determine how likely it would be to see a difference at least as large as the one observed if the null hypothesis were true. If that probability is low, the null hypothesis is rejected in favor of the alternative hypothesis.

In hypothesis testing, a type 1 error occurs when the null hypothesis is rejected when it is actually true. This error is also known as a false positive or alpha error. The probability of a type 1 error occurring is represented by the alpha level, which is set by the researcher before the hypothesis test is conducted. The alpha level is the maximum acceptable probability of a type 1 error occurring.
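The meaning of the alpha level can be checked with a small simulation. The sketch below is an illustration, not part of any particular study: it assumes a two-sided z-test with a known standard deviation, and the function names `z_test_p_value` and `type1_error_rate` are hypothetical. Every sample is drawn from a population where the null hypothesis is true, so each rejection is a type 1 error.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with sigma known."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type1_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Fraction of tests that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The true mean is 0, so H0 (mean == 0) holds in every trial.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


rate = type1_error_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

Under these assumptions the observed rate should land close to the chosen alpha of 0.05, because alpha is the long-run rejection rate when the null hypothesis is true.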

A type 2 error occurs when the null hypothesis is not rejected when it is actually false. This error is also known as a false negative or beta error. The probability of a type 2 error occurring is represented by beta, which depends on the sample size, the effect size, the alpha level, and the variability of the data. The larger the sample size or the larger the effect size, the lower the probability of a type 2 error occurring. The quantity 1 - beta is called the power of the test.

It is important to keep the probability of both type 1 and type 2 errors low in hypothesis testing. Researchers typically set the alpha level at a low value (such as 0.05 or 0.01) to limit the probability of a type 1 error. For a fixed sample size, however, lowering alpha raises the probability of a type 2 error, so the two cannot be reduced independently. To reduce the probability of a type 2 error without raising alpha, researchers can use a larger sample size or a more powerful statistical test.
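To make the interplay between alpha, sample size, and the type 2 error concrete, here is a minimal sketch. It assumes a two-sided one-sample z-test with known standard deviation, with the effect size expressed in standard-deviation units. The function name `type2_error_probability` is hypothetical.

```python
from statistics import NormalDist


def type2_error_probability(effect_size, n, alpha=0.05):
    """Beta for a two-sided one-sample z-test.

    effect_size is the true mean shift divided by the (known) standard deviation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Power: probability that the test statistic falls in either rejection tail.
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return 1 - power


for n in (10, 30, 100):
    beta = type2_error_probability(effect_size=0.5, n=n)
    print(f"n={n:3d}  beta={beta:.3f}  power={1 - beta:.3f}")

# A stricter alpha with the same sample size gives a larger beta.
print(type2_error_probability(0.5, 30, alpha=0.01))
```

In this sketch beta shrinks as the sample size grows, and it grows when alpha is tightened from 0.05 to 0.01 at the same sample size.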

In summary, a type 1 error occurs when the null hypothesis is rejected when it is actually true, and a type 2 error occurs when the null hypothesis is not rejected when it is actually false. Both types of errors can have serious consequences, so it is important to carefully consider the alpha and beta levels and use appropriate statistical tests and sample sizes in hypothesis testing.

There are many different statistical tests that can be used in hypothesis testing, depending on the nature of the data and the research question being investigated. Some common statistical tests include t-tests, ANOVA, and chi-square tests.
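As one illustration of the tests named above, here is a minimal sketch of a chi-square test of independence on a 2x2 table of counts. The counts are invented for the example, the function name `chi_square_2x2` is hypothetical, and no continuity correction is applied. With one degree of freedom the p-value can be computed with `math.erfc` from the standard library.

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 >= x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Invented counts: rows are two groups, columns are two outcomes.
observed = [[30, 20], [18, 32]]
chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

For larger tables or small expected counts, a statistics library would be the more appropriate tool than this hand-rolled version.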

## The steps in hypothesis testing are as follows:

1. Formulate the null and alternative hypotheses: The first step in hypothesis testing is to state both hypotheses. The null hypothesis says that there is no relationship between two variables, or that the observed difference between two groups is due to chance. The alternative hypothesis says that there is a relationship, or that the observed difference is not due to chance.
2. Select the alpha level: The alpha level is the maximum acceptable probability of a type 1 error occurring. It is typically set at a low value, such as 0.05 or 0.01.
3. Collect the sample data: The next step is to collect a sample of data from the population of interest. The sample size should be large enough to give the test sufficient power to detect a difference of the size the researcher considers meaningful.
4. Conduct the statistical test: Once the sample data has been collected, a statistical test is conducted to compute a test statistic and its p-value, which measure how surprising the observed difference between the groups would be if the null hypothesis were true. The statistical test used will depend on the nature of the data and the research question being investigated.
5. Compare the p-value to the alpha level: The p-value is the probability of obtaining results at least as extreme as those observed if the null hypothesis is true. It is not the probability that the null hypothesis is true. If the p-value is less than or equal to the alpha level, the null hypothesis is rejected in favor of the alternative hypothesis. If the p-value is greater than the alpha level, the null hypothesis is not rejected.
6. Interpret the results: If the null hypothesis is rejected, the results of the hypothesis test are considered statistically significant and the alternative hypothesis is accepted. If the null hypothesis is not rejected, the results are not considered statistically significant and the null hypothesis is retained. Retaining the null hypothesis means the data did not provide enough evidence against it, not that it has been shown to be true.
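The six steps above can be sketched end to end in code. The example below is one possible illustration and not the only way to run a test: it uses a permutation test for a difference in means, chosen because it needs only the standard library. The two groups of measurements are invented, and the function name `permutation_test` is hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Step 1: H0 says the two group means are equal; H1 says they differ.
# Step 2: choose the alpha level before looking at the data.
alpha = 0.05
# Step 3: collect the sample data (invented values for illustration).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7]
group_b = [6.4, 6.8, 6.1, 7.0, 6.6, 6.9, 6.3, 6.7]
# Step 4: conduct the statistical test.
p_value = permutation_test(group_a, group_b)
# Step 5: compare the p-value to the alpha level.
reject_null = p_value <= alpha
# Step 6: interpret the result.
print(f"p-value = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Because the two invented groups do not overlap at all, very few shuffles produce a difference as large as the observed one, so the estimated p-value falls well below alpha.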

It is important to carefully consider the alpha and beta levels and use appropriate statistical tests and sample sizes in hypothesis testing to minimize the probability of type 1 and type 2 errors occurring.

## Here is an example of hypothesis testing:

Suppose a researcher is interested in determining whether there is a relationship between sleep duration and grades in college students. The researcher formulates the following hypotheses:

Null hypothesis: There is no relationship between sleep duration and grades in college students.

Alternative hypothesis: There is a relationship between sleep duration and grades in college students.

The researcher decides to set the alpha level at 0.05. This means that the researcher is willing to accept a 5% probability of rejecting the null hypothesis if it is in fact true (a type 1 error).

The researcher collects a sample of 100 college students and asks them to report their average sleep duration and their grades. The researcher then conducts a statistical test to determine how likely it would be to observe a relationship at least this strong between sleep duration and grades if the null hypothesis were true.

Suppose the p-value for the statistical test is 0.03. Since the p-value is less than the alpha level of 0.05, the null hypothesis is rejected in favor of the alternative hypothesis. This means that the researcher can conclude that there is a statistically significant relationship between sleep duration and grades in college students. Statistical significance alone does not show how strong the relationship is or that sleep duration causes the difference in grades.
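A study like this one could be analyzed along the following lines. This is a sketch under stated assumptions, not the researcher's actual analysis: the data are synthetic (generated with a built-in positive dependence of grades on sleep), the test is a Pearson correlation with a p-value from the Fisher z-transformation (a large-sample approximation), and the names `pearson_r` and `correlation_p_value` are hypothetical. It will not reproduce the p-value of 0.03 from the example.

```python
import math
import random
from statistics import NormalDist, mean


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: correlation == 0 (Fisher z)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Synthetic data for 100 students (assumption: grades rise with sleep).
rng = random.Random(42)
n = 100
sleep_hours = [rng.gauss(7.0, 1.0) for _ in range(n)]
gpa = [3.0 + 0.3 * (s - 7.0) + rng.gauss(0.0, 0.4) for s in sleep_hours]

alpha = 0.05
r = pearson_r(sleep_hours, gpa)
p_value = correlation_p_value(r, n)
reject_null = p_value < alpha
print(f"r = {r:.3f}, p-value = {p_value:.2g}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

The correlation coefficient `r` reports the strength and direction of the relationship, which the p-value alone does not convey.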

In this example, the researcher has successfully used hypothesis testing to evaluate the validity of a hypothesis about a population. Hypothesis testing is an important tool in statistical analysis and is used in many different fields to understand the relationships between variables and make decisions about populations based on sample data.