A/B testing calculator

A statistical significance calculator for A/B testing.

Is your A/B test statistically significant?
Experiment with the settings to see how lowering the confidence level reduces the effect size needed to reach significance, or how increasing the sample size can make even a small difference between the control and experimental groups statistically significant.
Pre-test or Post-test?
Data
A - Visitors
A - Conversions
B - Visitors
B - Conversions
Settings
Hypothesis
Do you also want to be able to conclude that the conversion rate of B is lower than that of A? If you choose a 1-sided test, no conclusive statement can be made when the conversion rate of B is lower than that of A.
Confidence
The level of confidence you can have that your results are not due to random chance.
The test result is not significant.
Variation B showed a conversion rate of 2.06%, which was 2.55% higher than the conversion rate of Variation A (2.01%). With a 95% level of confidence, it can be concluded that this outcome was a direct consequence of the modifications made and not merely due to chance.
Possible SRM alert
Assuming you intended to have a 50% / 50% split, a Sample Ratio Mismatch (SRM) check indicates there might be a problem with how traffic was split between the variations.
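As an illustration of how such a check can be performed (not necessarily how this calculator implements it), the sketch below runs a chi-square goodness-of-fit test against the intended 50% / 50% split; the visitor counts are placeholder values.

```python
from scipy.stats import chisquare

# Observed visitor counts per variation (placeholder numbers)
visitors_a = 10_000
visitors_b = 10_550
total = visitors_a + visitors_b

# Expected counts under the intended 50% / 50% split
expected = [total / 2, total / 2]

# Chi-square goodness-of-fit test: a very small p-value means the observed
# split is unlikely under the intended allocation, i.e. a possible SRM
stat, p_value = chisquare(f_obs=[visitors_a, visitors_b], f_exp=expected)
print(f"chi2 = {stat:.2f}, p = {p_value:.5f}")
if p_value < 0.01:
    print("Possible SRM: check your traffic allocation before trusting the test.")
```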
Conversion Rate A (Control)
CR_A = Conversions A / Visitors A

Conversion Rate B
CR_B = Conversions B / Visitors B

Relative uplift in Conv. Rate
( CR_B - CR_A ) / CR_A

Observed Power

p-value

Z-score
Z = ( CR_B - CR_A ) / SE_difference

Standard error A
SE_A = sqrt( CR_A * ( 1 - CR_A ) / Visitors A )

Standard error B
SE_B = sqrt( CR_B * ( 1 - CR_B ) / Visitors B )

Std. Error of difference
SE_difference = sqrt( SE_A^2 + SE_B^2 )
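The calculator evaluates these formulas for you, but for reference here is a minimal Python sketch of the same two-proportion z-test; the visitor and conversion counts are placeholder values.

```python
from math import sqrt
from scipy.stats import norm

# Placeholder inputs: replace with your own visitor and conversion counts
visitors_a, conversions_a = 10_000, 201
visitors_b, conversions_b = 10_000, 206

# Conversion rates
cr_a = conversions_a / visitors_a
cr_b = conversions_b / visitors_b

# Standard errors of each rate and of their difference
se_a = sqrt(cr_a * (1 - cr_a) / visitors_a)
se_b = sqrt(cr_b * (1 - cr_b) / visitors_b)
se_diff = sqrt(se_a ** 2 + se_b ** 2)

# Z-score, relative uplift, and p-values
z = (cr_b - cr_a) / se_diff
relative_uplift = (cr_b - cr_a) / cr_a
p_one_sided = norm.sf(z)            # tests whether B is higher than A
p_two_sided = 2 * norm.sf(abs(z))   # tests whether B differs from A in either direction

print(f"CR A = {cr_a:.2%}, CR B = {cr_b:.2%}, relative uplift = {relative_uplift:.2%}")
print(f"z = {z:.3f}, 1-sided p = {p_one_sided:.4f}, 2-sided p = {p_two_sided:.4f}")
```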

            

FAQ

What is A/B testing?

A/B testing, also known as split testing, is a method of comparing two versions of a webpage, advertisement, or other marketing asset to determine which version performs better. In an A/B test, two versions of a webpage are shown to different segments of website visitors at the same time, with one version being the control (original) and the other being the variation (test). The performance of each version is then compared based on specific metrics, such as conversion rates or click-through rates, to determine which version is more effective.

For example, an e-commerce website might create two versions of a product page, one with a blue “Buy Now” button and the other with a green “Buy Now” button. Visitors to the site are randomly assigned to one of the two versions, and their behavior is tracked. If the version with the green button generates more sales, it can be concluded that the green button is more effective at encouraging visitors to make a purchase.

A/B testing can be used for a wide range of marketing activities, including email campaigns, landing pages, pricing strategies, and product design. It is an effective way to optimize marketing efforts and improve conversion rates.
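As a purely illustrative sketch, independent of any particular testing tool, random assignment for an experiment like the button example above could look like the following; the visitor IDs and the 50/50 weighting are assumptions.

```python
import random

def assign_variant(visitor_id: str, seed: int = 42) -> str:
    """Deterministically assign a visitor to 'A' (control) or 'B' (variation).

    Seeding with the visitor ID keeps the assignment stable across visits
    while still splitting traffic roughly 50/50 overall.
    """
    rng = random.Random(f"{seed}:{visitor_id}")
    return "A" if rng.random() < 0.5 else "B"

# Example: tally assignments for 1,000 simulated visitors
counts = {"A": 0, "B": 0}
for i in range(1000):
    counts[assign_variant(f"visitor-{i}")] += 1
print(counts)  # roughly 500 in each group
```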

What is statistical significance?

Statistical significance is a term used in statistics to indicate whether an observed difference between two groups or a relationship between two variables is likely to be genuine or simply due to chance. In other words, it measures the likelihood that a particular result is not due to random variation.

Statistical significance is typically determined by calculating a p-value, which is the probability of obtaining a result as extreme or more extreme than the observed result, assuming that the null hypothesis is true. The null hypothesis is the assumption that there is no real difference or relationship between the groups or variables being compared.

If the p-value is very low (typically less than 0.05), it is considered statistically significant, meaning that it is unlikely that the observed result is due to chance. If the p-value is higher than 0.05, the result is not considered statistically significant, meaning that it is possible that the observed difference or relationship could be due to chance.

It is important to note that statistical significance does not necessarily mean practical significance or importance. A result may be statistically significant but have little practical importance, or vice versa. Additionally, statistical significance does not prove causation, as correlation does not necessarily imply causation.

How to calculate statistical significance with A/B testing?

To calculate statistical significance with A/B testing, you can use a statistical test such as the two-sample t-test or the z-test. These tests compare the means of two groups (control and variation) and determine whether the difference between them is statistically significant or not.
Here are the general steps for calculating statistical significance with A/B testing:

  1. Choose a statistical test: Depending on the size of your sample and the type of data you are analyzing, you can choose between a two-sample t-test and a z-test.

  2. Define your null and alternative hypotheses: The null hypothesis is that there is no significant difference between the control and variation groups, while the alternative hypothesis is that there is a significant difference.

  3. Collect data: Collect data from both the control and variation groups for a specific period of time.

  4. Calculate the test statistic: Use the chosen statistical test to calculate the test statistic (t-statistic or z-score).

  5. Determine the p-value: Calculate the p-value, which is the probability of observing a result as extreme as the one you obtained, assuming the null hypothesis is true.

  6. Compare the p-value with the significance level α: If the p-value is less than α, reject the null hypothesis and conclude that there is a significant difference between the control and variation groups. Otherwise, fail to reject the null hypothesis; the observed difference could be the result of chance or sampling error.

In general, calculating statistical significance requires basic knowledge of statistics and the use of specialized software tools or calculators, especially for more complex tests or multivariate analyses.
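For example, assuming the statsmodels library is available, its proportions_ztest function runs the z-test described above directly on conversion counts; the counts below are placeholders. Note that this function pools the variance of the two groups by default, so its z-score can differ slightly from the unpooled formula used earlier on this page.

```python
from statsmodels.stats.proportion import proportions_ztest

# Placeholder conversion counts and sample sizes for control (A) and variation (B)
conversions = [201, 206]
visitors = [10_000, 10_000]

# alternative="two-sided" matches a 2-sided hypothesis;
# 1-sided alternatives ("larger" / "smaller") are also available
z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors,
                                    alternative="two-sided")

alpha = 0.05  # significance level = 1 - confidence level (here, 95% confidence)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```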

What is a null hypothesis?

In statistics, the null hypothesis is a statement that assumes there is no significant difference between two groups or variables. It is usually denoted as “H0.”

The null hypothesis is used to test whether a particular result is due to chance or if it is statistically significant. For example, if you want to test whether a new drug is effective, the null hypothesis would be that there is no difference in effectiveness between the new drug and the old one.

To test the null hypothesis, researchers will collect data and perform statistical analysis to determine the probability that the observed results are due to chance. If the probability is very low (typically less than 5%), then researchers reject the null hypothesis and conclude that there is a significant difference between the groups or variables being compared.

It is important to note that rejecting the null hypothesis does not necessarily mean that the alternative hypothesis (i.e., the hypothesis being tested) is true. It simply means that there is sufficient evidence to suggest that the null hypothesis is not true.

What is a p-value?

In statistics, the p-value is a measure of the evidence against a null hypothesis. It represents the probability of obtaining a result as extreme or more extreme than the one observed, assuming that the null hypothesis is true.

More specifically, the p-value is the probability of observing a test statistic (such as a t-statistic or a z-statistic) as extreme or more extreme than the one observed, given the null hypothesis. If the p-value is small (usually less than 0.05 or 0.01), it indicates that the observed result is unlikely to have occurred by chance alone and the null hypothesis can be rejected.

For example, if we are testing whether a new drug is effective, the null hypothesis would be that the drug has no effect. If we conduct a clinical trial and find a small p-value (e.g., 0.03), it suggests that the observed improvement in the treatment group is unlikely to have occurred by chance alone, and we can reject the null hypothesis and conclude that the drug is effective.

It is important to note that the p-value is not the probability that the null hypothesis is true or false, but rather the probability of obtaining the observed result assuming that the null hypothesis is true. Therefore, a small p-value does not necessarily mean that the alternative hypothesis is true, but only that there is strong evidence against the null hypothesis.
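To make this concrete, the short sketch below converts an observed z-statistic into 1-sided and 2-sided p-values using the standard normal distribution, as a z-test would; the z value itself is a placeholder.

```python
from scipy.stats import norm

z_observed = 1.96  # placeholder test statistic from a z-test

# Probability of a result at least as extreme as z_observed,
# assuming the null hypothesis is true
p_one_sided = norm.sf(z_observed)           # upper tail only
p_two_sided = 2 * norm.sf(abs(z_observed))  # both tails

print(f"1-sided p = {p_one_sided:.4f}")  # about 0.025
print(f"2-sided p = {p_two_sided:.4f}")  # about 0.05
```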

What is the statistical power of a test?

The statistical power of a test is the probability of correctly rejecting the null hypothesis when it is false. In other words, it is the probability of detecting a true effect or difference between two groups or variables.

A test with high statistical power is able to detect even small differences or effects, while a test with low statistical power is more likely to miss them.

Statistical power depends on several factors, including the sample size, the significance level, and the effect size. Increasing the sample size or the significance level (α) increases the power of a test, and a larger effect size (i.e., a larger difference between the groups or variables being compared) is also easier to detect.

It is important to consider the statistical power of a test when designing experiments or studies, as low statistical power can lead to false negative results (i.e., failing to detect a true effect) and reduce the overall reliability of the study. Therefore, researchers often conduct power analyses to estimate the required sample size and ensure that their study has adequate statistical power to detect the effects they are interested in.
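As one illustration of such a power analysis, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate the per-group sample size; the baseline rate, expected uplift, significance level, and target power are all placeholder assumptions.

```python
from math import ceil
from scipy.stats import norm

def sample_size_per_group(p1: float, p2: float, alpha: float = 0.05,
                          power: float = 0.80, two_sided: bool = True) -> int:
    """Approximate per-group sample size for a two-proportion z-test."""
    z_alpha = norm.ppf(1 - alpha / 2) if two_sided else norm.ppf(1 - alpha)
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Placeholder assumptions: 2% baseline rate, hoping to detect an uplift to 2.5%,
# at a 95% confidence level with 80% power
print(sample_size_per_group(p1=0.02, p2=0.025))  # roughly 14,000 visitors per group
```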

Discover IdSurvey

The most powerful survey software, loved by professionals.