Statistical significance is based on various parameters. Gleef uses an automatic calculation to determine whether your results are significant. For detailed insight on how this is done, refer to our statistics Q&A.

Continue running the experiment until you achieve statistical significance. Ending the experiment prematurely could undermine the reliability of your results. Remember that the time needed to reach significance depends heavily on the size of the conversion difference between variations. For example, an experiment with a 10% conversion difference between two variations may reach significance much faster than one where the difference is only 1%.

All significance parameters can be customized for your company.

As a reminder, these are the two-sided critical values for common confidence levels:

Confidence Level    Critical Value, 𝑍𝑐
99%                 2.576
95%                 1.96
90%                 1.645
80%                 1.28
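
For readers who want to derive a critical value for a confidence level that is not listed, the sketch below computes two-sided critical values from the standard normal distribution. The function name `critical_z` is illustrative and not part of Gleef; it assumes a two-sided test.

```python
from statistics import NormalDist


def critical_z(confidence_level: float) -> float:
    """Return the two-sided critical value for a confidence level such as 0.95."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    alpha = 1.0 - confidence_level
    # A two-sided test puts alpha / 2 in each tail of the standard normal curve.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.99, 0.95, 0.90, 0.80):
    print(f"{level:.0%}  {critical_z(level):.3f}")
```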

The console displays the critical Z value and the p-value, if needed.
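
To illustrate where a Z value and a p-value can come from, here is a minimal sketch of a pooled two-proportion z-test. This is a common textbook method and an assumption on our part for illustration; it is not a description of the exact calculation Gleef runs. The function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, visitors_a: int,
                          conv_b: int, visitors_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) comparing variation B against baseline A."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # Pooled conversion rate under the hypothesis that A and B perform the same.
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        raise ValueError("cannot test when the pooled rate is exactly 0 or 1")
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: baseline 250/5,000 conversions, variation 300/5,000.
z, p = two_proportion_z_test(250, 5000, 300, 5000)
print(f"Z = {z:.2f}, p-value = {p:.3f}")
```

In this sketch, a result is significant at the 95% level when the absolute Z value exceeds 1.96, which is equivalent to a p-value below 0.05.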

The following default values are recommended for a fully significant experiment:

  • A confidence level of at least 95% (0.95)
  • At least 250 conversions per variation
  • At least 5,000 unique visitors per variation
  • At least 14 days of experimentation (two business cycles)
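
If you track these figures in your own tooling, the recommended defaults can be expressed as a simple check. The sketch below is illustrative only: `VariationStats` and `meets_recommended_defaults` are hypothetical names, not part of Gleef, and the thresholds are the defaults listed above.

```python
from dataclasses import dataclass


@dataclass
class VariationStats:
    conversions: int
    unique_visitors: int


def meets_recommended_defaults(confidence: float,
                               variations: list[VariationStats],
                               days_running: int,
                               *,
                               min_confidence: float = 0.95,
                               min_conversions: int = 250,
                               min_visitors: int = 5000,
                               min_days: int = 14) -> bool:
    """Return True only when every recommended default is satisfied."""
    return (
        confidence >= min_confidence
        and days_running >= min_days
        # The volume thresholds apply to each variation, not to the total.
        and all(v.conversions >= min_conversions
                and v.unique_visitors >= min_visitors
                for v in variations)
    )


stats = [VariationStats(260, 5200), VariationStats(310, 5150)]
print(meets_recommended_defaults(0.96, stats, days_running=14))
```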

Conversion & growth

At any time, you can access and monitor the results of your experiment, observing the conversion rates of each variation, and the improvement compared to the baseline, which represents the original wording being tested.

Hover over the conversion or growth metrics for detailed figures.
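
As a rough illustration of how these two metrics relate, the sketch below computes a conversion rate and the improvement of a variation relative to the baseline. It assumes the improvement is expressed as a relative change; the exact definition used in the dashboard may differ, and the function names are hypothetical.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who triggered the success event."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def relative_improvement(baseline_rate: float, variation_rate: float) -> float:
    """Change of the variation relative to the baseline (0.2 means +20%)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (variation_rate - baseline_rate) / baseline_rate


# Hypothetical counts: baseline 250/5,000, variation 300/5,000.
baseline = conversion_rate(250, 5000)
variation = conversion_rate(300, 5000)
print(f"baseline {baseline:.1%}, variation {variation:.1%}, "
      f"improvement {relative_improvement(baseline, variation):+.1%}")
```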

Common Questions About Statistics

What do I need to consider for a successful test?

  1. Traffic Volume:

    • High-traffic websites can quickly achieve the necessary sample sizes.
    • Low-traffic websites may need to run tests for extended periods to gather sufficient data.
  2. Test Duration:

    • Ensure the test runs long enough to collect the necessary sample size for accurate results.
    • Account for seasonal or time-based variations in traffic and user behavior.
  3. Multiple Variants:

    • If testing more than two variants, the sample size calculation is updated to maintain statistical validity.

Do I have enough traffic on my website to run A/B tests?

As a reference point, you’ll need approximately 7,448 visitors per variant to detect a 1% difference with 95% confidence and 80% power, assuming a 5% conversion rate.

Any website can run A/B tests; the key factor is how much time you allow the test to run without making changes to your website. It’s essential to ensure that your website has sufficient traffic to yield reliable and statistically significant results.

Key Considerations:

  1. Sample Size:

    • The number of visitors required to detect a meaningful difference between variants.
    • Calculated using statistical formulas based on your desired confidence level and effect size.
  2. Effect Size:

    • The minimum detectable change in conversion rate or other metrics.
    • Smaller effect sizes necessitate larger sample sizes.
  3. Confidence Level:

    • The probability that the test results are not due to random chance.
    • Commonly used confidence levels are 95% and 99%.

Statistical Formula for Sample Size

The sample size n per variant can be calculated using the following formula for two proportions:

n = 2 * (Z_alpha/2 + Z_beta)^2 * p * (1 - p) / delta^2

Where:

  • Z_alpha/2 is the Z-value for the desired confidence level (e.g., 1.96 for 95% confidence).
  • Z_beta is the Z-value for the desired power (e.g., 0.84 for 80% power).
  • p is the estimated overall conversion rate.
  • delta is the minimum detectable effect size (difference between conversion rates of the control and variant).

With an estimated conversion rate of 5% (p = 0.05), detecting a 1% difference (delta = 0.01) with 95% confidence and 80% power requires approximately 7,448 visitors per variant. Under the same settings, detecting a 10% difference (delta = 0.10) requires approximately 75 visitors per variant.
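
The formula can be checked with a few lines of Python. This is a minimal sketch: the function name `sample_size_per_variant` is hypothetical, and the defaults use the rounded Z-values quoted above (1.96 and 0.84), which is what reproduces the figures of 7,448 and 75 at a 5% conversion rate. Unrounded Z-values give a slightly larger result.

```python
import math


def sample_size_per_variant(p: float, delta: float,
                            z_alpha_2: float = 1.96,
                            z_beta: float = 0.84) -> int:
    """Visitors needed per variant: n = 2 * (Z_alpha/2 + Z_beta)^2 * p * (1 - p) / delta^2."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if delta <= 0:
        raise ValueError("delta must be positive")
    n = 2 * (z_alpha_2 + z_beta) ** 2 * p * (1 - p) / delta ** 2
    # Round away floating-point noise before rounding up to whole visitors.
    return math.ceil(round(n, 6))


print(sample_size_per_variant(p=0.05, delta=0.01))
print(sample_size_per_variant(p=0.05, delta=0.10))
```

Because delta is squared in the denominator, halving the minimum detectable difference roughly quadruples the required sample size.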

What are some common examples of traffic and A/B testing statistics?

  1. High-traffic website:

    • Daily visitors: 50,000
    • Estimated conversion rate: 5%
    • Desired effect size: 1%
    • Sample size per variant: 7,448

    Conclusion: This test can be completed in a few days.

  2. Low-traffic website:

    • Daily visitors: 500
    • Estimated conversion rate: 5%
    • Desired effect size: 10%
    • Sample size per variant: 75

    Conclusion: This test can be completed within hours.

By ensuring you have enough traffic and using the correct calculations, you can run A/B tests that provide reliable and actionable insights.

How are variations split among visitors?

Currently, variations are equally distributed among visitors. For example, if you have three different variations in an experiment (including the baseline), one out of every three visitors will randomly see each variation.
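
To make the idea of an equal split concrete, here is a sketch of one common way to distribute visitors evenly: hashing a visitor identifier and taking the result modulo the number of variations. This is an illustrative technique, not a description of Gleef's actual assignment mechanism, and `assign_variation` and the visitor IDs are hypothetical.

```python
import hashlib
from collections import Counter


def assign_variation(visitor_id: str, variations: list[str]) -> str:
    """Map a visitor to one variation, with equal odds for each variation."""
    if not variations:
        raise ValueError("at least one variation is required")
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    # The hash behaves like a uniform random number that is stable per visitor.
    index = int.from_bytes(digest[:8], "big") % len(variations)
    return variations[index]


variations = ["baseline", "variation A", "variation B"]
counts = Counter(assign_variation(f"visitor-{i}", variations) for i in range(30000))
print(counts)
```

A hash-based split has the side effect that a returning visitor keeps seeing the same variation, as long as the identifier stays the same.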

How can I change the significance thresholds for my company?

You can contact the team at support@gleef.eu.

How can I segment the traffic and determine who sees an experiment?

Currently, we do not provide built-in segmentation capabilities. However, you can conduct your own segmentation by directing visitors to different URLs and implementing experiments on each specific URL.

Can I remove the default text?

To conduct effective A/B testing, you should keep the text you’re experimenting with as the baseline. This serves as the reference point for all other variations tested.

What kinds of success events can be set up?

We provide extensive information on success events on this page: Success events

The type of success event depends on the experiment you’re running:

  1. When running an experiment on a call to action (CTA), you can define the success event as either:

    1. A click on the specific item.
    2. A visit to a specific URL (a page view later in the funnel).
  2. When running an experiment on regular, non-clickable text, you can define the success event as a visit to a specific URL (a page view later in the funnel).

Retrieving Data from Experiments

We strive to provide all the essential information directly on your dashboard.

How do I interpret the results of an A/B experiment?

Compare the performance metrics (e.g., conversion rates) of each variation against the control to determine the most effective copy. Gleef highlights the best-performing variation directly below the ‘baseline’, which represents the text you selected for the experiment.

Does Gleef provide detailed statistical information on experiments?

We now display the p-value and Z-value for all experiments, as well as the data points we use to compute significance.
We aim to deliver all necessary statistical data to provide a comprehensive overview of your experiments. Our intention is to keep Gleef smooth and easy to operate by avoiding information overload. However, if you need more detailed statistics, please contact us at support@gleef.eu, and we will be happy to provide additional information.

Is it possible to download full experiment data?

While we currently do not offer a built-in option for downloading complete experiment data, we can send you a comprehensive data export file with all necessary details upon request. Simply email us at support@gleef.eu.

How do I apply the best performing wording to my website?

Gleef does not currently offer the capability to directly implement changes on your website. Therefore, you will need to follow your internal processes to submit the wording changes based on the best-performing variation.
