When to use multivariate experiments


Are you wondering when you should use multivariate testing (also known as factorial experiments) rather than standard AB tests? Or maybe you want to hear more about the main advantages of multivariate testing? Either way, you are in the right place! In this article, we cover what you need to know to decide whether or not to use multivariate experiments.

We start out by talking about how multivariate experiments are designed. After that, we take some time to discuss how multivariate experiments are analyzed. Next, we talk about some of the main advantages and disadvantages of multivariate experiments. We follow this up with a discussion of situations where it is a good idea to run multivariate experiments. Finally, we discuss situations when you should not run multivariate experiments.

How to design multivariate experiments

How do you design a multivariate experiment? In general, multivariate experiments are used when there are multiple different factors (or facets of an experience) that you want to modify in the same experiment. Each of these factors generally has multiple levels, or multiple individual treatments that could be applied to them.

For example, imagine you were running a multivariate experiment on a signup button that users click to sign up for a service. You might try changing the color of the signup button, the size of the signup button, the position of the signup button on the page, and the text displayed on the signup button. Each of these items represents a different factor in your experiment.

For each factor, there can be multiple individual treatments that can be applied. Let’s take the color of the signup button as an example. You might want to test a red button, a blue button, and a green button. Each of these colors represents one level of the button color factor.

When it comes time to run the experiment, many different treatments are created by choosing one level from each factor and combining them to create one unique experience. The exact details of how the subjects are randomized into treatments will vary from one application to the next.

Diagram of how factor levels combine to create treatments in factorial experiments.
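
To make the way levels combine into treatments concrete, here is a minimal Python sketch. The specific levels for size and text, and the helper name `build_treatments`, are assumptions we made up for illustration. Only the three button colors come from the example above.

```python
from itertools import product

# Hypothetical factors and levels, loosely based on the signup button example.
# Only the color levels come from the article; the rest are illustrative.
factors = {
    "color": ["red", "blue", "green"],
    "size": ["small", "large"],
    "text": ["Sign up", "Join now"],
}


def build_treatments(factors):
    """Return every combination of one level per factor, each as a dict."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


treatments = build_treatments(factors)
print(len(treatments))  # 3 * 2 * 2 = 12 treatments
print(treatments[0])    # the first combination of levels
```

Each dictionary in `treatments` describes one unique experience that a subject could be shown.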

In some cases, subjects will be equally distributed across all possible treatments for the duration of the experiment. This is often called a full factorial design. Full factorial designs are simple and easy to analyze, but they require very large sample sizes, and in many cases those sample sizes are prohibitively large.
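
A quick back-of-the-envelope calculation shows why the sample size grows so fast. The sketch below is only illustrative: the number of levels for each factor and the figure of 1,000 subjects per treatment are hypothetical, and in practice the per-treatment number would come from a power calculation.

```python
import math


def full_factorial_sample_size(levels_per_factor, subjects_per_treatment):
    """Return (number of treatments, total subjects) for a full factorial design."""
    n_treatments = math.prod(levels_per_factor)
    return n_treatments, n_treatments * subjects_per_treatment


# Hypothetical setup: color (3 levels), size (2), position (2), text (2),
# with an assumed 1,000 subjects needed in each treatment.
n_treatments, total_subjects = full_factorial_sample_size([3, 2, 2, 2], 1000)
print(n_treatments)    # 24
print(total_subjects)  # 24000
```

Adding one more level to any factor multiplies the number of treatments, which is why full factorial designs become expensive quickly.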

In other cases, the experiment will be spread out over multiple sequential rounds of testing. In these situations, it is common to start out by selecting a subset of treatments and only randomizing subjects into those treatments for the first round of testing. Once that round is over, you can look at which treatments performed well and carry those treatments, or other treatments with similar properties, into the next round of testing. This process often continues over multiple rounds to ensure that the treatments that look most promising get an adequate sample size.
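
As a rough illustration of the selection step between rounds, the sketch below keeps the treatments with the highest observed conversion rate. This is a deliberately naive rule that we are assuming for the example. It ignores statistical uncertainty and does not add new treatments with similar properties, both of which a real implementation would likely handle. The treatment names and counts are hypothetical.

```python
def select_for_next_round(results, keep):
    """Pick the `keep` treatments with the highest observed conversion rate.

    results maps a treatment name to a (conversions, subjects) tuple.
    """
    rates = {name: conv / n for name, (conv, n) in results.items() if n > 0}
    return sorted(rates, key=rates.get, reverse=True)[:keep]


# Hypothetical first-round results: treatment -> (conversions, subjects).
round_one = {
    "red/large": (60, 500),
    "blue/large": (45, 500),
    "green/small": (30, 500),
    "red/small": (52, 500),
}

print(select_for_next_round(round_one, keep=2))  # ['red/large', 'red/small']
```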

How to analyze multivariate experiments

How do you analyze multivariate experiments? Just as subjects are randomized into treatment groups at the individual level, the results of multivariate experiments are also analyzed at the subject level, with each subject contributing a unique observation to the result pool.

The simplest way to analyze the results of a multivariate experiment is to treat each unique combination of levels as a totally unique treatment. If you ignore the commonality between combinations that share characteristics, the results can be analyzed in the same way that you would analyze a simple AB test that has more than two treatments.

That being said, this simple method has low power and will require very large sample sizes to reach statistical significance. Instead, most people consider the commonality between different combinations that have shared characteristics and pool information on treatments that have shared characteristics to help increase the power of any statistical analyses that are run.
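
One simple way to pool information is to combine all treatments that share a level of a given factor and compare the pooled conversion rates. The sketch below shows this under stated assumptions: the counts are hypothetical, `pooled_rates` is a name we made up, and pooling like this is most reasonable when interactions between factors are small.

```python
from collections import defaultdict

# Hypothetical results: (color, size) -> (conversions, subjects).
cell_results = {
    ("red", "small"): (50, 1000),
    ("red", "large"): (70, 1000),
    ("blue", "small"): (40, 1000),
    ("blue", "large"): (60, 1000),
}


def pooled_rates(cell_results, factor_index):
    """Pool treatments that share a level of one factor and return its rates."""
    totals = defaultdict(lambda: [0, 0])  # level -> [conversions, subjects]
    for levels, (conversions, subjects) in cell_results.items():
        totals[levels[factor_index]][0] += conversions
        totals[levels[factor_index]][1] += subjects
    return {level: conv / n for level, (conv, n) in totals.items()}


color_rates = pooled_rates(cell_results, 0)  # red pools 2,000 subjects, not 1,000
size_rates = pooled_rates(cell_results, 1)
print(color_rates)
print(size_rates)
```

Each pooled estimate is based on twice as many subjects as any single treatment in this example, which is where the extra power comes from.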

Advantages and disadvantages of multivariate experiments

What are some of the main advantages and disadvantages of multivariate experiments? In this section, we discuss both. This will provide more context to inform our discussion of when multivariate experiments should and should not be used.

Advantages of multivariate experiments

What are some of the main advantages of multivariate experiments? Here are some of the main advantages of multivariate experiments.

  • Iterate faster. One of the main advantages of multivariate experiments is that they enable users to iterate faster by testing multiple changes at once rather than testing multiple different changes one after the other. This enables users to optimize their products and processes faster and see the full value of the changes they are making sooner.
  • Understand interactions between different changes. Another advantage of multivariate experiments is that when you test multiple different changes in a single experiment, it is easier to understand the interactions between those changes. Interactions happen when there is a special synergy between specific levels of two or more of the factors that are being tested. Due to this synergy, one combination of levels (or one multivariate treatment) might perform much better than you would have expected if you had run a simple AB test on each factor independently.
  • Commonly used. Another advantage of multivariate experiments is that they are fairly common. They may not be as common as simple AB tests, but they are broadly used across multiple industries. That means that your stakeholders are more likely to be comfortable with them and technical colleagues are more likely to be able to provide meaningful feedback.
  • Not easily disrupted by anomalous events. Another advantage of multivariate experiments is that they are not as likely to be impacted by anomalous events that happen during small intervals of time. This is because individual subjects will be randomly allocated into many different treatment groups during the anomalous event, so each treatment should be impacted about equally. This is not to say that these types of events cannot bias the results of multivariate experiments, but they do not have as large an impact on multivariate experiments as they do on designs like staggered experiments and switchback experiments, where the randomization scheme is heavily dependent on time.
  • Not complicated to run simultaneous experiments. Another advantage of multivariate experiments is that it is fairly straightforward to run simultaneous experiments on the same group of users, especially when the treatments shown in one are compatible with the treatments shown in the other (e.g., one experiment does not make changes to a module that another experiment aims to remove). This is because individual subjects are randomized into treatment groups in a way that is not confounded by any other element like time, which means that each treatment should be equally affected by the presence of the other experiment. That being said, there may still be some cases where interactions between the two experiments introduce bias.
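
To make the idea of an interaction from the list above concrete, here is a minimal sketch for two factors with two levels each. The conversion rates are hypothetical, and `interaction_contrast` is a name we introduce for illustration. It measures how much the effect of one change differs depending on whether the other change is present.

```python
def interaction_contrast(rates):
    """Difference-in-differences for a 2x2 design.

    rates maps (factor_a_level, factor_b_level) to a conversion rate,
    where each level is 0 (original) or 1 (changed).
    """
    effect_of_b_with_a = rates[(1, 1)] - rates[(1, 0)]
    effect_of_b_without_a = rates[(0, 1)] - rates[(0, 0)]
    return effect_of_b_with_a - effect_of_b_without_a


# Hypothetical rates: factor A is button color, factor B is button text.
rates = {
    (0, 0): 0.050,  # original color, original text
    (0, 1): 0.055,  # original color, new text
    (1, 0): 0.060,  # new color, original text
    (1, 1): 0.080,  # new color, new text
}

# A value near zero would suggest the two changes act independently.
print(round(interaction_contrast(rates), 3))
```

In this made-up example, the new text helps more when it is paired with the new color, which is something two separate AB tests on the original page would not reveal.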

Disadvantages of multivariate experiments

What are some disadvantages of multivariate experiments? Here are some disadvantages of multivariate experiments. Specifically, these are disadvantages that apply to the most common and straightforward implementations of multivariate experiments.

  • Not straightforward to handle situations where there are complex dependencies between subjects. One disadvantage of multivariate experiments is that it is not straightforward to handle situations where there are complex dependencies between subjects. This is especially true in cases where the behaviors of subjects that are allocated into one treatment have the potential to impact the behaviors of subjects that are allocated into another treatment group.
  • Requires a larger sample size. Another disadvantage of multivariate experiments is that they generally require a larger sample size than other methods like simple AB tests. The main reason for this is that there are many different treatments that are compared in multivariate tests.
  • More ambiguity in how to analyze results. Another disadvantage of multivariate experiments is that there are multiple choices that need to be made in order to determine how the results of your experiment should be analyzed. This means that more time and effort needs to be invested to determine what methods will be used to analyze the experiments.
  • Vulnerable to issues from multiple testing. Since there are so many treatments being compared to one another, multivariate experiments can be vulnerable to issues related to multiple testing. There are ways to adjust for this, but it is another consideration that needs to be taken into account when designing multivariate experiments.
  • Some designs cannot be fully specified ahead of time. Another possible disadvantage of multivariate experiments is that there are many designs where the full design of the experiment (e.g., the exact details of what proportion of subjects will see each treatment at each time) cannot be specified ahead of time. This is because common implementations of multivariate experiments often operate in successive rounds where the distribution of treatments shown in one round is influenced by the results of the previous round. This can make it hard to spot issues in the experimental design or plan when to run other experiments that are incompatible with a few specific treatments in the current experiment. This is not true of all multivariate designs, but it is true of many common implementations.
  • Some designs can be biased when treatments have long-lasting effects. Since multivariate experiments often operate in successive rounds of testing where subjects have the potential to be exposed to different treatments, these experiments can be biased in situations where treatments have long-lasting effects on users. This is because a user might still be under the influence of the treatment that they saw in the first round of testing when they see a different treatment in the second round of testing. Again, this is a disadvantage that is not characteristic of all multivariate designs, but is characteristic of some of the most common designs.
  • User experience can be inconsistent across many rounds of experimentation. Similarly, multivariate experiments have the potential to cause an inconsistent user experience for users who are randomized into different treatments during different rounds of testing.
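
To give a sense of scale for the multiple testing issue mentioned above, the sketch below counts the pairwise comparisons among treatments and applies a Bonferroni correction. Bonferroni is just one well-known adjustment that we picked for illustration, and it tends to be conservative. The 12 treatments and the significance level of 0.05 are assumptions.

```python
import math


def pairwise_comparisons(n_treatments):
    """Number of distinct pairs of treatments that could be compared."""
    return math.comb(n_treatments, 2)


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni correction."""
    return alpha / n_comparisons


n_comparisons = pairwise_comparisons(12)  # hypothetical 12 treatments
print(n_comparisons)  # 66
threshold = bonferroni_threshold(0.05, n_comparisons)
print(threshold < 0.001)  # True: each comparison needs a very small p-value
```

The stricter threshold is part of the reason multivariate experiments need larger sample sizes than a simple AB test.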

When to use multivariate experiments

When should you use multivariate experiments? Here are some examples of situations where you should generally reach for multivariate experiments.

  • When you have multiple related changes to the same process or module that you want to test. Multivariate experiments are a great option when you have multiple changes that you want to test to the same module, component, or process. Multivariate experiments allow you to test all of these changes at once and get results faster than you would have if you tested the changes one at a time.

When not to use multivariate experiments

When should you avoid using multivariate experiments? Here are some examples of situations where you may be better off avoiding multivariate experiments.

  • When there are complex dependencies or network effects between subjects. Most common implementations of multivariate experiments operate on the assumption that subjects in your experiment are relatively independent of one another. If this is not true because there are complex dependencies between subjects, then you might be better off looking into other experiment designs. For example, if you are running an experiment in a two-sided marketplace or a setting with heavy network effects, then you should look into experimental designs that are better suited to these kinds of situations. Staggered experiments and switchback experiments are two examples of experimental designs that were created with these situations in mind.
  • When you have a very small sample size. If you are operating in a situation where sample size is limited, you might be better off using a simple AB test or looking into experiment designs that are specifically designed for situations where your sample size is small. You may need to simplify your experiment and reduce the number of treatments you will test in these scenarios.
