When to use interrupted time series


Are you wondering whether you should use interrupted time series for your next data science project? Or maybe you want to learn more about the advantages and disadvantages of interrupted time series analyses? Either way, you are in the right place. In this article, we cover what you need to know to decide when an interrupted time series analysis is appropriate.

We will start by discussing how to design and conduct interrupted time series analyses. Next, we cover some of the main advantages and disadvantages of the method, which provides context for when you should and should not use it. We then discuss situations where interrupted time series analyses are appropriate and, finally, situations where you should avoid them.

How to design interrupted time series analyses

We will start by describing how interrupted time series analyses are designed. At the most basic level, interrupted time series methods are used when you have time series data that shows what your primary outcome looked like before and after you launched a treatment to your population(s). Data from the periods before and after the treatment is then compared to help assess whether the treatment had an impact.

There are a few different setups that are commonly used in an interrupted time series analysis, so we will discuss how each of them is set up. These setups differ on two main factors. The first is whether a control is available at all. The second is whether the underlying data was generated via a randomized process where the experimenter decided which subjects would get which treatment or via a natural process where subjects self-selected into treatment groups. This second factor is only relevant to situations where there is a separate control population available.

All of these setups use approximately the same statistical methods to analyze the data: time series methods are used to forecast what the primary outcome for the treatment group would have looked like after the treatment date if no treatment had been applied. That being said, there are large differences in the strength of the assumptions that need to be made. Some of these setups require fairly strong assumptions about the processes that generate the underlying data, whereas others rely on a smaller set of assumptions that are easier to defend.

Interrupted time series with and without controls

As we mentioned before, there are two different setups that can be used for interrupted time series. One of these setups makes use of a control group that is not impacted by the treatment. The other setup does not use any control group at all. In this section, we will outline the differences between these setups.

Interrupted time series analysis without a control

The first option you have is to run your analysis without any control group. In this setup, the only data that you have to work with is a time series of your primary outcome in your treatment group. When you are in this situation, you can simply use time series methods to forecast what the primary outcome for your treatment group would look like after the treatment date based on the historical data before the treatment date. After that, you can compare the forecast that you created to the actual trends that were observed for the primary outcome in the treatment group.

The main advantage of this setup is that it can be leveraged even in scenarios where you do not have any type of control data. The main disadvantage of this method is that there is no way to control for other natural or artificial variations that occurred after the treatment date that were not related to your treatment. This makes it hard to have full confidence that any change that you detect between the forecasted time series and the actual time series is a result of your treatment and not some other source of variation. That means that you need to make very strong assumptions relating to your treatment being the only possible source of variation in order to attribute any credit for the changes seen to your treatment.

Interrupted time series without a control.
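
The forecast-and-compare idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended model: it assumes the pre-treatment series follows a straight-line trend, and the function names, the data, and the treatment date are all hypothetical. A real analysis would usually use a richer time series model that handles seasonality and reports uncertainty.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def estimate_effect_without_control(outcome, treatment_index):
    """Forecast the post-treatment period from the pre-treatment trend and
    return the forecast plus the mean gap between observed and forecast."""
    pre = outcome[:treatment_index]
    post = outcome[treatment_index:]
    # Assumption: a linear trend fitted on the pre-period keeps holding.
    slope, intercept = fit_line(list(range(len(pre))), pre)
    forecast = [intercept + slope * t
                for t in range(treatment_index, len(outcome))]
    gaps = [obs - pred for obs, pred in zip(post, forecast)]
    return forecast, fmean(gaps)


# Hypothetical daily metric: grows by 2 per day, then jumps by 10 from
# day 10 onward (the assumed treatment date).
outcome = ([100 + 2 * t for t in range(10)]
           + [100 + 2 * t + 10 for t in range(10, 15)])
forecast, mean_effect = estimate_effect_without_control(outcome, treatment_index=10)
print(round(mean_effect, 2))
```

In this toy data the estimated gap matches the jump that was built in, but only because nothing else changed after the treatment date. Any unrelated shock in the post-treatment period would be absorbed into the same estimate.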

Interrupted time series analysis with a control

If you want to run a more rigorous analysis and have a stronger case for attributing causality to your treatment, then you are best off running an interrupted time series analysis with a control. In order to run an interrupted time series with a control, you need to choose a control time series that would not be affected by your treatment but would capture any natural or artificial variation that is not related to your treatment. You can then use the post-treatment values of the control time series as an input to the forecasting model that you use to forecast the post-treatment values of the primary outcome for your treatment group. This helps to incorporate any variation that is not related to your treatment into your forecast.

Depending on the setup of your experiment, there are a few different types of time series that you can use as your control time series. If you have a control population that did not receive the treatment, then it is a common choice to measure the primary outcome on the control population as well and use that as the control time series. If you do not have a separate control population, it is often possible to choose a different outcome that is measured on the treatment population but would not be impacted by the treatment itself as a control time series.

Interrupted time series with a control.
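
To make the role of the control series concrete, here is a minimal Python sketch. It assumes the treated series is a linear function of the control series before the treatment, learns that relationship on the pre-treatment period, and then uses the observed post-treatment control values to build the forecast. The function names and the data are hypothetical, and a single-covariate linear fit stands in for whatever covariate-aware time series model you would actually use.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def estimate_effect_with_control(treated, control, treatment_index):
    """Predict the treated series from the control series and return the
    counterfactual plus the mean gap between observed and predicted."""
    # Learn the pre-treatment relationship: treated ~ intercept + slope * control.
    slope, intercept = fit_line(control[:treatment_index],
                                treated[:treatment_index])
    # Feed in post-treatment control values so that shocks shared by both
    # series are carried into the forecast.
    counterfactual = [intercept + slope * c for c in control[treatment_index:]]
    gaps = [obs - cf
            for obs, cf in zip(treated[treatment_index:], counterfactual)]
    return counterfactual, fmean(gaps)


# Hypothetical data: an unrelated shock pulls both series down from index 8
# onward, while the treatment adds 12 to the treated series.
control = [50, 52, 51, 53, 55, 54, 56, 58, 40, 41, 42]
treated = ([2 * c + 5 for c in control[:8]]
           + [2 * c + 5 + 12 for c in control[8:]])
counterfactual, mean_effect = estimate_effect_with_control(
    treated, control, treatment_index=8)
print(round(mean_effect, 2))
```

Because the shared dip shows up in the control series, it is reflected in the forecast and is not mistaken for a treatment effect. A forecast built only from the treated series' own history would have no way to separate the two.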

Natural interrupted time series and randomized interrupted time series

Another factor to consider in your interrupted time series analysis is whether you are running a natural interrupted time series analysis or a randomized interrupted time series analysis. A natural interrupted time series analysis is an analysis where you have observational data from a naturally occurring treatment that you do not have control over. In this situation, you are not able to decide which subjects are exposed to treatment. In a randomized interrupted time series, you do have control over who is exposed to the treatment and are able to set up a proper randomized experiment.

In both of these situations, we are assuming that you are measuring outcomes on a control population and a treatment population. If you are using a different setup where you do not have a control population, then this section is not relevant to you.

Randomized interrupted time series

If you are running a randomized interrupted time series analysis, then you are in a favorable situation where you do not have to make as strong assumptions in order to attribute causal impact to your treatment. This is because you can design a randomized experiment where subjects (or groups of subjects) are randomly allocated between the control and treatment group.

This can give you confidence that there are not fundamental differences between the demographic characteristics of your control and treatment groups that could confound your results. This makes your analysis much more straightforward because you do not have to resample your data or otherwise make adjustments to your data in order to ensure that the control time series and the treatment time series are measured on populations that are similar in nature.
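
As a rough illustration of random allocation, the sketch below shuffles a list of units and splits it into a treatment group and a control group. The units here are hypothetical page IDs, and the function name `randomize_units` and the fixed seed are assumptions made for the example. The seed only serves to make the split reproducible.

```python
import random


def randomize_units(unit_ids, seed=0):
    """Randomly split units (subjects or groups of subjects) into two arms."""
    rng = random.Random(seed)  # local generator so the split is reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical units: ten pages that can each show only one version.
pages = [f"page_{i}" for i in range(10)]
groups = randomize_units(pages, seed=42)
print(groups["treatment"])
print(groups["control"])
```

The outcome would then be aggregated into one time series per group, with the control group's series serving as the control time series.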

Natural interrupted time series

If you are running a natural interrupted time series analysis where you are studying a naturally occurring treatment that you did not impose on the population, then you need to take more care. Specifically, you need to take care to ensure that the control population is comparable to the treatment population. You need to ensure that there are no confounding variables that affect both the treatment group that a subject is allocated into and the outcome that is measured on that subject.

When you are operating in this scenario, you may need to resample or adjust your dataset to create samples that are more similar in nature. Any statements that attribute causality to your treatment will rely strongly on the assumption that your control sample and treatment sample are balanced over any confounding variables.
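
One simple way to check whether two samples look balanced on a suspected confounder is the standardized mean difference. The sketch below is an illustration only: the covariate (pre-treatment weekly visits per page) and its values are hypothetical, and a balance check on measured covariates says nothing about confounders you did not measure.

```python
from statistics import fmean, variance


def standardized_mean_difference(treated_values, control_values):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = ((variance(treated_values) + variance(control_values)) / 2) ** 0.5
    return (fmean(treated_values) - fmean(control_values)) / pooled_sd


# Hypothetical covariate: average weekly visits per page before the treatment.
treated_visits = [10, 12, 14, 16]
control_visits = [9, 11, 13, 15]
smd = standardized_mean_difference(treated_visits, control_visits)
print(round(smd, 3))
```

A commonly cited rule of thumb treats absolute values above roughly 0.1 as a sign of imbalance, in which case resampling or adjustment may be warranted before comparing the two time series.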

How to analyze interrupted time series data

When you are analyzing data from an interrupted time series design, there are a variety of different methods that you can use. You just need to ensure that you are using a valid time series model that fits the constraints of your dataset. For example, if you are using a control time series then you must use a time series model that accepts covariates. Google’s CausalImpact package for R is a common choice for those who are analyzing interrupted time series data.

Advantages and disadvantages of interrupted time series

What are some of the main advantages and disadvantages of using interrupted time series methods as opposed to other causal inference and experimentation methods? In this section, we will discuss some of the main advantages and disadvantages of interrupted time series.

Advantages of interrupted time series

What are the main advantages of interrupted time series analyses? Here are some of the main advantages of interrupted time series analyses.

  • Do not need a proper control time series. One of the main advantages of interrupted time series analyses is that you do not technically need a proper control time series to compare your treatment time series to. While conclusions that are made without a control time series may be flimsy and rely on strong assumptions, there are not many other methods that enable you to do any kind of rigorous analysis in this situation.
  • Control time series can be measured on the same population as treatment time series. Another large advantage of interrupted time series methods is that they enable you to use outcomes that were measured on the population that received the treatment as a control. This means that you can run an analysis with some sort of control even in a situation where your entire population was exposed to a treatment at the same time. Few other methods can accommodate this.
  • Can be conducted on a smaller sample size. Another advantage of interrupted time series analyses is that they do not require as large a sample size as other options. This is because the data is analyzed in aggregate rather than being analyzed at the individual level. That being said, your analysis will still be more robust if you use a larger sample size.
  • Can be used without randomization. Another advantage of interrupted time series methods is that they can be used to attribute causation even in situations where you were not able to randomize individuals into treatment groups, provided that you are willing to make the stronger assumptions that this requires.
  • Produce compelling visual results. Another advantage of interrupted time series analyses is that they can produce compelling visual results. Plots that show the actual post-treatment time series alongside the predicted post-treatment time series make intuitive sense even for those who do not have a strong data background and can help to demonstrate the impact of a change. This can make it easier to get buy-in from non-technical stakeholders.

Disadvantages of interrupted time series

And what are some of the main disadvantages of interrupted time series analyses? Here are some of the main disadvantages of interrupted time series analyses.

  • Need regular time series data. One of the main disadvantages of interrupted time series analyses is that they require you to have access to time series data. If you have data that is already in time series format or can be aggregated into time series format, then this is no problem. That being said, this is not always the case.
  • Sometimes rely on strong assumptions. Another disadvantage of interrupted time series analyses is that they can rely on strong assumptions that are not always easy to validate. This is true of conclusions that are made without a randomized control population and especially of conclusions that are made without a control time series at all.
  • Easily disrupted by anomalous events. Another disadvantage of interrupted time series analyses is that they are easily disrupted by anomalous events that happen at a specific point in time. This is especially true if the event happens during the post-treatment time period. If a large, unexpected change happens during this period and your analysis was not designed to be robust to it, then it can be difficult to analyze the results.
  • Not necessarily straightforward to design. Another disadvantage of interrupted time series methods is that they are not necessarily straightforward to design. There are a few different paradigms that can be used and a lot of design decisions that need to be made. This means that you need to put a lot of focus and energy into ensuring that appropriate design decisions are made.
  • Not necessarily straightforward to analyze. Similarly, interrupted time series data is not always easy to analyze. This is particularly true if you are analyzing data from natural experiments and you suspect that there may be confounding variables that affect both the likelihood of treatment and the primary outcome variable.
  • Not well understood in all circles. Finally, interrupted time series methods are not well understood in all circles. This means that you may have to cast a wider net in order to find someone who can give you informed feedback on your analysis.

When to use interrupted time series

When should you use an interrupted time series analysis? In this section, we will discuss examples of situations where interrupted time series methods are a good choice.

  • When your treatment was rolled out to the entire population. The situation that stands out most as a candidate for interrupted time series analysis is when your treatment was rolled out to the entire population at once. There are not many other causal inference methods that can accommodate this type of situation, where there is no control population to compare your treatment population to. Interrupted time series analyses allow you to use other outcomes that were measured on the treatment population as a control, which lets you adjust for variation that is not related to your treatment.
  • When your sample size is artificially low because you cannot randomize at the individual level. Sometimes people find themselves in situations where there are many individuals that they can take observations on, but the effective sample size for the experiment is small because randomization cannot occur at the individual level. For example, if you are optimizing a website for search engines that can only index one version of a web page at a time, you cannot randomize at the individual user level. You can only randomize at the page level. While there will be many individual users included in your experiment, your effective sample size depends on the number of pages that you have (which is often much smaller). This is a common situation that growth teams find themselves in when they are analyzing SEO experiments. Interrupted time series analyses are generally a good option to turn to in these situations because they do not require as large a sample size.

When not to use interrupted time series

When should you avoid using interrupted time series? Here are some examples of situations where you should avoid using interrupted time series analyses.

  • When an A/B test would suffice. It is generally best to use the simplest analysis method that is appropriate for your dataset. If you are in a situation where you can neatly randomize individual subjects into different treatment groups and a standard A/B test would suit your needs, then you are best off using a simple A/B test.
