Tuesday, October 15, 2024
STEM

How to calculate streaks in Excel

Streaks refer to trends in the data. These can be linear, exponential, damped, seasonal, irregular/random, stationary, or cyclical.

Streaks are important for several reasons:

  1. Forecasting and Planning: Identifying trends and patterns in data can help businesses forecast future behavior and plan accordingly.
  2. Testing Theories and Strategies: Trends can be used to test business theories and strategies.
  3. Decision-Making: Data trends can inform decision-making processes.
  4. Problem Solving: Recognizing patterns in data can aid in problem-solving.
  5. Understanding Customers: Data trends can provide insights into customer behavior.

There isn’t any built-in function in Excel for calculating streaks, but there different ways we can make Excel do some of that heavy-lifting by using a combination of functions such as COUNTA and MATCH and INDEX together; or FREQUENCY, ROW, and MAX together; or MAX, SCAN, and LAMBA functions together.

In addition to these, I’ll show you another method that requires setting up a helper column but uses relatively easier steps that’s also easy to comprehend.

First, let me share with you the other formula methods without using any helper columns:

  1. Using COUNTA and MATCH functions:
    This method is used to calculate the current streak. The formula is:
    =COUNTA(Table1[W/L])-MATCH(1,INDEX(1/(Table1[W/L]="Loss"),0)) This formula calculates the number of games since the last loss, which is the current winning streak.
  2. Using FREQUENCY function:
    This method is used to calculate the longest streak. The formula is: =MAX(FREQUENCY(IF(Table1[W/L]="Win",ROW(Table1[W/L])), IF(Table1[W/L]<>"Win",ROW(Table1[W/L])))) This formula calculates the longest winning streak in the data. Note that this is an array formula, so you need to press Ctrl + Shift + Enter after typing it.
  3. Using SCAN function:
    This method is also used to calculate the longest streak. The formula is:
    =MAX(SCAN(0,C5:C22, LAMBDA(a, v, IF(v = "w", a + 1,0)))) This formula calculates the longest streak of consecutive wins. At present, SCAN function is available in most recent version of Excel.

Replace ‘Table1[W/L]’ and ‘C5:C22’ with your actual data range. Also, replace ‘”Win”‘ and ‘”Loss”‘ with the values you’re using to represent wins and losses in your data (e.g. 0/1, W/L, etc.).

Now, let me give share some real data and show you how I can calculate the streaks in different scenarios using my favored method.

Let’s say you have a win-loss data of a team over several matches. The data looks like this:

Of course, dealing with just one team and a small dataset is easy just by eye-balling and manually counting the numbers of consecutive wins and losses and spitting out the result. But that’s not good enough in my view. Not only is it not scalable, it is manual meaning mistakes can happen even with the simplest data. We want to a setup where we utilize Excel to ensure that no matter how large or different the datasets are, it’ll always gives accurate results. For that to happen, we need to a method that we set up once and then it just works at any scale, under different situations. Below are the steps.

Steps:

  1. Insert a new blank row above the first row of data.
  2. Add a helper column (e.g. “HelperCol”) and enter digit 0 for its first row’s value.
  3. In the 2nd row of the helper column, enter the formula: =IF(C114=”W”,D113+1,0) where D113 is 0 (at start) and C114 is the first row of data.
  4. Drag/copy the formula down to cover all the rows of dataset.

SOLUTION:
The way it works is that the formula keeps adding the wins with the previous cell’s tally only if the result is a win for that row. Otherwise, it sets it to 0. That way, we have counts of consecutive wins in the helper col. Next, all we have to do is get the max value out of that to find the longest streak by MAX(HelperColumn range). To find the losing streak, just change the “W” to “L” in IF().

The data layout looks like this along with the answer for longest winning streak:

Similarly, the data layout looks like this along with the answer for longest losing streak:

You can present the findings any way you wish. For example, I can create a quick McKinsey-style chart as this (I explain creating this type of chart in another post in detail: McKinsey-Style Bar Chart In Excel):

This wasn’t hard at all. But next, let’s look at a much larger dataset and on a different topic. I have 30-year’s worth of daily weather data in Seattle out of which I’m interested in finding a streak of rainy days vs. non-rainy days in a particular year. We all know “it always rains in Seattle”, right? I’ll use 2016 actual daily data for that entire year. My dataset looks like this:

The RAIN column is 1 when it was a rainy day, otherwise it’s 0. TMAX and TMIN refer to daily high and daily low temperatures respectively in Fahrenheit.

As before, we insert a blank row above the dataset, then insert a helper column, and follow the steps as described above. The layout looks like as below (with formulas shown in dark red per row for your convenience):

We can find the longest streak of rainy days easily by just MAX(E36:E402), and we get 18. Similarly, we get non-rainy days by MAX(J36:J402) which yields 23.

I also wanted to know the streak of “hot” days, meaning, how many consecutive days (not total) where there in that year when high temperature was above 79 degrees Fahrenheit. That’s what H column is for in the above layout. With all the sorted out, we can create a dynamic report or text that grabs all the information from calcuated cells above. For example:

The above text were automatically generated by using CONCAT and date functions. For example: =CONCAT(UNICHAR(127783)," The longest streak for rainy days were for ", B407, " days, from ", MONTH(D412), "/", DAY(D412), "/", YEAR(D412), " to ", MONTH(D410), "/", DAY(D410), "/", YEAR(D410)) which uses the unicode symbol for cloud-and-rain and parses the date into month,day, year parts so we can format the date is any way we want for the presentation.

So, how did I get the starting and ending dates of a streak? That’s easy with a little thinking…by the arrangement of our formulas, we know the highest streak number will be the last row, for which if we extract that’s row’s DATE column’s value, we get the ending date. You can do this using any lookup formula (I used XLOOKUP on B407 which contained the longest streak of 18 rainy days to get the ending date of longest streak for rainy days). So, all we have to do is get the starting date, which is of course the ending date of streak minus the streak days value we calculated such as 18 in B407 plus 1. And we have all the information we need! Now you can present the information in various ways as per your need.

I hope this was interesting and educational. Happy charting!


Leave a Reply

Your email address will not be published. Required fields are marked *

Back To Top
+