Analytics Education STEM

A Horse-race Simulation with a “Strip Plot”

Strip plots, or a dot plot horizontally spread out are very useful in certain cases. I’d use it to show spread or concentration of data points across multiple categories. For example, in order to visualize between brand A, and B, what are their product sales (or prices) and where they appear to focus. What the outliers are, etc.

In this blog, I use the same method except for something a bit more entertaining. We’ll have a 2-horse race (to add more horses, repeat the method as described with their own data points).

How it Works

This is implemented in Excel Workbook. For each race, we start the race by pressing F9. Things to note:

  • They have the same distance to cover, so the total number of data points should be same.
  • Starting points are same to be fair.
  • The additional points (distance) is added to previous…running tally.
  • The final number is random (that’s the differentiator!): =RANDBETWEEN(MAX(A15:A23), MAX(A15:A23)+$A$25) assuming col A has horse A’s data and =RANDBETWEEN(MAX(C15:C23), MAX(C15:C23)+$C$25) assuming col C has horse B’s data.

I also added a Stride parameter for each horse. In this example they have the same stride and obviously longer stride will win by formula but it’s there for my additional gamification (not shown here).

The final display of the winner is with IFS function (i.e. =IFS(A25>C25,A15,A25<C25,C15,A25=C25,”Tie!”)) and applied a conditional formatting formula to the cell to match the winner’s color as in the chart.

Here’s a clip of actual outputs for a few races:

Setup

We’ll have a Horse A, and a Horse B (you can name yours more creatively!).

Arrange the data for each horse’s column so the data points increment by stride for the horse (e.g. 100).
Then at the last data point, find a number between min and max of the column + stride. Obviously longer stride will win by formula but it’s there for fun/experiment.

But with same stride, the final number is random: =RANDBETWEEN(MAX(A15:A23), MAX(A15:A23)+$A$25) and =RANDBETWEEN(MAX(C15:C23), MAX(C15:C23)+$C$25)

A normal XY scatter or bubble chart are not useful here. We want them on same Y axis but different X axes, so we create 2 new columns (1 for each Horse) which will work as their own data series. As long they’re unique, they’ll appear in different “lanes” i.e. along y-axis.

To add the horse images, format shape for each horse’s last data point: Fill & Line>Markers: increase the marker size where your selected image looks right. I just used Excel’s online picture search for: Horse

Press F9 for each game (refresh).


This post is not meant to be a step-by-step, detailed tutorial, instead to serve as an overview of the key methods/steps one needs to know to find similar solutions and/or evoke curiosity. Basic->Intermediate knowledge is assumed.
If you like more info or source file to learn/use, or need more basic assistance, you may contact me at trseattle at outlook dot com. To support this voluntary effort, you can donate via the Home page.

Leave a Reply

Your email address will not be published. Required fields are marked *

Back To Top