A violin plot is a visual that traditionally combines a box plot and a kernel density plot. It is typically created in R or Python (using Matplotlib or similar modules) or in other applications. Here, however, I show how to build one using only Power BI. About Violin Plots: Before I go into how to plot it, let’s understand […]
Tag: statistics
Understanding buckets, bins, categorization (3/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s do the third and final one here… The Scenario We have about 2,000 rows of data on sales […]
Understanding buckets, bins, categorization (2/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s do the second one here… The Scenario We have data on employees’ join-dates (TABLE 1). Objective: We want […]
Understanding buckets, bins, categorization (1/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s start with the first… The Scenario We have 120K+ rows of data on wine bottle prices and […]
Forecasting with Seasonality & Linear (Time-Series)
In one of my earlier blogs, I used a linear forecasting method to predict a team’s future scores. While that was darn close to reality (when we checked against actual results), it did not mathematically account for fluctuations or seasonality, and the outcome was always linear. In this blog, I show another method (that […]
Quick! Pick a number between 1 and 10!
“Pick a number between 1 and 10.” There has been a plethora of variations on this game, showing up as magic tricks, simulations, statistics, and sheer intellectual and nerdy curiosity. But is there an answer? Better yet, is there a pattern? If so, what is it…and how can it be explained? Some experiments claim to […]
Simulating coin-toss in Excel
Most of us know that, theoretically, there’s a 50-50 chance of getting “heads” from an unbiased coin toss. There are numerous implementations and simulations of this age-old ‘riddle’ in virtually every programming language. In this blog, I share a simple but very effective simulation in Excel. Here’s the simulation: As you can see, I’m […]
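For readers who want to try the same idea outside Excel, here is a minimal Python sketch of a fair coin-toss simulation. It is not the spreadsheet from the post; the function name `simulate_tosses` and the `seed` parameter are hypothetical, chosen only for illustration.

```python
import random


def simulate_tosses(n_tosses, seed=None):
    """Return the fraction of heads in n_tosses simulated fair coin tosses.

    Hypothetical helper for illustration; not the Excel setup from the post.
    """
    rng = random.Random(seed)  # a seeded generator makes runs repeatable
    # Treat a uniform draw below 0.5 as "heads".
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


if __name__ == "__main__":
    # The share of heads should drift toward 0.5 as the number of tosses grows.
    for n in (10, 100, 10_000):
        print(n, simulate_tosses(n, seed=42))
```

With few tosses the share of heads can sit well away from 0.5; it tends to settle near 0.5 only as the number of tosses gets large.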
A “Dicey” Experiment
You roll a pair of dice. My task is to predict the total that will come up across both dice. How would I go about it? It’s obviously a probability problem. I want to find the probability of each total across all possible outcomes. How would I arrange and solve it in Excel? Answer: Quite easily […]
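The probabilities in question can also be enumerated directly. The sketch below is in Python rather than Excel and is not the layout used in the post; it assumes two fair six-sided dice, and the function name `two_dice_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides=6):
    """Return {total: probability} for the sum of two fair dice.

    Hypothetical helper for illustration; assumes both dice are fair
    and have the same number of sides.
    """
    faces = range(1, sides + 1)
    # Count how many of the sides * sides equally likely pairs give each total.
    counts = Counter(a + b for a, b in product(faces, repeat=2))
    n_outcomes = sides ** 2
    return {total: Fraction(n, n_outcomes) for total, n in sorted(counts.items())}


dist = two_dice_distribution()

if __name__ == "__main__":
    for total, p in dist.items():
        print(f"{total:>2}: {p} ({float(p):.3f})")
```

Of the 36 equally likely pairs, six sum to 7, so 7 is the most likely total at 1/6, while 2 and 12 each come up with probability 1/36.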
Compare and understand: Spread and Consistency
Imagine you have a product line with an average price of $20, and another product line where the average is $100. Which has more spread? Imagine you have the bowling scores of 3 players (which are all over the place, by the way) and you want to know which player is the most consistent. Or, […]
Sample Size in Python
This is part of a 3-part series on the topic. Please read the posts in order for maximum clarity and context: 1. Sample Size and Margin of Errors. Everything you need to know and ++ 2. Sample Size (Contd.) 3. Sample size in Python (This one) Ok, after reading the first 2 posts, you […]