3/20/2020 Update: This includes the latest data (as of 3/19/20) published by WHO. I’ll add more summaries and other statistics on my blog and will share that info if I get the time (no promises). To see my earlier blogs/stats on this COVID virus, click here. This is NOT looking good for the USA! The # […]
Tag: statistics
COVID-19 Update 3-19-20
3/19/2020: Sharing some analytics and a summary of the spread of the nasty virus. This includes the latest data (as of 3/18/20) published by WHO. I’ll add more summaries and other statistics on my blog and will share that info as I get to it. Here’s a quick summary as of 3/19/2020 using all data in my […]
Why Pareto? Using 80/20 rule in the real-world.
The Pareto principle states that about 80% of the effects come from 20% of the causes in many real-world events. You can use this to, say, identify 80% of your best-selling categories, or conversely the worst performing categories/items. The uses are limitless, really. And they don’t have to be only about $$$. Graphically, it’s a […]
Virus Wiping Away Toilet Papers in 2020!
Amazed (at the stupidity) and fascinated (by the idiocy) by binge and panic shoppers, I was compelled to ponder the toilet paper shortage specifically, in light of the recent pandemic (“corona virus”). This blog will reflect on the reasoning provided by shoppers and psychologists, a look at the economic statistics and realities of the […]
Creating Beautiful Violin Plots
A violin plot is a visual that traditionally combines a box plot and a kernel density plot. It is typically created in languages such as R or Python, using Matplotlib and other applications/modules. However, I show here how to get it done using only Power BI. About Violin Plots Before I go into how to plot it, let’s understand […]
Understanding buckets, bins, categorization (3/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s do the third and final one here… The Scenario We have about 2,000 rows of data on sales […]
Understanding buckets, bins, categorization (2/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s do the second one here… The Scenario We have data on employees’ join-dates (TABLE 1). Objective: We want […]
Understanding buckets, bins, categorization (1/3)
In this multi-part blog series, I’ll touch on different ways to categorize data into buckets, or bins, and summarize it in meaningful ways. Some will use Pivot Tables, some will not. But we’ll cover 3 common scenarios. Let’s start with the first… The Scenario We have 120K+ rows of data on wine bottle prices and […]
Forecasting with Seasonality & Linear (Time-Series)
In one of my earlier blogs, I used a linear forecasting method to predict a team’s scores in the future. While that was darn close to reality (when we checked against actual results), it did not mathematically account for fluctuations or seasonality, and the outcome was always linear. In this blog, I show another method (that […]
Quick! Pick a number between 1 and 10!
“Pick a number between 1 and 10.” There has been a plethora of variations of this game, in the form of magic tricks, simulations, statistics, and sheer intellectual and nerdy curiosity. But is there an answer? Better yet, is there a pattern? If so, what is it…and how can it be explained? Some experiments claim to […]