The scenario: You have a dataset with values in one column that are unique (or semi-unique) and another column holding the associated values as Key:Value pairs, but that second column has repeating values. Imagine a list of members whose names are in one column, with their corresponding membership status next to it, as below: What we […]
Tag: education
How many real days…net? (Excel)
This is one of the simple yet powerful tips for working with dates in Excel. To calculate the number of workdays between two dates while excluding weekends (depending on your locale settings), we can use NETWORKDAYS(). In the US and most Western countries, Saturdays and Sundays are excluded from the calculation. Without changing your […]
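For readers who want the same idea outside Excel, here is a minimal Python sketch of a weekend-excluding day count. It is not the method from the post: the function name `networkdays` and the `weekend` parameter are made up for illustration, it counts both the start and end dates, and it assumes a Saturday/Sunday weekend with no holiday list.

```python
from datetime import date, timedelta


def networkdays(start, end, weekend=(5, 6)):
    """Count the days from start to end (both included) that are not weekend days.

    `weekend` holds weekday numbers as returned by date.weekday(),
    where Monday is 0; the default (5, 6) is an assumed Saturday/Sunday weekend.
    """
    if start > end:
        # Assumption: a reversed range gives the negated count.
        return -networkdays(end, start, weekend)
    total_days = (end - start).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (start + timedelta(days=offset)).weekday() not in weekend
    )


# 1 January 2024 is a Monday, so the first full week has 5 workdays.
print(networkdays(date(2024, 1, 1), date(2024, 1, 7)))
```

Passing a different `weekend` tuple, for example `(4, 5)` for Friday/Saturday, adapts the sketch to other locales.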
Phone number formatting tips (Excel and Python)
For most people who have worked with data for a significant amount of time, formatting raw data from different sources is both a reality and a pain. Today, I show different ways to format an unformatted string that contains phone numbers using Python and Excel. Our task is to format it (either for human reporting, or consumption by […]
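As a rough illustration of the Python side of this task, the sketch below strips everything except digits and rebuilds the number in one display format. The excerpt does not say which target format the post uses, so the 10-digit US-style layout `(XXX) XXX-XXXX`, the optional leading country code `1`, and the function name `format_us_phone` are all assumptions.

```python
import re


def format_us_phone(raw):
    """Return raw as '(XXX) XXX-XXXX', assuming a 10-digit US-style number."""
    # Drop every character that is not a digit (spaces, dots, dashes, '+', etc.).
    digits = re.sub(r"\D", "", raw)
    # Assumption: an 11-digit value starting with 1 carries a US country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {raw!r}")
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


print(format_us_phone("555.123.4567"))
```

Raising an error on anything that is not 10 digits keeps malformed input visible instead of silently producing a wrong number.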
Searching online (Google) with Python
In this blog post, I demonstrate how to run a Google query from a Python app extremely easily. The harder part is just the setup and knowing what to install. So, let’s start with the setup details: You’ll need to install the following packages IN ORDER: 1) beautifulsoup4 2) google NOTE: The package to […]
Sample Size in Python
This is part of a 3-part series on the topic. Please read the posts in order for maximum clarity and context: 1. Sample Size and Margin of Errors. Everything you need to know and ++ 2. Sample Size (Contd.) 3. Sample Size in Python (this one) OK, after reading the first 2 posts, you […]
Sample Size (Contd.)
This is part of a 3-part series on the topic. Please read the posts in order for maximum clarity and context: 1. Sample Size and Margin of Errors. Everything you need to know and ++ 2. Sample Size (Contd.) 3. Sample Size in Python In this post, we’ll use actual numbers to determine […]
Sample Size and Margin of Errors. Everything you need to know and ++
I’m not a statistician by profession or training. However, even with just the basics under my belt, I find statistics fascinating and see a plethora of practical uses for it. Without it, we’re really ignorant. With it, we’re equipped but not always best educated either. I’ve heard the phrase again and again, “Correlation does not equal causation!” and […]
Need a baby name? Or just love data?
In either case, read on. I collected a dataset of popular baby names from the City of New York’s OpenData government site, covering 2011 through 2016: exactly 19,418 records. Original dataset view: What I want to do is find out: a) Most popular names b) Less popular names (or rarely used names) c) Slice it […]
“Rabid Racoon” tracking (Python, using our own class)- Part 2
This is Part 2 of the original series on Rabid Racoon. Please read Part 1 first to follow along. This time we’ll use a real-time graphing method based on Python’s Turtle library instead of the scatter plot we used in Part 1. The beautiful part is, since we already created our custom Class in Part […]
“Rabid Racoon” tracking (Python, using our own class)- Part 1
Just like most high-level programming languages, Python supports the idea of classes. Once created, a class and its methods can live inside the main Python file or, as in most usage scenarios, in a separate file. For Python-specific syntax, please refer to its documentation online or offline (python.org is a good start), but you should […]