datascience – Page 5 – Musings by FlyingSalmon

STEM

How to pick rows or columns, and combine them with other functions for powerful uses

July 22, 2024 (updated July 23, 2024)

In this post, I’ll share an Excel function that lets us easily pick a specific number of rows and/or columns from a range of data. Additionally, I’ll share examples of how it can be combined with other functions to do some powerful tasks. The function I am discussing here is TAKE(). In the next blog, […]
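
For readers who think in Python, here is a rough analogue of the idea behind TAKE(). It is only a sketch, not the post's Excel formula: the helper name `take` and the sample table are made up for illustration. It mirrors the convention that a positive count picks from the start and a negative count picks from the end.

```python
def take(table, rows=None, cols=None):
    """Return the first/last `rows` rows and first/last `cols` columns.

    Hypothetical helper: positive counts pick from the start,
    negative counts pick from the end, None keeps everything.
    """
    def pick(seq, n):
        if n is None:
            return seq
        return seq[:n] if n >= 0 else seq[n:]

    return [pick(row, cols) for row in pick(table, rows)]


# Made-up sample data: a 3 x 3 table stored as a list of rows.
table = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

print(take(table, 2))         # first two rows
print(take(table, -1, 2))     # last row, first two columns
print(take(table, None, -1))  # every row, last column only
```

One difference worth noting: this sketch returns an empty result for a count of 0, whereas Excel reports an error in that case.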

STEM

Euro 2024 Analyzed with Visuals

July 18, 2024

Introduction: With the summer of soccer (beautiful football) having come to an end, let’s dive into some team and player statistics and assess the tournament. I have collected various stats and facts, and organized them into a digestible format as a tribute to the fantastic players and the tournament. In this post, I’ll cover some […]

STEM

Division, Floor Division (Python)

July 5, 2024

In Python, the single forward slash ‘/’ performs floating-point division, unless you’re using Python 2.x, in which case it performs integer division when both operands are integers. The double forward slash ‘//’, on the other hand, is the floor division operator, which returns the largest integer less than or equal to the […]
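
A minimal Python 3 sketch of the difference (the variable names are only for illustration):

```python
# True division: the result is a float even when both operands are integers.
true_div = 7 / 2

# Floor division: rounds down to the nearest whole number.
floor_div = 7 // 2

# Flooring goes toward negative infinity, not toward zero.
neg_floor = -7 // 2

# With a float operand, the result is floored but stays a float.
float_floor = 7.0 // 2

# divmod() returns the floor quotient and the matching remainder together.
quotient, remainder = divmod(-7, 2)

print(true_div)             # 3.5
print(floor_div)            # 3
print(neg_floor)            # -4
print(float_floor)          # 3.0
print(quotient, remainder)  # -4 1
```

The negative case is the one that tends to surprise people: -7 // 2 is -4, not -3, because floor division rounds down rather than truncating.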

STEM

GENETIC QUIRKS OF THE WORLD 2024

July 4, 2024

Here are some interesting facts (oddities?) from around the world and their associated visuals in Excel. Data source: worldpopulationreview.com. Data are rounded to one decimal place where applicable.

STEM

Generate Bar Codes, QR Codes in Excel: Quick & Easy Way

July 1, 2024

In this post, I’ll show you a quick, easy, and free method that you can use today to generate QR codes, bar codes in UPC-A and UPC-E formats, and custom bar codes based on the product information you enter in Excel as text. How it works: Two things are at play here that make it […]

STEM

How to calculate streaks in Excel

June 30, 2024

Streaks refer to trends in the data. These can be linear, exponential, damped, seasonal, irregular/random, stationary, or cyclical. Streaks are important for several reasons. There isn’t any built-in function in Excel for calculating streaks, but there are different ways we can make Excel do some of that heavy lifting by using a combination of functions such as […]
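
Outside Excel, one common reading of a "streak" is a run of identical consecutive values, such as wins in a row. The Python sketch below assumes that reading. It is not the post's Excel approach, and the function names and sample results are made up for illustration.

```python
from itertools import groupby


def streak_lengths(results):
    """Collapse a sequence into (value, run_length) pairs, in order."""
    return [(key, sum(1 for _ in group)) for key, group in groupby(results)]


def longest_streak(results, target):
    """Length of the longest unbroken run of `target` (0 if it never occurs)."""
    return max(
        (length for key, length in streak_lengths(results) if key == target),
        default=0,
    )


# Made-up sample: W = win, L = loss.
results = list("WWLWWWLL")

print(streak_lengths(results))
print(longest_streak(results, "W"))
```

Here `groupby` does the work of grouping adjacent equal values, so each run is counted once.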

STEM

Data cleansing challenge: non-ASCII characters

June 25, 2024

Non-ASCII characters can pose challenges in data cleansing for several reasons. Therefore, it’s a good practice to standardize or normalize text data to ASCII when possible, or to ensure correct handling of non-ASCII characters. This helps to maintain data integrity and simplifies subsequent data processing tasks. Superscripts, subscripts, or “special” characters often look like ASCII characters […]
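
As one possible way to do this in Python (a sketch using the standard `unicodedata` module; the helper names `find_non_ascii` and `to_ascii` are my own), you can first locate the offending characters and then fold look-alikes back to plain ASCII:

```python
import unicodedata


def find_non_ascii(text):
    """Return (position, character) pairs for every non-ASCII character."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


def to_ascii(text):
    """Fold accented letters, superscripts, ligatures, etc. to ASCII.

    NFKD splits a character into a base character plus combining marks
    (or a compatibility equivalent); anything still non-ASCII afterwards
    is dropped.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(find_non_ascii("x²"))  # [(1, '²')]
print(to_ascii("café"))      # cafe
print(to_ascii("x²"))        # x2
```

A caveat with this approach: characters that have no ASCII decomposition (for example "ø") are silently removed, so it is worth checking the output of `find_non_ascii` before discarding anything.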

STEM

Comparing and merging lists in Excel, Python

June 24, 2024

Identifying anomalies and duplicates, and updating data, requires comparing information from various sources. Accurate execution of these tasks is crucial, whether working solely with spreadsheets or using a mix of tools and languages like databases and web services. In this post, I will demonstrate various methods for comparing lists of identical or differing sizes across different […]
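
On the Python side, a simple way to compare two lists of different sizes is with sets. This is a minimal sketch, not necessarily one of the methods the post goes on to use; the function names and sample lists are invented for illustration.

```python
def compare_lists(a, b):
    """Report which items are shared and which appear in only one list."""
    set_a, set_b = set(a), set(b)
    return {
        "common": sorted(set_a & set_b),
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
    }


def merge_unique(a, b):
    """Merge two lists, dropping duplicates but keeping first-seen order."""
    return list(dict.fromkeys([*a, *b]))


# Made-up sample lists of different lengths.
list_a = ["apple", "pear", "fig"]
list_b = ["fig", "kiwi"]

print(compare_lists(list_a, list_b))
print(merge_unique(list_a, list_b))
```

Sets ignore order and duplicates, which is usually what you want for a comparison; `dict.fromkeys` is used for the merge because it preserves the original order.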

STEM

How much does it cost to retire in each state?

June 23, 2024

Recently, I collected data on cost of living and average longevity in every state plus D.C. From the data, I derived the COLI, or Cost of Living Index, which is then normalized to 100 (where 100 represents the national average cost of living). Additionally, using data from the Bureau of Labor Statistics (BLS), I populated by […]

STEM

Data Normalization & Rescaling

June 15, 2024

Normalizing data is a common task in many applications, especially when working with large datasets, machine learning, or statistical analysis. There are two common statistical methods for normalization: Min-Max Scaling and Standardization (Z-score Normalization). But there are other ways too, which I will demonstrate in the examples below. 1. Min-Max Scaling (Normalizes Data to Between […]
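
The two methods named above can be sketched in plain Python as follows. This is an illustration under a few assumptions: min-max scaling maps to the conventional 0-to-1 range, the z-score uses the population standard deviation, and the sample values are made up.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return [(v - mu) / sigma for v in values]


# Made-up sample data.
scores = [10, 20, 30]

print(min_max_scale(scores))  # [0.0, 0.5, 1.0]
print(z_score(scores))
```

If you prefer the sample standard deviation, swap `pstdev` for `statistics.stdev`; the z-scores will come out slightly smaller in magnitude.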