Thursday, September 11, 2025
STEM

Data cleansing challenge: non-ASCII characters

Non-ASCII characters can pose challenges in data cleansing for several reasons. It is therefore good practice to standardize or normalize text data to ASCII when possible, or otherwise to ensure that non-ASCII characters are handled correctly. This helps maintain data integrity and simplifies subsequent data processing tasks. Superscripts, subscripts, or “special” characters often look like ASCII characters […]
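
As a rough illustration of normalizing text to ASCII, here is a minimal Python sketch. It is not taken from the post: the helper name to_ascii is hypothetical, and NFKD normalization followed by dropping whatever has no ASCII equivalent is just one possible approach.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Fold look-alike characters to ASCII (hypothetical helper, lossy)."""
    # NFKD splits "é" into "e" plus a combining accent and maps
    # compatibility forms such as "²" or "₂" to the plain digit "2".
    decomposed = unicodedata.normalize("NFKD", text)
    # Anything that still has no ASCII form (accents, emoji) is dropped.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("H₂O"))   # H2O
print(to_ascii("café"))  # cafe
print(to_ascii("10²"))   # 102
```

The last line shows why this needs care: "10²" becomes "102", so the text is ASCII-clean but its meaning has changed. Whether to fold, flag, or keep such characters depends on the data.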

STEM

Comparing and merging lists in Excel, Python

Identifying anomalies, finding duplicates, and updating data all require comparing information from various sources. Executing these tasks accurately is crucial, whether you work solely with spreadsheets or with a mix of tools and languages, such as databases and web services. In this post, I will demonstrate various methods for comparing lists of identical or differing sizes across different […]
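
One way to compare two lists of different sizes in Python is sketched below. This is an assumption about what such a comparison might look like, not the post's own method; compare_lists and its result keys are hypothetical names.

```python
from collections import Counter


def compare_lists(left, right):
    """Summarize how two lists differ (hypothetical helper)."""
    left_set, right_set = set(left), set(right)
    return {
        "only_left": sorted(left_set - right_set),   # missing from right
        "only_right": sorted(right_set - left_set),  # missing from left
        "common": sorted(left_set & right_set),      # present in both
        # Items that appear more than once in the left list.
        "duplicates_left": sorted(k for k, n in Counter(left).items() if n > 1),
    }


result = compare_lists(["a", "b", "b", "c"], ["b", "c", "d", "e", "f"])
print(result)
```

Sets ignore order and repetition, which is why duplicates are counted separately with Counter.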

STEM

Find Superscripts, Subscripts, and Unicode in a text file (Python)

Occasionally, it becomes necessary to search for special characters like superscripts, subscripts, symbols, emojis, or any Unicode characters within a text document. This is crucial when working with data files that should not contain any such characters, unless they are explicitly required and managed. Most editors, including Word, lack a ‘Find’ feature that reveals all […]
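
A minimal sketch of such a search in Python follows. It assumes that "special" means any code point above 127; the function name find_non_ascii and the sample lines are made up for illustration and do not come from the post.

```python
import unicodedata


def find_non_ascii(lines):
    """Yield (line_no, column, char, name) for each non-ASCII character."""
    for line_no, line in enumerate(lines, start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:  # assumption: everything outside ASCII is flagged
                yield line_no, col, ch, unicodedata.name(ch, "UNKNOWN")


# Made-up sample data standing in for the lines of a text file.
sample = ["x² + y₁\n", "plain ascii\n", "ok ✓\n"]
hits = list(find_non_ascii(sample))
for hit in hits:
    print(hit)
```

Because the function accepts any iterable of lines, an open file object (opened with encoding="utf-8") can be passed in place of the sample list.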

Coding STEM

Windows 11 Boot MicroAnimation and Wise Monkey using just text

Recently, an ex-colleague of mine brought to our attention the Microsoft font possibly used to create the spinning dots on the Windows 11 boot screen. The font is called “Segoe Boot Semilight” and is a TrueType font. So, I had to verify that this is true, and I wanted to try to create the animation […]

STEM

Different ways to generate a list and array containing numbers

In this post, I show different ways to generate a list and an array containing numbers (ints and floats, positive and negative, mixed) in Python, using random, list, and NumPy arrays. Why do we need both? NumPy arrays are very efficient for numerical operations and require elements to be of the same data type, whereas lists are […]

STEM

Finding min, max in a sequence and plotting the distances

Let’s say that we have a sequence of numbers of no predefined length, consisting of whole numbers, fractions, and positive and negative numbers (e.g. 2.25, 5.25, 6.75, 8.25, 9.75…). What we want to do is to sort the numbers in ascending order, and find the differences between each pair of adjacent numbers and […]
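
The sorting and adjacent-difference steps could be sketched in Python as follows. The helper name adjacent_gaps is hypothetical, the input reuses the example values above in shuffled order, and the plotting part is left out.

```python
def adjacent_gaps(values):
    """Sort the values and return them with the gaps between neighbours."""
    ordered = sorted(values)
    # Pair each element with its successor and subtract.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return ordered, gaps


ordered, gaps = adjacent_gaps([8.25, 2.25, 9.75, 5.25, 6.75])
print(ordered)               # [2.25, 5.25, 6.75, 8.25, 9.75]
print(gaps)                  # [3.0, 1.5, 1.5, 1.5]
print(min(gaps), max(gaps))  # 1.5 3.0
```

A sequence with fewer than two numbers yields an empty list of gaps, so min and max would need a guard in that case.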
