We all have favorites…movies, shopping list, tv shows, clothes, etc. etc. etc. Ok, welcome! Today, I’ll show you how to create a favorites movie list in XML…why XML? Because from this format, I’ll do some amazing things like analytics. First on this post though, I’ll show you how to store the list in a structured form so the dumb computer can understand and parse, and then how you can modify it to many other forms for presentation and analytics…this post will simply focus on parsing the XML list and creating a human-friendly list that you can print out or post on your social media, web site, email, etc. In another following post, I’ll show you how it’s taken to another level of business intelligence analytics using PowerBI. More on that later. Let’s get back to the basics first.
Prep The List
I started with creating an extensive list of my favorite movies (which you won’t find in typical hollywood, hulu, netflix, etc.) that I’ve compiled for decades. Most indie. Anyway, that list could’ve been a simple text file but it’s not that efficient, because what I wanted to do is do analytics on that. So I chose to turn the list into an XML format (this is NOT the only choice, there are many other ways you should consider!). The format is shown below (only an excerpt)…
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="movies.xsl"?> <movies> <movie> <title>The Curious Case of Benjamin Button</title> <genre>drama </genre> <lang>english</lang> <rating>5 </rating> </movie> <movie> <title>Frozen River</title> <genre>crime drama </genre> <lang>english</lang> <rating>5 </rating> </movie> <movie> <title>Valkyrie</title> <genre>war drama </genre> <lang>english</lang> <rating>5 </rating> </movie> <movie> <title>Apollo 13</title> <genre>drama documentary</genre> <lang>english</lang> <rating>5 </rating> </movie> <movie> <title>Melvin Goes to Dinner </title> <genre>comedy</genre> <lang>english</lang> <rating>5 </rating> </movie> <movie> <title>Baghead</title> <genre>indie comedy</genre> <lang>english</lang> <rating>4</rating> </movie> <movie> <title>Babel</title> <genre>political drama</genre> <lang>english</lang> <rating>4</rating> </movie> </movies>
As you see, I wanted to get some basic info associated with a movie, such as genre, language, and my own rating (say, 1 to 4 stars). This information is stored as simple text but in a structured way…in XML in this case. After that, sky is the limit because if I can parse this (which I can programmatically because it’s structured!) I can transform the display in infinite ways! 4 stars can be replaced with start images, or tomatos 🙂
You can download my full movies list from here by the way in XML format.
Okay, save this short list as a simple text file. I called it movies.xml (you can call it whatever, and extension doesn’t matter right now). So, hereon we can parse it via any language and present it in any way we want. In this post, we will parse it using Python, and then create a printable list that you can laminate/frame.
Explanation Of Code
I commented in the code preceded by “#” so you can understand what the following statement is doing. DOM is dynamic object model I think 🙂 If you understand (if you don’t please look it up and learn) object oriented programming, it’s natural. So, DOM has its children and siblings. Much like a binary tree…look it up if you don’t what it is. It’s a vital concept in computer science. The code with comments below should be self-explanatory if not, that means you can start learning, which is great! Anyway, we create a DOM tree and traverse it, print it as needed. We create a local file and write to it. And good practice involves try/catch block as in C/C++ but in Python, it’s try/except (again, remember what I said, syntax are for more rookies! Computer scientists understand beyond stupid syntax nuances made by men and always changing).
I will next use the same xml source for my favorite movies and present in amazing ways visually from this 🙂 Here’s the next step blog!
Please comment and optimize my code below.
Full Source Code (Python)
Assuming the list is saved as a text file name movies.xml