Getting Data From A Website Without A Developer API Key

It started with a simple quest: I wanted to look up a word and get its meaning, along with example sentences that use it, without having to search for it each time. I wanted to just type in a word and get its basic meaning and a few example sentences from my own app or script. Yes, my app could make API calls to get the info, but what if the target site doesn’t offer an API, or the API is costly? In this post, I explain how to accomplish this for free.

The method I’ll use is web scraping. Essentially, it means retrieving a web page over the internet programmatically, sifting through its content, extracting only the parts I’m interested in, parsing them, and presenting them the way I desire. If a page on a website is publicly visible (without requiring a sign-in), it can generally be accessed and scraped. I wanted a quick, almost-instant way to get a word’s meaning and its usage in sample sentences just by running a script.

A few caveats with this approach should be noted. To scrape a page, you have to know the specific structure of the page (document) so that your code can navigate it and retrieve the desired information. If the structure of the page changes, you’ll have to re-inspect it and adjust the code accordingly. You’ll also have to parse the information yourself to present the result. And not all websites allow scraping.

With an API approach, by contrast, the site explicitly publishes its API (sometimes free up to a point, sometimes at a fixed ongoing cost, sometimes priced by usage). You register and get a developer key that you use in your code for programmatic access to the site’s endpoints and, therefore, its data. That access is stable, predictable, documented, and generally well supported. However, it generally costs you something. In this post, we’ll look at a solution that requires neither a developer key nor an API. My solution, in this case, is to write a Python script that leverages the requests library and the bs4 package (which is the BeautifulSoup library).
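Since not every site allows scraping, one common courtesy check is to look at the site’s robots.txt before fetching anything. The sketch below uses Python’s standard urllib.robotparser module on an inline, made-up robots.txt, so it runs without touching the network. The helper name is_allowed, the sample rules, and the example.com URLs are all assumptions for illustration; a robots.txt file is also only one signal, and a site’s terms of use may say more.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/abscond"))       # True
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/page"))  # False
```

In a real script, you would download the target site’s own robots.txt first and pass its text to the helper.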

Steps

I found a dictionary web site that has a large collection of words along with their definitions and example usages. It’s a free, public web site at yourdictionary dot com. I’d like to get the information from its specific pages based on the word I’m interested in.

The first step then is to explore the site navigation. Just using it, I see that the page I land on via search for a word on their site is of this format: https://www.yourdictionary.com/<word> where <word> is the word string I searched for (in the snapshot below, “Abscond” was used).

Different sites employ different approaches. Some sites put the search term in a query parameter instead, such as http://www.yourdictionary.com?q=<word>, so form the URL in your code accordingly.
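The two URL styles mentioned above can be sketched with the standard library alone. The function names and the example.com base are placeholders I made up for illustration; the parameter name "q" is simply the one used in the example above. Percent-encoding the word keeps the URL valid if the input contains spaces or other special characters.

```python
from urllib.parse import quote, urlencode


def path_style_url(base: str, word: str) -> str:
    """Build a URL of the form <base>/<word>."""
    return f"{base.rstrip('/')}/{quote(word.strip())}"


def query_style_url(base: str, word: str, param: str = "q") -> str:
    """Build a URL of the form <base>?<param>=<word>."""
    return f"{base.rstrip('/')}?{urlencode({param: word.strip()})}"


# example.com is a placeholder; substitute the site you are working with.
print(path_style_url("https://example.com/", "abscond"))
print(query_style_url("https://example.com/", "abscond"))
```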

With that information, I need to inspect each page’s general structure. Upon inspection, I see that the definitions are in the Meanings tab of the page, and that the definition line, which appears after several <div> elements, is wrapped in a nested <div> with a specific class. I make a note of that because that’s where my code needs to search to retrieve the definition or meaning line of text (you’ll see the code below).

So far, I’ve located the definition, but the example sentences live at a different address, formatted as http://sentence.yourdictionary.com/<word>. They’re in a different tab, “Sentences”, and that document is structured differently. This time, I find nested <div> elements with the class “sentence-item”, one per example, so a search for that class returns a list of examples. I make a note of this too.
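To show how those class names translate into BeautifulSoup lookups, here’s a small sketch that runs against made-up HTML instead of the live site. Only the two class names (“text-black” and “sentence-item”) come from my inspection; the sample markup, its text, and the function names are invented for illustration, and the real pages are nested more deeply.

```python
from bs4 import BeautifulSoup

# Invented sample markup that mimics the two class names noted above.
SAMPLE_DEFINITION_HTML = """
<div class="outer"><div><div class="text-black">Sample definition text.</div></div></div>
"""
SAMPLE_SENTENCES_HTML = """
<div class="sentence-item">First sample sentence.</div>
<div class="sentence-item">Second sample sentence.</div>
"""


def extract_definition(html):
    """Return the text of the first <div class="text-black">, or None."""
    soup = BeautifulSoup(html, "html.parser")
    node = soup.find("div", class_="text-black")
    return node.get_text(" ", strip=True) if node else None


def extract_sentences(html):
    """Return the text of every <div class="sentence-item"> as a list."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(" ", strip=True)
            for div in soup.find_all("div", class_="sentence-item")]


print(extract_definition(SAMPLE_DEFINITION_HTML))
print(extract_sentences(SAMPLE_SENTENCES_HTML))
```

Because find() returns None when nothing matches and find_all() returns an empty list, both “not found” cases fall out naturally.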

Now that I have the suspected class names and their tags, it’s time to code and give it a try!

So, next I’ll take a query word as input in my script, then form the URLs by inserting the word into the URL formats I described above. For the definition, I’ll search for the content of its element (the nested <div> with class=’text-black’), and for the example sentences, I’ll search for the ‘sentence-item’ class at its respective URL. If I get some data back, great! I’ll present it in the shell output using print(). Because I want at most four example sentences, I slice the list of findings in the for loop.
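The presentation step, including the four-sentence limit, could look roughly like this. It’s a sketch: the function name format_result and the constant MAX_SENTENCES are my own labels, and the function builds a string instead of printing directly so it’s easy to check.

```python
MAX_SENTENCES = 4  # show at most this many example sentences


def format_result(word, definition, sentences):
    """Build the text shown to the user for one looked-up word."""
    lines = []
    if definition:
        lines.append(f"Meaning of the word ▶{word}◀:")
        lines.append(f"‣ {definition}")
    else:
        lines.append("No definition found.")

    if sentences:
        lines.append("")
        lines.append(f"Example sentences using the word {word}:")
        lines.append("")
        # Slicing never fails, even if fewer than four sentences came back.
        for sentence in sentences[:MAX_SENTENCES]:
            lines.append(f"  • {sentence}")
    else:
        lines.append("No example sentence found.")
    return "\n".join(lines)


print(format_result("Doff", "To take off (clothes, etc.)", []))
```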

Example outputs:

Enter a word: Akimbo

Meaning of the word ▶Akimbo◀:
‣ In or into a position in which the hands are on the hips and the elbows are bowed outward.

Example sentences using the word Akimbo:

  • The yard porter, his arms akimbo, stood smiling with satisfaction before the large mirror.
  • Anatole, having taken off his overcoat, sat with arms akimbo before a table on a corner of which he smilingly and absent-mindedly fixed his large and handsome eyes.
  • Natasha threw off the shawl from her shoulders, ran forward to face “Uncle,” and setting her arms akimbo also made a motion with her shoulders and struck an attitude.
  • The handsome young soldier who had brought the wood, setting his arms akimbo, began stamping his cold feet rapidly and deftly on the spot where he stood.

Enter a word: Hokum

Meaning of the word ▶Hokum◀:
‣ Trite or mawkish sentiment, crude humor, etc. used to get a quick emotional response from an audience.

Example sentences using the word Hokum:

  • It was pure theater by a master and you could see it as marvelously sincere and spontaneous or absolute hokum.
  • Of course it is all complete hokum in the best Hollywood sense.
  • I have to admit that it took me some time to cast off this initial belief in such hokum.
  • In early episodes, Deanna Troi’s utility was mainly in studying the alien vessel in view on the main screen with a far off look and intoning “I sense a great fear… and sadness…” or similar hokum.

Enter a word: Doff

Meaning of the word ▶Doff◀:
‣ To take off (clothes, etc.)
No example sentence found.

Enter a word: Mandri
No definition found.
No example sentence found.

As you can see, wherever the definition and example sentences are found, the program outputs their respective text, with up to four example sentences. If the word was found but no example sentences were, the script tells us so, as in the case of the word ‘Doff’ (at the time of this writing). Furthermore, if a word was not found (as with ‘Mandri’ in the example above), the script tells us that neither a definition nor an example sentence was found.

This is a very handy script for me personally, as I don’t have to open a browser, navigate to the site, enter the word, run a search, and then click on the Meanings and Sentences tabs separately… all without ads too!

The Code

By now, you should be clamoring for the code. I’ll share it with the URL taken out, so you can fill in the actual URL in your own copy (besides, we don’t want to overload this poor site and get your IP blocked).
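Here’s a minimal sketch of what such a script could look like, pieced together from the steps described above; treat it as an illustration, not the exact original listing. The two URL constants are placeholders on example.com that you need to replace, and the helper names (fetch_html, texts_by_class, main) are made up for this sketch. The class names are the ones noted during inspection and may stop matching if the site changes its markup.

```python
from urllib.parse import quote

import requests
from bs4 import BeautifulSoup

# Placeholders: the real URLs are intentionally left out. Fill in your own.
DEFINITION_URL = "https://example.com/{word}"
SENTENCES_URL = "https://sentence.example.com/{word}"
MAX_SENTENCES = 4


def fetch_html(url):
    """Download a page and return its HTML, or None on any request failure."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # treat 4xx/5xx responses as failures
    except requests.RequestException:
        return None
    return response.text


def texts_by_class(html, class_name):
    """Return the text of every <div> that carries class_name."""
    if not html:
        return []
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(" ", strip=True)
            for div in soup.find_all("div", class_=class_name)]


def main():
    word = input("Enter a word: ").strip()
    slug = quote(word)  # keep the URL valid if the input has spaces
    definitions = texts_by_class(
        fetch_html(DEFINITION_URL.format(word=slug)), "text-black")
    sentences = texts_by_class(
        fetch_html(SENTENCES_URL.format(word=slug)), "sentence-item")

    if definitions:
        print(f"Meaning of the word ▶{word}◀:")
        print(f"‣ {definitions[0]}")
    else:
        print("No definition found.")

    if sentences:
        print(f"\nExample sentences using the word {word}:\n")
        for sentence in sentences[:MAX_SENTENCES]:
            print(f"  • {sentence}")
    else:
        print("No example sentence found.")
```

Save it to a file, add a call to main() at the bottom, and run it from the shell. Keep the request rate low so the site isn’t overloaded.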

I hope you found this post helpful and interesting. Explore this site for more tips and articles. Be sure to also check out my Patreon site where you can find free downloads and optional fee-based code and documentation. Thanks for visiting!

