January 7, 2021

Web Crawling With Simple Scraper

Ravinder Deol
Tutorial Time: 7 Minutes

Project Resources

Simple Scraper

Books To Scrape

In this no-code project, you'll create a web crawler that collects data from several URLs. To begin, install Simple Scraper, which requires Google Chrome.

For this project, you'll be scraping data from Books To Scrape. For each book, you'll extract the title and the price.

1. Initial Scraper Setup

Navigate to Books To Scrape.

Open Simple Scraper, and click the plus (+) sign.

First, you'll want to scrape the titles, so select a title on the page. Everything that gets highlighted is what will be extracted. Name this data 'Title', then click the tick to save the selection for when you run the scraper.

Second, you'll want to scrape the price of each book. Again, click the plus (+) sign, then select the price of a book. Everything that gets highlighted is what will be scraped. Name this data 'Price', then click the tick to save the selection for when you run the scraper.
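If you're curious what this selection step might look like in code, here's a rough sketch using only Python's standard library. It is not how Simple Scraper works internally. The sample HTML is hand-written for illustration, and the markup it assumes (a title stored in a link's `title` attribute inside an `h3`, and a price in a `p` with the class `price_color`) is an assumption about the page that may not match the live site.

```python
from html.parser import HTMLParser

# Hand-written sample with made-up books. The tag and class names are
# assumptions about the page markup, not something this tutorial confirms.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-one/index.html" title="Example Book One">Example Book ...</a></h3>
  <p class="price_color">£10.00</p>
</article>
<article class="product_pod">
  <h3><a href="book-two/index.html" title="Example Book Two">Example Book ...</a></h3>
  <p class="price_color">£12.50</p>
</article>
"""


class BookParser(HTMLParser):
    """Collects a 'Title' and a 'Price' for every book in the markup."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._in_h3 = False
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # The full title lives in the link's title attribute.
            self.books.append({"Title": attrs.get("title", ""), "Price": ""})
        elif tag == "p" and attrs.get("class") == "price_color":
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        elif tag == "p":
            self._in_price = False

    def handle_data(self, data):
        # Text inside the price paragraph belongs to the most recent book.
        if self._in_price and self.books:
            self.books[-1]["Price"] += data.strip()


parser = BookParser()
parser.feed(SAMPLE_HTML)
books = parser.books

for book in books:
    print(book["Title"], book["Price"])
```

Each dictionary in `books` plays the role of one row in Simple Scraper's results, with 'Title' and 'Price' as the two named properties.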

2. Running The Scraper

To run the scraper, click - 'View Results'.

Once the scraper has run, Simple Scraper will return the selected data.

3. Saving Your Settings

Before configuring your crawler, you must save the scraper's settings.

To save the scraper's settings, click - 'Save Recipe'. When saving a recipe, you will have to confirm the settings for the scraper. For this project, the following information was entered.

  • Recipe Name - 'Books To Scrape'.

  • URL - 'http://books.toscrape.com/catalogue/page-1.html'.

  • Selected Properties - 'Title' and 'Price'.

  • Page Navigation - Leave that as it is.

Once you've entered the information, click - 'Create Recipe'.

4. Crawler Setup

Click on the recipe you saved under 'My Recipes'. Then, click - 'Crawl'.

Insert the URLs you'd like to crawl. For this project, you're going to crawl the following URLs:

  • http://books.toscrape.com/catalogue/page-1.html

  • http://books.toscrape.com/catalogue/page-2.html

  • http://books.toscrape.com/catalogue/page-3.html

  • http://books.toscrape.com/catalogue/page-4.html

  • http://books.toscrape.com/catalogue/page-5.html
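The five URLs above differ only in the page number, so if you ever need a longer list, you could generate it rather than type it out. Here's a minimal sketch. The helper name `build_page_urls` is made up for this example, and the result is meant to be pasted into the crawler's URL box.

```python
from typing import List

# URL pattern taken from the list above; only the page number changes.
BASE_URL = "http://books.toscrape.com/catalogue/page-{}.html"


def build_page_urls(first_page: int, last_page: int) -> List[str]:
    """Return one catalogue URL per page number, including both ends."""
    return [BASE_URL.format(number) for number in range(first_page, last_page + 1)]


urls = build_page_urls(1, 5)

# Print one URL per line, ready to copy into the crawler.
for url in urls:
    print(url)
```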

5. Running The Crawler

With the URLs entered, click - 'Run Recipe'. Simple Scraper will crawl the URLs and return the selected data - the data being the 'Title' and the 'Price' of each book.

You can view the output on the 'Results' page. As you will see, the crawler will have gone through five pages and returned the data that you selected - 'Title' and 'Price'.

You can view that data as a table or as JSON. At the bottom of the page, you have the option of downloading the data as either a CSV file or a JSON file.
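Conceptually, a crawl is the same extraction repeated for every URL in the list. The sketch below shows that idea and makes no claim about Simple Scraper's actual implementation. To keep it runnable offline, `fake_fetch` stands in for a real page download and returns hand-written HTML with made-up books. The `crawl` and `extract` helpers and the `price_color` class name are assumptions made for this example.

```python
import re

# Hand-written pages standing in for real downloads (made-up books).
FAKE_PAGES = {
    "http://books.toscrape.com/catalogue/page-1.html":
        '<a title="Example Book One"></a><p class="price_color">£10.00</p>',
    "http://books.toscrape.com/catalogue/page-2.html":
        '<a title="Example Book Two"></a><p class="price_color">£12.50</p>',
}

# Assumed markup: a title attribute followed by a price paragraph.
BOOK_PATTERN = re.compile(r'title="([^"]+)".*?class="price_color">([^<]+)<', re.S)


def fake_fetch(url):
    """Return the stored HTML for a URL instead of making a request."""
    return FAKE_PAGES[url]


def extract(html):
    """Pull every (title, price) pair out of one page of HTML."""
    return [{"Title": title, "Price": price}
            for title, price in BOOK_PATTERN.findall(html)]


def crawl(urls, fetch, extract_records):
    """Visit each URL in order and gather the records found on every page."""
    results = []
    for url in urls:
        for record in extract_records(fetch(url)):
            results.append({"url": url, **record})
    return results


results = crawl(list(FAKE_PAGES), fake_fetch, extract)
for row in results:
    print(row)
```

Passing the fetch step in as a function keeps the loop itself independent of how pages are downloaded, which is why the sketch can run without a network connection.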
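To see how the same rows can end up as either JSON or CSV, here's a small sketch using Python's standard library. The two rows are made-up placeholders rather than real crawl output, and the exact layout of Simple Scraper's own downloads may differ.

```python
import csv
import io
import json

# Placeholder rows shaped like the crawler's 'Title' and 'Price' results.
rows = [
    {"Title": "Example Book One", "Price": "£10.00"},
    {"Title": "Example Book Two", "Price": "£12.50"},
]

# JSON: a list of objects, one per book.
json_text = json.dumps(rows, indent=2, ensure_ascii=False)

# CSV: a header line followed by one line per book.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Title", "Price"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(json_text)
print(csv_text)
```

JSON keeps each book as a self-describing object, which suits other programs, while CSV opens directly in a spreadsheet.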
