Table Of Contents
1. What Is Simple Scraper?
2. What Is Books To Scrape?
3. What You’ll Learn In This (No-Code) Tutorial
4. Setting Up Your Web Scraper
5. Running Your Web Scraper
6. Saving Your Web Scraper
7. Setting Up Your Web Crawler
8. Running Your Crawler
9. Next Steps
What Is Simple Scraper?
Simple Scraper is a Google Chrome extension that makes web crawling easy. It helps you extract data from websites without writing any code. You can crawl locally or in the cloud, and every website that you crawl instantly becomes an API. In short, Simple Scraper is a simple yet powerful web crawling tool.
What Is Books To Scrape?
Books To Scrape is a web crawling sandbox by Scrapinghub. The website is a fictional bookstore, ready for you to crawl. It provides a safe place for beginners to learn the fundamentals of web crawling.
What You’ll Learn In This (No-Code) Tutorial
By the end of this tutorial, you will have created a web crawler in Simple Scraper that extracts data from a website (Books To Scrape).
Setting Up Your Web Scraper
Go to Books To Scrape.
Open Simple Scraper, and click the plus (+) sign.
First, you’ll want to scrape the titles: select the title of any book. Everything that gets highlighted will be extracted. Name this data ‘Title’. Then, click the tick to confirm the selection for when you run the scraper.
Second, you’ll want to scrape the price of each book. Again, click the plus (+) sign. Then, select the price of a book. Everything that gets highlighted will be extracted. Name this data ‘Price’, and click the tick to confirm the selection for when you run the scraper.
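You don't need any code to follow this tutorial, but if you're curious what selecting ‘Title’ and ‘Price’ amounts to, here is a minimal Python sketch. It is not how Simple Scraper works internally. The sample markup, the `price_color` class name and the book data are assumptions made up for illustration, and the live page may be structured differently.

```python
from html.parser import HTMLParser

# Made-up markup loosely modelled on a book listing (an assumption;
# the real page may use different tags and class names).
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="book-1.html" title="Example Book One">Example Book ...</a></h3>
  <p class="price_color">£10.00</p>
</article>
<article class="product_pod">
  <h3><a href="book-2.html" title="Example Book Two">Example Book ...</a></h3>
  <p class="price_color">£12.50</p>
</article>
"""


class BookParser(HTMLParser):
    """Collects one {'Title', 'Price'} record per book."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._in_h3 = False
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # The full title is read from the link's title attribute.
            self.books.append({"Title": attrs.get("title", ""), "Price": ""})
        elif tag == "p" and "price_color" in (attrs.get("class") or "").split():
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_h3 = False
        elif tag == "p":
            self._in_price = False

    def handle_data(self, data):
        # Text inside the price paragraph belongs to the latest book.
        if self._in_price and self.books:
            self.books[-1]["Price"] += data.strip()


def extract_books(html):
    """Return a list of {'Title', 'Price'} dicts found in the HTML."""
    parser = BookParser()
    parser.feed(html)
    parser.close()
    return parser.books


books = extract_books(SAMPLE_HTML)
```

Each selected property becomes a key in every record, which is roughly what the ‘Title’ and ‘Price’ columns represent in the extension.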
Running Your Web Scraper
To run your scraper, click ‘View Results’.
Once the web scraper has run, Simple Scraper will return the selected data. You can view that data in a table or as JSON, and you can download it as either a CSV or a JSON file.
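To get a feel for the two download formats, the sketch below serialises the same records as JSON and as CSV using only Python's standard library. The records are made-up placeholders, and the exact layout of Simple Scraper's own exports may differ.

```python
import csv
import io
import json

# Placeholder records shaped like the scraper's 'Title' and 'Price' columns.
records = [
    {"Title": "Example Book One", "Price": "£10.00"},
    {"Title": "Example Book Two", "Price": "£12.50"},
]


def to_json(rows):
    """Serialise the rows as a JSON array of objects."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["Title", "Price"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


json_text = to_json(records)
csv_text = to_csv(records)
```

JSON keeps each book as a self-describing object, while CSV is the easier format to open in a spreadsheet.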
Saving Your Web Scraper
You must save the settings for your scraper before configuring your crawler.
To save your scraper, click ‘Save Recipe’.
You’ll have to confirm the settings for your scraper when saving it. The settings used for this project are:
- Recipe Name - 'Books To Scrape'
- URL - 'https://books.toscrape.com/'
- Selected Properties - 'Title' and 'Price'
- Page Navigation - Leave that as it is
Once you’ve entered the settings, click ‘Create Recipe’.
Setting Up Your Web Crawler
Click on the recipe you saved under ‘My Recipes’.
Then, click ‘Crawl’.
Insert the URLs you want to crawl. For this project, they are as follows:
Running Your Crawler
To run your crawler, click ‘Run Recipe’.
Once the web crawler has run, Simple Scraper will return the selected data. You can view the output of your crawler on the ‘Results’ page.
You’ll notice that Simple Scraper has crawled through five pages and returned the selected data. You’ll be given the option to view that data in a table or as JSON, and you’ll have the option of downloading the data too.
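Conceptually, a crawler applies the same scraper to a list of URLs and merges the results. The Python sketch below illustrates that idea under stated assumptions: `crawl`, `fetch_with_urllib` and `fake_extract` are hypothetical names, and the demo uses in-memory pages with `example.com` addresses so that it runs without network access.

```python
import urllib.request
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def fetch_with_urllib(url: str) -> str:
    """Download a page over HTTP (defined for reference, not called below)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def crawl(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    extract: Callable[[str], List[Row]],
) -> List[Row]:
    """Run the same extractor on every URL and merge the rows."""
    results: List[Row] = []
    for url in urls:
        html = fetch(url)
        for row in extract(html):
            # Record which page each row came from.
            results.append({"URL": url, **row})
    return results


# Hypothetical in-memory pages in a toy 'Title|Price' format.
FAKE_PAGES = {
    "https://example.com/page-1": "Example Book One|£10.00",
    "https://example.com/page-2": "Example Book Two|£12.50",
}


def fake_extract(page: str) -> List[Row]:
    """Stand-in extractor for the toy page format above."""
    title, price = page.split("|")
    return [{"Title": title, "Price": price}]


rows = crawl(FAKE_PAGES, FAKE_PAGES.__getitem__, fake_extract)
```

Passing `fetch` and `extract` in as arguments keeps the loop itself independent of any particular website, which mirrors how one saved recipe can be reused across several URLs.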
Next Steps
Congratulations on completing this tutorial. Now, why not challenge yourself? Try implementing one of the suggestions below, or come up with your own.
- Crawl more pages on Books To Scrape.
- Crawl the rating for each book.
- Schedule your web crawler to run automatically.
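As a hint for the rating suggestion: in Simple Scraper you would select the star rating the same way you selected the title and price. If you later want to turn the result into a number, the Python sketch below shows one way, assuming the rating is stored as a word in a class name such as `star-rating Three`. That markup is an assumption for illustration, so check the page before relying on it.

```python
import re

# Maps the rating word (assumed to appear in the class name) to a number.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}

# Assumed markup: <p class="star-rating Three">...</p>
RATING_PATTERN = re.compile(r'class="star-rating (\w+)"')


def extract_ratings(html):
    """Return the numeric rating for each matching element, in page order."""
    words = RATING_PATTERN.findall(html)
    return [RATING_WORDS[word] for word in words if word in RATING_WORDS]


sample = '<p class="star-rating Three"></p><p class="star-rating Five"></p>'
ratings = extract_ratings(sample)
```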