Download specific book category from Hindawi organization's website in various formats.

McCdama/scrapySpider

DOWNLOAD PDF FILES WITH SCRAPY CRAWL SPIDER

Download specific book category from Hindawi organization's website in various formats.

Prerequisites

Downloading

Clone the repository with SSH or HTTPS:

  • SSH
git clone git@github.com:McCdama/scrapySpider.git
  • HTTPS
git clone https://github.com/McCdama/scrapySpider.git

Installing

Open a terminal inside the project's folder and install Scrapy:

pip install scrapy

Running Project

Run:

scrapy crawl hindawi

Additional

To scrape another category, change the category link on both Line 11 and Line 14 of hindawi.py to the link of the desired category.

Downloaded files are saved in the "DownloadsFolders" folder in the project root.
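As a rough illustration of where a file ends up, the sketch below maps a file URL to a path under "DownloadsFolders". It is not taken from hindawi.py: the helper name `local_path_for`, the constant `DOWNLOAD_DIR`, and the example URL are hypothetical, and it assumes the file keeps the name it has in the URL.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse

# Folder name taken from this README; treating it as relative to the
# project root is an assumption.
DOWNLOAD_DIR = Path("DownloadsFolders")


def local_path_for(url: str, download_dir: Path = DOWNLOAD_DIR) -> Path:
    """Return the local path a file fetched from `url` would be saved to.

    Hypothetical helper: assumes the saved file keeps the last path
    segment of the URL as its name.
    """
    # Take only the path part of the URL (no query string) and decode %xx escapes.
    name = PurePosixPath(unquote(urlparse(url).path)).name
    if not name:
        raise ValueError(f"URL has no file name: {url!r}")
    return download_dir / name


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(local_path_for("https://example.com/books/12345.pdf"))
```

Running the script from the project root prints the relative path of the file inside "DownloadsFolders".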

The supported file formats are .kfx, .pdf and .epub.

To open .epub files, download ADE (Adobe Digital Editions).

To download files with a different extension, change the extension name on Line 21 of hindawi.py.
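The idea of selecting files by extension can be sketched as follows. This is a minimal, standalone illustration and not the code in hindawi.py: the names `SUPPORTED_EXTENSIONS`, `WANTED_EXTENSION`, and `has_wanted_extension` are hypothetical, and the example URLs are placeholders.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats listed in this README.
SUPPORTED_EXTENSIONS = {".kfx", ".pdf", ".epub"}

# Hypothetical setting standing in for the extension name edited in hindawi.py.
WANTED_EXTENSION = ".pdf"


def has_wanted_extension(url: str, extension: str = WANTED_EXTENSION) -> bool:
    """Return True if the file that `url` points to ends with `extension`."""
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported extension: {extension!r}")
    # Compare only the path part of the URL, ignoring query strings and case.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix == extension


if __name__ == "__main__":
    # Hypothetical links, used only for illustration.
    links = [
        "https://example.com/books/1.pdf",
        "https://example.com/books/1.epub",
        "https://example.com/books/1.kfx",
    ]
    print([link for link in links if has_wanted_extension(link)])
```

Changing `WANTED_EXTENSION` to ".epub" or ".kfx" would select the other formats instead, which mirrors the single-line change described above.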
