Download the books of a specific category from the Hindawi organization's website in various formats.
Clone the repository with SSH or HTTPS:
- SSH
git clone git@github.com:McCdama/scrapySpider.git
- HTTPS
git clone https://github.com/McCdama/scrapySpider.git
Open a terminal inside the project's folder and install Scrapy:
pip install scrapy
Run:
scrapy crawl hindawi
To scrape another category, open hindawi.py and change both Line 11 and Line 14 to the link of the desired category.
The downloaded files are saved in the "DownloadsFolders" directory in the project root.
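To check what a crawl actually produced, a small standalone script can count the files in the download directory by extension. This is only a sketch and not part of the repository: the function name `count_downloads` is hypothetical, and it assumes the script is run from the project root so that the relative path "DownloadsFolders" resolves.

```python
from collections import Counter
from pathlib import Path


def count_downloads(folder="DownloadsFolders"):
    """Count files under `folder` (recursively), grouped by lowercase extension.

    Hypothetical helper for illustration; returns an empty Counter if the
    folder does not exist yet (for example, before the first crawl).
    """
    root = Path(folder)
    if not root.is_dir():
        return Counter()
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Print one line per extension found, e.g. ".pdf: 12".
    for ext, n in sorted(count_downloads().items()):
        print(f"{ext or '(no extension)'}: {n}")
```

Running it after `scrapy crawl hindawi` gives a quick sanity check that the expected format was downloaded.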
The supported file formats are .kfx, .pdf, and .epub.
To open .epub files, download ADE (Adobe Digital Editions).
To download files in a different format, change the extension name on Line 21 of hindawi.py.
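The effect of changing the extension can be pictured as filtering the download links found on a page by their file suffix. The sketch below is an illustration only and does not reproduce the code in hindawi.py: `select_links`, `SUPPORTED_EXTENSIONS`, and the example.com URLs are all assumptions made for the example.

```python
from urllib.parse import urlparse

# The three formats listed in this README.
SUPPORTED_EXTENSIONS = {".kfx", ".pdf", ".epub"}


def select_links(links, extension=".pdf"):
    """Return only the links whose URL path ends with `extension`.

    Hypothetical helper: the real spider may select links differently.
    """
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension!r}")
    # Compare against the path only, so query strings do not interfere.
    return [url for url in links if urlparse(url).path.lower().endswith(ext)]


if __name__ == "__main__":
    # Placeholder URLs, not real Hindawi links.
    found = [
        "https://example.com/books/1.pdf",
        "https://example.com/books/1.epub",
        "https://example.com/books/1.kfx",
    ]
    print(select_links(found, "epub"))
```

Normalizing the extension up front means "epub" and ".EPUB" are treated the same, and an unsupported value fails early instead of silently downloading nothing.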