Web Iota

Iota is a web scraper that finds all of the images and links (sub-URLs) on a webpage. It is built with the Python libraries Selenium, Requests, and Beautiful Soup.

Iota 1

Scrapes images and links
Uses Requests and Beautiful Soup
Cannot run JavaScript, so content generated by scripts is not found
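
The Requests plus Beautiful Soup approach listed above can be sketched roughly as follows. This is a minimal illustration, not the code in iota1.py: the function names `extract_images_and_links` and `scrape` are hypothetical, and it assumes images come from `<img src>` and links from `<a href>`.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def extract_images_and_links(html, base_url):
    """Return (image_urls, link_urls) found in html, resolved against base_url.

    Hypothetical helper for illustration; not taken from iota1.py.
    """
    soup = BeautifulSoup(html, "html.parser")
    # <img src="..."> gives image locations; <a href="..."> gives links.
    images = [urljoin(base_url, tag["src"]) for tag in soup.find_all("img", src=True)]
    links = [urljoin(base_url, tag["href"]) for tag in soup.find_all("a", href=True)]
    return images, links


def scrape(url):
    """Fetch url with Requests and extract its images and links."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Only the HTML returned by the server is parsed; no scripts are executed.
    return extract_images_and_links(response.text, url)
```

Because only the raw server response is parsed, anything a page adds through JavaScript after loading is invisible to this approach.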

Iota 2

Requires the Selenium PhantomJS driver
Uses Requests, Selenium, and Beautiful Soup
Can run JavaScript, so script-generated content is found
Can scrape most websites that use anti-scraping measures
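
The browser-driven approach listed above might look something like the sketch below. It is an assumption-laden illustration rather than the code in iota2.py: the function names and the `phantomjs.exe` default path are hypothetical, and `webdriver.PhantomJS` is only available in older Selenium releases (it was removed in Selenium 4).

```python
from bs4 import BeautifulSoup


def make_phantomjs_driver(executable_path="phantomjs.exe"):
    """Start a PhantomJS-backed driver (hypothetical helper).

    Needs an older Selenium release that still ships webdriver.PhantomJS.
    """
    # Imported here so the parsing helper below works without Selenium installed.
    from selenium import webdriver

    return webdriver.PhantomJS(executable_path=executable_path)


def rendered_image_sources(driver, url):
    """Load url in the browser, then parse the DOM after scripts have run."""
    driver.get(url)
    # page_source reflects the document as the browser currently sees it,
    # including elements that JavaScript inserted.
    soup = BeautifulSoup(driver.page_source, "html.parser")
    return [tag["src"] for tag in soup.find_all("img", src=True)]
```

A caller would create the driver once with `make_phantomjs_driver()`, pass it to `rendered_image_sources`, and call `driver.quit()` when finished so the browser process does not linger.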

Usage

Run python iota1.py -h to see the available options:

usage: iota.py [-h] [-img] [-all_img] [-link] [url]

positional arguments:
  url         The URL of the target website/webpage

optional arguments:
  -h, --help  show this help message and exit
  -img        Find all of the image on the webpage
  -all_img    Find all of the image on the webpage and subwebpages
  -link       Find all of the suburls/links on the webpage

Example: python iota2.py -img https://www.w3schools.com/html/html_classes.asp
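
For readers curious how such a command line is typically wired up, the options in the help text above could be declared with `argparse` along these lines. This is a sketch under assumptions: the `build_parser` name and the `prog` value are illustrative, and the actual scripts may define their arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options in the help text (illustrative only)."""
    parser = argparse.ArgumentParser(prog="iota.py")
    # nargs="?" makes the positional optional, matching "[url]" in the usage line.
    parser.add_argument("url", nargs="?", help="The URL of the target website/webpage")
    # Each flag is a simple on/off switch.
    parser.add_argument("-img", action="store_true",
                        help="Find all of the image on the webpage")
    parser.add_argument("-all_img", action="store_true",
                        help="Find all of the image on the webpage and subwebpages")
    parser.add_argument("-link", action="store_true",
                        help="Find all of the suburls/links on the webpage")
    return parser
```

Parsing `["-img", "https://www.w3schools.com/html/html_classes.asp"]` with this parser sets `args.img` to `True` and `args.url` to the given address, leaving the other flags `False`.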
