Crawler integration with INSPIRE-HEP using the Scrapy project HEPCrawl.
This module schedules crawler jobs on a Scrapyd instance that serves a Scrapy project. By default, that project is HEPCrawl.
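As a rough illustration of what scheduling a job on Scrapyd involves, the sketch below posts to Scrapyd's `schedule.json` endpoint using only the Python standard library. It is not this module's own code. The helper names `build_schedule_request` and `schedule_job`, the project name `hepcrawl`, the spider name `example_spider`, and the Scrapyd address are all assumptions made for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(scrapyd_url, project, spider, **spider_args):
    """Build a POST request for Scrapyd's schedule.json endpoint.

    Hypothetical helper: `project` and `spider` are the two required
    parameters; any extra keyword is forwarded to the spider as an argument.
    """
    payload = {"project": project, "spider": spider}
    payload.update(spider_args)
    data = urlencode(payload).encode("utf-8")
    url = scrapyd_url.rstrip("/") + "/schedule.json"
    return Request(url, data=data, method="POST")


def schedule_job(request, timeout=10):
    """Send the request and return Scrapyd's decoded JSON reply.

    Hypothetical helper: requires a reachable Scrapyd instance.
    """
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a running Scrapyd instance (assumed here to listen on `http://localhost:6800`), `schedule_job(build_schedule_request("http://localhost:6800", "hepcrawl", "example_spider"))` would return Scrapyd's JSON reply, which on success carries a `status` field and the `jobid` of the scheduled crawl.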
The module also integrates directly with the invenio-workflows module, creating a workflow for every record harvested by the crawler.
This module is meant to be used only with the INSPIRE-HEP overlay. Use at your own risk.
Full documentation is hosted here: http://pythonhosted.org/inspire-crawler/
See also the HEPCrawl documentation: http://pythonhosted.org/hepcrawl/