Metadata-Version: 2.1
Name: ioweb
Version: 0.0.8
Summary: Web Scraping Framework
Home-page: https://github.com/lorien/ioweb
Author: Gregory Petukhov
Author-email: lorien@lorien.name
Maintainer: Gregory Petukhov
Maintainer-email: lorien@lorien.name
License: MIT
Download-URL: https://github.com/lorien/ioweb/releases
Description: ## IOWeb Framework
        
        Python framework to build web crawlers.
        
        What we have at the moment:
        
         * system designed to run large number network threads (like 100 or 500) on
            on CPU core
         * built-in feature to combine things in chunks and then doing something with
            chunks (like mongodb bulk write)
         * asynchronous things are powered by gevent
         * network requests are handled with urllib3
         * urllib3 monkey-patched to extract cert details
         * urllib3 monkey-patched to not do domain resolving if domain IP has been provided
         * built-in stat module to count events, built-in logging into influxdb
         * retrying on errors
         * no tests
         * no documentation
        
        I am using ioweb to do bulk web scraping like crawling 500M pages in few days.
        
        
        ## Places to talk
        
         * [t.me/grablab](https://t.me/grablab) - English chat about web scraping
         * [t.me/grablab_ru](https://t.me/grablab_ru) - Russian chat about web scraping
        
Keywords: web scraping network crawling cralwer spider pycurl
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Description-Content-Type: text/markdown
