Metadata-Version: 2.1
Name: okami
Version: 0.2.0
Summary: A high-level web scraping framework
Home-page: https://github.com/ambrozic/okami
Author: ambrozic
Author-email: ambrozic@gmail.com
Maintainer: ambrozic
Maintainer-email: ambrozic@gmail.com
License: BSD
Project-URL: Code, https://github.com/ambrozic/okami
Project-URL: Documentation, https://ambrozic.github.io/okami
Keywords: scraping framework
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Environment :: Web Environment
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Internet
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Intended Audience :: Developers
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Provides-Extra: docs
Provides-Extra: tests
Requires-Dist: aiohttp (<4.0,>=3.2.0)
Requires-Dist: attrs (<19.0,>=18.1.0)
Requires-Dist: click (<7.0,>=6.0)
Requires-Dist: lxml (<5.0,>=4.0)
Requires-Dist: multidict (<5.0,>=4.0)
Requires-Dist: sqlitedict (<1.6,>=1.5)
Requires-Dist: mkdocs-material (==3.0.3); extra == 'docs'
Requires-Dist: mkdocs (==1.0.1); extra == 'docs'
Requires-Dist: pygments (==2.2.0); extra == 'docs'
Requires-Dist: pymdown-extensions (==4.12); extra == 'docs'
Requires-Dist: codecov (==2.0.15); extra == 'tests'
Requires-Dist: flake8 (==3.5.0); extra == 'tests'
Requires-Dist: markupsafe (==1.0); extra == 'tests'
Requires-Dist: pipdeptree (==0.13.0); extra == 'tests'
Requires-Dist: pytest-asyncio (==0.9.0); extra == 'tests'
Requires-Dist: pytest-cov (==2.5.1); extra == 'tests'
Requires-Dist: pytest-freezegun (==0.2.0); extra == 'tests'
Requires-Dist: pytest (==3.7.1); extra == 'tests'
Requires-Dist: pyyaml (==3.13); extra == 'tests'

# Okami

[![](https://img.shields.io/badge/docs-github-blue.svg)](https://ambrozic.github.io/okami)
[![](https://img.shields.io/pypi/pyversions/okami.svg)](https://pypi.python.org/pypi/okami)
[![](https://img.shields.io/pypi/v/okami.svg)](https://pypi.python.org/pypi/okami)
[![](https://img.shields.io/pypi/wheel/okami.svg)](https://pypi.python.org/pypi/okami)
[![](https://travis-ci.org/ambrozic/okami.svg?branch=master)](https://travis-ci.org/ambrozic/okami)
[![](https://codecov.io/github/ambrozic/okami/coverage.svg?branch=master)](https://codecov.io/github/ambrozic/okami)
[![](https://img.shields.io/pypi/l/okami.svg)](https://pypi.python.org/pypi/okami)

Okami is a high-level web scraping framework built entirely for Python 3.6+. It uses the asynchronous model provided by the standard library [asyncio](https://docs.python.org/3/library/asyncio.html) module, with [aiohttp](https://docs.aiohttp.org) as the networking layer and [lxml](http://lxml.de) for parsing data.

The architecture is entirely modular, and the main components can be swapped out and replaced with custom implementations.
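
To give a feel for the asynchronous model, here is a minimal sketch using only the standard library. It is not Okami's API: `fetch`, `process` and `crawl` are hypothetical names, and `fetch` is a stub that stands in for a real HTTP request made with a client such as aiohttp. Note that `asyncio.run` requires Python 3.7 or newer.

```python
import asyncio


async def fetch(url):
    # Hypothetical stand-in for a network request; a real spider would
    # perform the HTTP call with a client such as aiohttp here.
    await asyncio.sleep(0.01)
    return "<html><title>{}</title></html>".format(url)


async def process(url, limit):
    # The semaphore caps how many pages are in flight at the same time.
    async with limit:
        body = await fetch(url)
    return url, len(body)


async def crawl(urls, concurrency=2):
    limit = asyncio.Semaphore(concurrency)
    # gather() runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*(process(u, limit) for u in urls))


results = asyncio.run(crawl(["/a", "/b", "/c"]))
```

While one page waits on the network, the event loop is free to work on the others, which is why a single process can keep many requests in flight.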

## Features

- complete website-wide page processing
- full scraping mode, or delta mode that scrapes only unvisited pages
- immediate, on-demand or real-time page processing over an HTTP API
- single page processing via the command line
- lots of pipelines, middlewares and signals

Spiders are simple to implement. Take a look at an example [here](https://github.com/ambrozic/okami/blob/master/okami/example.py#L14-L53).
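
The difference between full and delta mode can be illustrated with a small sketch. This is only an assumption about the general idea, not Okami's implementation: `VisitedStore` and `select_pages` are hypothetical names, and the store uses the standard library `sqlite3` module to remember which URLs were already visited.

```python
import sqlite3


class VisitedStore:
    """Hypothetical persistent set of visited URLs (not part of Okami)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS visited (url TEXT PRIMARY KEY)"
        )

    def seen(self, url):
        row = self.db.execute(
            "SELECT 1 FROM visited WHERE url = ?", (url,)
        ).fetchone()
        return row is not None

    def mark(self, url):
        self.db.execute(
            "INSERT OR IGNORE INTO visited (url) VALUES (?)", (url,)
        )
        self.db.commit()


def select_pages(urls, store, delta=True):
    # Full mode returns every URL; delta mode skips URLs already visited.
    todo = [u for u in urls if not (delta and store.seen(u))]
    for u in todo:
        store.mark(u)
    return todo


store = VisitedStore()
first = select_pages(["/a", "/b"], store)
second = select_pages(["/a", "/b", "/c"], store)
full = select_pages(["/a", "/b", "/c"], store, delta=False)
```

In this sketch the second delta run selects only the page that was not seen before, while the full run selects every page again.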


## Quick start

- Install okami

  - `pip install okami`

- Run example web server

  - `OKAMI_SETTINGS=okami.cfg.example okami example server`

Open [localhost:8000](http://localhost:8000) and browse around a little. Quite a remarkable website. We will run our example spider against this website shortly and process a few items.

- Run example spider

  - `OKAMI_SETTINGS=okami.cfg.example okami example spider`

The example spider has now started, and you can see it processing pages. Take a look at the example spider implementation [here](https://github.com/ambrozic/okami/blob/master/okami/example.py#L14-L53).


## Documentation

Read the rest of the documentation [here](https://ambrozic.github.io/okami).


## License

Okami is licensed under a three-clause BSD License. The full license text can be found [here](https://github.com/ambrozic/okami/blob/master/license).


