Metadata-Version: 2.1
Name: pa-scraper
Version: 0.1.2
Summary: Python wrapper for Prompt API's Scraper API
Home-page: https://github.com/promptapi/scraper-py
Author: Prompt API
Author-email: hello@promptapi.com
License: MIT
Project-URL: Prompt API, https://promptapi.com
Project-URL: Scraper API, https://promptapi.com/marketplace/description/scraper-api
Project-URL: Source, https://github.com/promptapi/scraper-py
Keywords: promptapi,scrape,extract,parse,download
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Text Processing :: General
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: requests
Provides-Extra: development
Requires-Dist: vb-console ; extra == 'development'

![Python](https://img.shields.io/badge/python-3.7.4-green.svg)
![Version](https://img.shields.io/badge/version-0.1.2-orange.svg)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Build Status](https://travis-ci.org/promptapi/scraper-py.svg?branch=main)](https://travis-ci.org/promptapi/scraper-py)

# Prompt API - Scraper API - Python Package

`pa-scraper` is a Python wrapper for the [Scraper API][scraper-api] with a few
extras on top.

## Requirements

1. You need to sign up for [Prompt API][promptapi-signup]
1. You need to subscribe to the [Scraper API][scraper-api]; the test drive is **free!**
1. You need to set the `PROMPTAPI_TOKEN` environment variable after subscribing.

Then install the package:

```bash
$ pip install pa-scraper
```
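
Since the package relies on the `PROMPTAPI_TOKEN` environment variable, it can help to check for it before making any calls. The snippet below is a minimal sketch; `token_is_set` is a hypothetical helper and is not part of `pa-scraper`.

```python
import os


def token_is_set(env=None):
    """Return True if PROMPTAPI_TOKEN is present and non-empty.

    `env` defaults to os.environ; passing a plain dict makes the
    check easy to test. This helper is illustrative only.
    """
    env = os.environ if env is None else env
    return bool(env.get("PROMPTAPI_TOKEN", "").strip())


if not token_is_set():
    print("PROMPTAPI_TOKEN is not set; export it before using pa-scraper")
```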

---

## Example Usage

Examples can be found [here][examples].

```python
from scraper import Scraper

url = 'https://pypi.org/classifiers/'
scraper = Scraper(url)
response = scraper.get()

if response.get('error', None):
    # response['error']  returns error message
    # response['status'] returns http status code
    # {'error': 'Not Found', 'status': 404}
    print(response)
else:
    result = response['result']

    print(result['headers'])   # returns response headers 
    print(result['data'])      # returns fetched html
    print(result['url'])       # returns fetched url
    print(response['status'])  # returns http status code

    save_result = scraper.save('/tmp/my-html.html')  # save to file
    if save_result.get('error', None):
        # we have save error
        pass
    else:
        print(save_result)    # contains saved file path and file size
        # {'file': '/tmp/my-html.html', 'size': 321322}
```
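
If you prefer exceptions over checking for the `error` key each time, the two response shapes shown above can be wrapped in a small helper. This is only a sketch: `ScrapeError` and `unwrap` are hypothetical names, not part of the package, and the sample dictionaries simply mirror the shapes from the example.

```python
class ScrapeError(Exception):
    """Hypothetical exception carrying the error message and HTTP status."""

    def __init__(self, message, status=None):
        super().__init__(message)
        self.status = status


def unwrap(response):
    """Return response['result'], or raise ScrapeError if 'error' is set."""
    error = response.get("error")
    if error:
        raise ScrapeError(error, response.get("status"))
    return response["result"]


# Sample dictionaries shaped like the responses documented above.
ok = {"result": {"headers": {}, "data": "<html></html>", "url": "https://pypi.org/classifiers/"}, "status": 200}
failed = {"error": "Not Found", "status": 404}

print(unwrap(ok)["url"])

try:
    unwrap(failed)
except ScrapeError as exc:
    print(exc, exc.status)
```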

You can pass URL parameters for extra operations. Valid parameters are:

- `auth_password`: password for HTTP Realm auth
- `auth_username`: username for HTTP Realm auth
- `cookie`: URL-encoded cookie header
- `country`: 2-character country code, for scraping from an IP address in a specific country
- `referer`: HTTP referer header

```python
from scraper import Scraper

url = 'https://pypi.org/classifiers/'
scraper = Scraper(url)

# fetch from an IP address located in Estonia
fetch_params = dict(country='EE')
response = scraper.get(params=fetch_params)

if response.get('error', None):
    # response['error']  returns error message
    # response['status'] returns http status code
    # {'error': 'Not Found', 'status': 404}
    print(response)
else:
    result = response['result']
    status = response['status']

    print(result['headers'])   # returns response headers
    print(result['data'])      # returns fetched html
    print(result['url'])       # returns fetched url
    print(status)              # returns http status code

    save_result = scraper.save('/tmp/my-html.html')  # save to file
    if save_result.get('error', None):
        # we have save error
        pass
    else:
        print(save_result)    # contains saved file path and file size
        # {'file': '/tmp/my-html.html', 'size': 321322}
```
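
To catch typos in parameter names before a request is sent, you could build the `params` dictionary through a small validator. The sketch below is not part of `pa-scraper`: `build_fetch_params` and `encode_cookie` are hypothetical helpers, and it assumes that "URL-encoded" means percent-encoding the raw cookie string yourself. If the wrapper or the API already encodes the value, skip `encode_cookie`.

```python
from urllib.parse import quote

# Parameter names taken from the list above.
VALID_PARAMS = {"auth_password", "auth_username", "cookie", "country", "referer"}


def encode_cookie(raw_cookie):
    """Percent-encode a raw cookie string (assumption: caller must pre-encode)."""
    return quote(raw_cookie, safe="")


def build_fetch_params(**params):
    """Validate parameter names and the country code length, then return a dict."""
    unknown = set(params) - VALID_PARAMS
    if unknown:
        raise ValueError("unsupported parameter(s): " + ", ".join(sorted(unknown)))
    country = params.get("country")
    if country is not None and len(country) != 2:
        raise ValueError("country must be a 2-character code")
    return dict(params)


fetch_params = build_fetch_params(
    country="EE",
    cookie=encode_cookie("session=abc; theme=dark"),
)
print(fetch_params)
```

The resulting dictionary can then be passed as `scraper.get(params=fetch_params)`.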

---

## TODO

- Add `xpath` extractor.

## License

This project is licensed under the MIT License.

---

## Contributor(s)

* [Prompt API](https://github.com/promptapi) - Creator, maintainer

---

## Contribute

All PRs are welcome!

1. `fork` (https://github.com/promptapi/scraper-py/fork)
1. Create your `branch` (`git checkout -b my-feature`)
1. `commit` your changes (`git commit -am 'Add awesome features...'`)
1. `push` your `branch` (`git push origin my-feature`)
1. Then create a new **Pull Request**!

This project is intended to be a safe,
welcoming space for collaboration, and contributors are expected to adhere to
the [code of conduct][coc].

---

[scraper-api]:      https://promptapi.com/marketplace/description/scraper-api
[promptapi-signup]: https://promptapi.com/#signup-form
[coc]:              https://github.com/promptapi/scraper-py/blob/main/CODE_OF_CONDUCT.md
[examples]:         https://github.com/promptapi/scraper-py/blob/main/examples/


