Metadata-Version: 2.1
Name: webarchiver
Version: 0.8.0
Summary: Python tool that allows you to take multiple full page screenshots of web pages without ads.
Home-page: https://github.com/Knuckles-Team/webarchiver
Author: Audel Rouhi
Author-email: knucklessg1@gmail.com
License: Unlicense
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: Public Domain
Classifier: Environment :: Console
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: Pillow (>=9.3.0)
Requires-Dist: beautifulsoup4 (>=4.11.2)
Requires-Dist: piexif (>=1.1.3)
Requires-Dist: selenium (>=4.7.2)
Requires-Dist: webdriver-manager (>=3.8.5)

# Webarchiver
*Version: 0.8.0*

Python tool that takes full-page screenshots of web pages without ads.

Supports batching: list multiple links in a text file, or pass them on the command line separated by commas.
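As a rough illustration of the two batching inputs, the sketch below prepares a list of URLs either as a links file or as a comma-separated string. The helper names `write_links_file` and `join_links` are hypothetical and not part of webarchiver, and the one-URL-per-line file layout is an assumption; check the project source if your file is not picked up.

```python
from pathlib import Path


def write_links_file(urls, path):
    """Write the URLs to a text file, one per line.

    One URL per line is an assumed layout, not a documented format.
    """
    cleaned = [u.strip() for u in urls if u.strip()]
    target = Path(path)
    target.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return target


def join_links(urls):
    """Join the URLs with commas and no spaces, the form passed to --links."""
    return ",".join(u.strip() for u in urls if u.strip())
```

The file path would then be passed with `--file` and the joined string with `--links`.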

### Requirements:

One of the following browsers or browser backends:

- Chrome/Chromium browser
- Firefox
- Selenoid Server

### Usage:
| Short Flag | Long Flag    | Description                                                 |
|------------|--------------|-------------------------------------------------------------|
| -h         | --help       | See Usage                                                   |
| -b         | --browser    | Specify browser: Chrome / Firefox / Selenoid                |
| -c         | --clean      | Convert mobile sites to regular site                        |
| -d         | --directory  | Location where the images will be saved                     |
|            | --dpi        | DPI for the image                                           |
| -e         | --executor   | Execution environment: Local / Selenoid Host\|Selenoid URL  |
| -f         | --file       | Text file to read the URLs from                             |
| -l         | --links      | Comma separated URLs (No spaces)                            |
| -i         | --image-type | Save images as PNG or JPEG                                  |
| -t         | --threads    | Number of threads to run concurrently                       |
| -u         | --url-filter | Filter URLs that contain this string                        |
| -z         | --zoom       | The zoom to use on the browser                              |


### Example:
```bash
webarchiver -c -f <links_file.txt> -l "<URL1,URL2,URL3>" -i <JPEG/PNG> -d "~/Downloads" -z 100 --dpi 1
```

```bash
webarchiver -c -f <links_file.txt> -l "<URL1,URL2,URL3>" -i <JPEG/PNG> -d "~/Downloads" -z 100 --dpi 1 --executor "selenoid|http://selenoid.com/wd/hub" --browser "Chrome"
```
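When calling the tool from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags listed in the usage table; `build_command` and its defaults are hypothetical and not part of webarchiver.

```python
import shlex


def build_command(links=None, links_file=None, directory=".", image_type="PNG",
                  threads=None, executor=None, browser=None):
    """Assemble a webarchiver argument list from the documented flags.

    Hypothetical helper: the defaults here are illustrative, not the tool's own.
    """
    cmd = ["webarchiver", "-d", directory, "-i", image_type]
    if links_file:
        cmd += ["-f", str(links_file)]
    if links:
        # --links expects comma-separated URLs with no spaces
        cmd += ["-l", ",".join(links)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if executor:
        cmd += ["--executor", executor]
    if browser:
        cmd += ["--browser", browser]
    return cmd


cmd = build_command(
    links=["https://example.com", "https://example.org"],
    directory="/tmp/shots",
    image_type="JPEG",
    threads=2,
)
# Print the shell-quoted command for inspection before running it.
print(shlex.join(cmd))
```

To run it, pass the list to `subprocess.run(cmd, check=True)`. Using a list rather than a single string avoids shell quoting problems with the `|` in a Selenoid executor value.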

#### Install Instructions
Install Python Package

```bash
python -m pip install webarchiver
```

#### Build Instructions
Build Python Package

```bash
sudo chmod +x ./*.py
pip install .
python setup.py bdist_wheel --universal
# Test Pypi
twine upload --repository-url https://test.pypi.org/legacy/ dist/* --verbose -u "Username" -p "Password"
# Prod Pypi
twine upload dist/* --verbose -u "Username" -p "Password"
```


