Metadata-Version: 2.4
Name: sanitizr
Version: 1.0.0
Summary: Clean URLs by removing tracking parameters and decoding redirects
Author: Sanitizr Contributors
License: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/Jordonh18/sanitizr
Project-URL: Bug Tracker, https://github.com/Jordonh18/sanitizr/issues
Project-URL: Documentation, https://jordonh18.github.io/sanitizr
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: yaml
Requires-Dist: pyyaml>=6.0; extra == "yaml"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.12; extra == "dev"
Requires-Dist: flake8>=6.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: pyyaml>=6.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.4; extra == "docs"
Requires-Dist: mkdocs-material>=9.0; extra == "docs"
Requires-Dist: mkdocstrings>=0.19; extra == "docs"
Requires-Dist: mkdocstrings-python>=0.9; extra == "docs"
Dynamic: license-file

# Sanitizr - URL Cleaner

[![Python Tests](https://github.com/Jordonh18/sanitizr/actions/workflows/python-tests.yml/badge.svg)](https://github.com/Jordonh18/sanitizr/actions/workflows/python-tests.yml)
[![License: GPL-3.0](https://img.shields.io/badge/License-GPL%203.0-blue.svg)](https://opensource.org/licenses/GPL-3.0)

A modular URL-cleaning library and CLI tool that removes tracking parameters and decodes redirects.

## Features

- 🧹 Clean URLs by removing tracking parameters
- 🔄 Decode redirect URLs (Google, Facebook, etc.)
- ⚙️ Customizable parameter whitelisting/blacklisting
- 🧰 Supports both Python API and CLI usage
- 📋 Process URLs from clipboard, files, or standard input
- 🔧 Configurable via JSON or YAML files

## Installation

You can install Sanitizr from PyPI:

```bash
pip install sanitizr
```

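For a development setup, run the following from a clone of the repository to install in editable mode with the `dev` extra: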

```bash
pip install -e ".[dev]"
```

## Quick Start

### Command Line

```bash
# Clean a single URL
cleanurl -u "https://example.com?id=123&utm_source=newsletter"

# Clean URLs from a file
cleanurl -i urls.txt -o cleaned_urls.txt

# Clean URLs from stdin
cat urls.txt | cleanurl > cleaned_urls.txt

# Use verbose output to see the changes
cleanurl -u "https://example.com?id=123&utm_source=newsletter" -v
```

### Python API

```python
from sanitizr.cleanurl import URLCleaner

cleaner = URLCleaner()
clean_url = cleaner.clean_url("https://example.com?id=123&utm_source=newsletter")
print(clean_url)  # https://example.com?id=123
```
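
To make the idea of tracking-parameter removal concrete, here is a minimal standard-library sketch. It is not Sanitizr's implementation: `strip_tracking`, `TRACKING_PREFIXES`, and `TRACKING_NAMES` are hypothetical names, and the short parameter list is an assumption chosen only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed for illustration only; Sanitizr maintains its own parameter list.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking(url: str) -> str:
    """Return url with query parameters that look like trackers removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_NAMES
    ]
    # Rebuild the URL with only the surviving parameters, in original order.
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking("https://example.com?id=123&utm_source=newsletter"))
# https://example.com?id=123
```

For real use, prefer `URLCleaner`, which also handles redirects and configuration.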

## Configuration

Sanitizr can be configured via JSON or YAML files (the optional `yaml` extra installs PyYAML). For example:

```yaml
# config.yaml
tracking_params:
  - custom_tracker
  - another_tracker
redirect_params:
  custom.com:
    - redirect
    - goto
whitelist_params:
  - keep_this_param
blacklist_params:
  - remove_this_param
```
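
If you prefer JSON, the same settings can be generated from Python. This is a sketch that assumes the JSON file uses the same keys and nesting as the YAML example; that assumption is not confirmed here, so check the project documentation before relying on it.

```python
import json

# Same settings as the YAML example; key names are copied from it.
# Assumption: the JSON loader expects the identical structure.
config = {
    "tracking_params": ["custom_tracker", "another_tracker"],
    "redirect_params": {"custom.com": ["redirect", "goto"]},
    "whitelist_params": ["keep_this_param"],
    "blacklist_params": ["remove_this_param"],
}

config_json = json.dumps(config, indent=2)
print(config_json)
```

Saving the printed output as `config.json` gives a file with the same content as `config.yaml`.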

Pass a configuration file to the CLI with the `--config` option:

```bash
cleanurl -u "https://example.com?id=123&custom_tracker=abc" --config config.yaml
```
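
The `redirect_params` mapping pairs a domain with the query parameters that may carry the real destination. One plausible reading of such a rule is sketched below with the standard library. `decode_redirect` is a hypothetical helper, and the exact-host matching is an assumption rather than Sanitizr's documented behavior.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


def decode_redirect(url: str, redirect_params: Dict[str, List[str]]) -> str:
    """Return the redirect target embedded in url, or url itself if none is found."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Assumption: rules apply only when the host matches the key exactly.
    for name in redirect_params.get(parts.hostname or "", []):
        values = query.get(name)
        if values:
            return values[0]  # parse_qs has already percent-decoded the value
    return url


rules = {"custom.com": ["redirect", "goto"]}
wrapped = "https://custom.com/out?goto=https%3A%2F%2Fexample.com%2Fpage%3Fid%3D123"
print(decode_redirect(wrapped, rules))
# https://example.com/page?id=123
```

URLs on hosts without a rule, or without any of the listed parameters, are returned unchanged.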

## License

Sanitizr is licensed under the GNU General Public License v3.0 or later. See the [LICENSE](LICENSE) file for details.
