Metadata-Version: 2.4
Name: krawly
Version: 1.0.0
Summary: Official Python SDK for Krawly — AI-powered web scraping platform
Author-email: Krawly <support@krawly.io>
License: MIT
Project-URL: Homepage, https://krawly.io
Project-URL: Documentation, https://docs.krawly.io
Project-URL: Repository, https://github.com/krawly/krawly-python
Project-URL: Bug Tracker, https://github.com/krawly/krawly-python/issues
Keywords: web-scraping,AI,scraper,data-extraction,krawly,YAML,automation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Internet :: WWW/HTTP :: Dynamic Content
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: local
Requires-Dist: playwright>=1.40.0; extra == "local"
Requires-Dist: beautifulsoup4>=4.12.0; extra == "local"
Requires-Dist: lxml>=4.9.0; extra == "local"
Requires-Dist: curl-cffi>=0.5.0; extra == "local"
Dynamic: license-file

# Krawly — AI-Powered Web Scraping SDK

[![PyPI version](https://badge.fury.io/py/krawly.svg)](https://pypi.org/project/krawly/)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

**Turn any website into structured data with AI.** No complex selectors, no external API keys — just describe what you want in plain English.

## Installation

```bash
pip install krawly
```

The package metadata also declares an optional `local` extra (Playwright, BeautifulSoup, lxml, and curl-cffi), which can be installed with `pip install "krawly[local]"`.

## Quick Start

```python
from krawly import Krawly

# Initialize with your API key (get one at https://krawly.io)
client = Krawly(api_key="sai_your_key_here")

# One-line scraping — the simplest way
result = client.scrape(
    "https://books.toscrape.com",
    "Get all book titles and prices"
)

for item in result.data:
    print(f"{item['title']}: {item['price']}")

print(f"Total: {result.row_count} items")
```

## Features

- 🤖 **AI-Powered**: Describe what you want in plain English and the AI handles the rest
- 🔧 **No External Keys**: Only your Krawly API key is needed; no Claude or OpenAI keys
- 📦 **Config Management**: Save, list, download, and reuse scraping configs
- 🚀 **Server Execution**: Run scrapers on Krawly's cloud infrastructure
- 📁 **Local YAML**: Read, write, and upload YAML configs from local files
- 📊 **Progress Tracking**: Real-time progress callbacks during scraping

## Usage

### One-Line Scraping

```python
result = client.scrape("https://example.com/products", "Get product names, prices, and ratings")
print(result.data)  # [{"name": "...", "price": "...", "rating": "..."}]
```
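
Because `result.data` is shown above as a list of dictionaries, it can be written to CSV with the standard library alone. The following is a minimal sketch under that assumption: `rows_to_csv` is a hypothetical helper, not part of the SDK, and the sample rows stand in for a real result.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a list of dicts (the shape of result.data) to CSV text.

    Hypothetical helper for illustration; not part of the Krawly SDK.
    """
    # Collect column names in first-seen order, since rows may differ in keys.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)  # Missing keys are filled with restval ("")
    return buffer.getvalue()


# Sample rows standing in for result.data from a real scrape.
sample = [
    {"name": "Widget", "price": "9.99", "rating": "4.5"},
    {"name": "Gadget", "price": "19.99"},
]
csv_text = rows_to_csv(sample)
print(csv_text)
```

With a live client, the same helper could be called as `rows_to_csv(result.data)` and the returned text written to a file.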

### Step-by-Step Control

```python
# Step 1: Generate a config
job = client.generate("https://example.com/products", "Get all product details")

# Step 2: Wait with progress updates
def on_progress(status):
    print(f"[{status.progress}%] {status.status_message}")

final = client.wait_for_completion(job.job_id, on_progress=on_progress)
print(f"Config generated: {final.config_name}")
print(final.yaml_content)

# Step 3: Run the scraper
run = client.run(final.config_id)
result = client.wait_and_get_results(run.job_id)
print(f"Scraped {result.row_count} items")
```
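
If the progress callback fires often, the output can get noisy. One possible approach is a small factory that reports only when progress has advanced by a minimum step. This is a sketch, not SDK functionality: `make_progress_logger` is a hypothetical name, and it assumes `status.progress` is a number and `status.status_message` is a string, as the callback above suggests.

```python
from types import SimpleNamespace


def make_progress_logger(min_step=10, emit=print):
    """Return an on_progress callback that reports every `min_step` percent.

    Hypothetical helper; assumes status exposes numeric .progress and
    a .status_message string.
    """
    last_reported = None

    def on_progress(status):
        nonlocal last_reported
        progress = status.progress
        first_call = last_reported is None
        # Report the first update, any sufficiently large jump, and completion.
        if first_call or progress - last_reported >= min_step or progress >= 100:
            emit(f"[{progress}%] {status.status_message}")
            last_reported = progress

    return on_progress


# Simulated updates standing in for the status objects the SDK passes in.
messages = []
callback = make_progress_logger(min_step=25, emit=messages.append)
for pct in (0, 5, 10, 30, 40, 60, 100):
    callback(SimpleNamespace(progress=pct, status_message="working"))

print(messages)
```

The returned function could then be passed as `on_progress=make_progress_logger(min_step=20)` to `client.wait_for_completion`.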

### Config Management

```python
# List all your configs
configs = client.list_configs()
for c in configs:
    print(f"{c.name} — {c.target_url}")

# Get a specific config
config = client.get_config("config-uuid-here")
print(config.yaml_content)

# Create a new config
config = client.create_config(
    name="My Scraper",
    target_url="https://example.com",
    prompt="Get all items",
    yaml_content="url: https://example.com\n..."
)

# Delete a config
client.delete_config("config-uuid-here")
```

### Local YAML Files

```python
# Read a local YAML file and run it on the server
result = client.scrape_with_file("my_config.yaml")
print(result.data)

# Download a config from server to local file
client.download_config("config-uuid-here", "downloaded_config.yaml")

# Upload a local YAML file to the server
config = client.upload_config("my_config.yaml", name="My Config")
print(f"Uploaded as: {config.id}")

# Load and parse YAML locally
content = Krawly.load_yaml("config.yaml")
parsed = Krawly.parse_yaml(content)
```
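
To run several local configs in one go, the files can be collected with `pathlib` and handed to a runner one at a time. The sketch below is illustrative only: `find_yaml_configs` and `run_all` are hypothetical helpers, and a stand-in function replaces `client.scrape_with_file` so the example runs without an API key.

```python
import tempfile
from pathlib import Path


def find_yaml_configs(directory):
    """Return sorted paths of .yaml/.yml files directly inside `directory`."""
    root = Path(directory)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in {".yaml", ".yml"}
    )


def run_all(directory, run_one):
    """Call run_one(path) for each config file and map filename -> result.

    In real use, run_one would typically be client.scrape_with_file.
    """
    return {path.name: run_one(str(path)) for path in find_yaml_configs(directory)}


# Demonstration with temporary files and a stand-in runner.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("books.yaml", "quotes.yml", "notes.txt"):
        Path(tmp, name).write_text("url: https://example.com\n", encoding="utf-8")
    results = run_all(tmp, run_one=lambda path: f"ran {Path(path).name}")

print(results)
```

With a live client, this would be called as `run_all("configs/", client.scrape_with_file)`, where `configs/` is a placeholder directory name.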

### Run YAML Content Directly

```python
yaml_content = """
url: https://books.toscrape.com
selectors:
  items: article.product_pod
  fields:
    title: h3 a::attr(title)
    price: .price_color::text
"""

result = client.scrape_with_yaml(yaml_content)
for book in result.data:
    print(book)
```

### Account Info

```python
info = client.me()
print(f"Plan: {info.plan}")
print(f"Credits remaining: {info.generations_remaining}/{info.generations_limit}")
```

## Error Handling

```python
from krawly import Krawly
from krawly.client import AuthenticationError, QuotaExceededError, RateLimitError, KrawlyError

client = Krawly(api_key="sai_your_key_here")

try:
    result = client.scrape("https://example.com", "Get data")
except AuthenticationError:
    print("Invalid API key")
except QuotaExceededError:
    print("No credits remaining — upgrade your plan")
except RateLimitError:
    print("Too many requests — try again later")
except KrawlyError as e:
    print(f"API error: {e}")
```
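
Rather than only printing a message on a rate limit, a caller may prefer to retry after a delay. The sketch below shows one common pattern, exponential backoff. It is not SDK behavior: `call_with_retry` is a hypothetical helper, and `FakeRateLimitError` stands in for `RateLimitError` so the example runs on its own.

```python
import time


def call_with_retry(func, retry_on, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on `retry_on` exceptions with exponential backoff.

    Hypothetical helper; delays are base_delay * 1, 2, 4, ... seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))


class FakeRateLimitError(Exception):
    """Stand-in for krawly.client.RateLimitError in this demonstration."""


attempts = {"count": 0}
delays = []  # Records requested delays instead of actually sleeping


def flaky():
    """Fail twice with a rate-limit error, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("slow down")
    return "ok"


outcome = call_with_retry(flaky, FakeRateLimitError, sleep=delays.append)
print(outcome, delays)
```

With the real client, the call might look like `call_with_retry(lambda: client.scrape(url, prompt), RateLimitError)`, leaving the other exceptions to propagate as before.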

## Plans & Pricing

| Plan | Credits | Server Execution | Price |
|------|---------|-------------------|-------|
| Free | 3/month | ✗ | $0 |
| Starter | 20/month | ✓ | $15/mo |
| Pro | 100/month | ✓ | $29/mo |

All plans include API, SDK, and Chrome Extension access.

Get your API key at [krawly.io](https://krawly.io).

## Documentation

Full documentation: [docs.krawly.io](https://docs.krawly.io)

## License

MIT
