Metadata-Version: 2.4
Name: xnoxs-engine
Version: 1.1.0
Summary: A Python library for searching across multiple search engines (Google, Bing, DuckDuckGo, Yahoo, Mojeek, Brave) with caching, rate limiting, and proxy support
Project-URL: Homepage, https://github.com/developerxnoxs/SearchEngine
Project-URL: Documentation, https://github.com/developerxnoxs/SearchEngine#readme
Project-URL: Repository, https://github.com/developerxnoxs/SearchEngine.git
Project-URL: Issues, https://github.com/developerxnoxs/SearchEngine/issues
Project-URL: Changelog, https://github.com/developerxnoxs/SearchEngine/blob/main/CHANGELOG.md
Author: Multi Search Engine Contributors
Maintainer: Multi Search Engine Contributors
License: MIT License
        
        Copyright (c) 2024 Multi Search Engine Contributors
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: bing,brave,duckduckgo,google,mojeek,scraper,scraperapi,search,search-engine,web-scraping,yahoo
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: build>=1.3.0
Requires-Dist: requests>=2.28.0
Requires-Dist: twine>=6.2.0
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: flake8>=6.0.0; extra == 'dev'
Requires-Dist: isort>=5.12.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pre-commit>=3.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: markdown>=3.4.0; extra == 'docs'
Requires-Dist: pdoc3>=0.10.0; extra == 'docs'
Provides-Extra: publish
Requires-Dist: build>=1.0.0; extra == 'publish'
Requires-Dist: twine>=4.0.0; extra == 'publish'
Description-Content-Type: text/markdown

# Multi Search Engine

[![PyPI version](https://badge.fury.io/py/xnoxs-engine.svg)](https://badge.fury.io/py/xnoxs-engine)
[![Python Versions](https://img.shields.io/pypi/pyversions/xnoxs-engine.svg)](https://pypi.org/project/xnoxs-engine/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/developerxnoxs/SearchEngine/actions/workflows/ci.yml/badge.svg)](https://github.com/developerxnoxs/SearchEngine/actions/workflows/ci.yml)
[![codecov](https://codecov.io/gh/developerxnoxs/SearchEngine/branch/main/graph/badge.svg)](https://codecov.io/gh/developerxnoxs/SearchEngine)
[![Downloads](https://pepy.tech/badge/xnoxs-engine)](https://pepy.tech/project/xnoxs-engine)

A Python library for searching multiple search engines through a uniform interface. It supports Google, Bing, DuckDuckGo, Yahoo, Mojeek, and Brave, with caching, rate limiting, and proxy support.

## Features

- **6 Search Engines**: Google, Bing, DuckDuckGo, Yahoo, Mojeek, Brave
- **Uniform Interface**: The same API for every search engine
- **One-Line Search**: `quick_search()` for ease of use
- **Visit URL**: Fetch page content from search results
- **Caching**: File-based and in-memory caches
- **Rate Limiting**: Built-in rate limiter with exponential backoff
- **Proxy Support**: Works with custom proxies and ScraperAPI
- **Result Filtering**: Filter by keyword or domain, or limit the number of results
- **Export Options**: Export to JSON or to a list of dictionaries
- **Type Hints**: Full type annotations for better IDE support
- **Error Handling**: Comprehensive exception handling

## Prerequisites

- **Python 3.9** or newer
- **pip** (usually installed alongside Python)

Check your Python version:
```bash
python --version  # Must be 3.9 or newer
```

## Installation

```bash
pip install xnoxs-engine
```

## Quick Start

### The Easiest Way (One Line!)

```python
from SearchEngine import quick_search

# Quick search in a single line
results = quick_search("Python programming")

# With an explicit engine and result count
results = quick_search("machine learning", engine="brave", num_results=5)
```

### Standard Usage

```python
from SearchEngine import DuckDuckGoSearch

# Create a search engine instance
ddg = DuckDuckGoSearch()

# Run a search
results = ddg.search("Python programming", num_results=10)

# Process the results
for result in results:
    print(f"Title: {result.title}")
    print(f"URL: {result.url}")
    print(f"Description: {result.description}")
    print()
```

### With a Context Manager

```python
from SearchEngine import DuckDuckGoSearch, FileCache

# Resources are cleaned up automatically on exit
with DuckDuckGoSearch(cache=FileCache()) as ddg:
    results = ddg.search("Python tutorial")
    for result in results:
        print(result.title)
```

## Supported Search Engines

| Engine | Class | Without Proxy | With ScraperAPI | Recommendation |
|--------|-------|---------------|-----------------|----------------|
| DuckDuckGo | `DuckDuckGoSearch` | Stable | Stable | Direct (no proxy) |
| Yahoo | `YahooSearch` | Stable | Stable | Direct (no proxy) |
| Mojeek | `MojeekSearch` | Stable | Stable | Direct (no proxy) |
| Brave | `BraveSearch` | Stable | Stable | Direct (no proxy) |
| Bing | `BingSearch` | May require a captcha | OK | Use ScraperAPI |
| Google | `GoogleSearch` | Blocked | OK | ScraperAPI required |

> **Note:** Google and Bing actively block automated requests. Use ScraperAPI for reliable results.
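
The recommendations in the table can be turned into a small selection rule. The sketch below is only an illustration: `pick_engines` and the two constants are hypothetical names, not part of the library, and the engine name strings are assumed to match those returned by `get_available_engines()`.

```python
from typing import List, Optional

# Engines the table lists as stable without a proxy.
DIRECT_ENGINES = ["duckduckgo", "yahoo", "mojeek", "brave"]
# Engines the table recommends routing through ScraperAPI.
PROXIED_ENGINES = ["bing", "google"]


def pick_engines(scraper_api_key: Optional[str] = None) -> List[str]:
    """Hypothetical helper: list the engines that are reasonable to query."""
    engines = list(DIRECT_ENGINES)
    if scraper_api_key:
        # Only include Bing and Google when a ScraperAPI key is available.
        engines.extend(PROXIED_ENGINES)
    return engines


print(pick_engines())
print(pick_engines("YOUR_API_KEY"))
```

The resulting list could then be passed as the `engines` argument of `search_all_engines`, assuming that function accepts these names.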

## Usage Examples

### Basic Search

```python
from SearchEngine import DuckDuckGoSearch, BingSearch, GoogleSearch

# DuckDuckGo (no proxy needed)
ddg = DuckDuckGoSearch()
results = ddg.search("Python programming", num_results=10)

# With search parameters
results = ddg.search(
    query="machine learning",
    page=1,
    num_results=10,
    language="id",
    country="ID",
    safe_search=True
)
```

### Using ScraperAPI (for Google/Bing)

```python
from SearchEngine import GoogleSearch

google = GoogleSearch(scraper_api_key="YOUR_API_KEY")
results = google.search("Python programming")

for result in results:
    print(f"{result.title}: {result.url}")
```

### Caching Results

```python
from SearchEngine import DuckDuckGoSearch, FileCache, MemoryCache

# File-based cache (persistent)
file_cache = FileCache(cache_dir=".search_cache", default_ttl=3600)
ddg = DuckDuckGoSearch(cache=file_cache)

# In-memory cache (current session only)
mem_cache = MemoryCache(default_ttl=600)
ddg = DuckDuckGoSearch(cache=mem_cache)

# Results are cached automatically
results = ddg.search("Python tutorial")
```
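
To show what a `default_ttl` means in practice, here is a minimal sketch of a time-to-live cache. It is not the library's `MemoryCache`; the class name `TTLCache`, its methods, and the cache key format are assumptions made for illustration. The clock is injectable so that expiry can be demonstrated without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Illustrative in-memory cache whose entries expire after a TTL in seconds."""

    def __init__(self, default_ttl: float = 600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.default_ttl = default_ttl
        self._clock = clock  # injectable, so tests can control time
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self.default_ttl if ttl is None else ttl
        self._store[key] = (self._clock() + lifetime, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value


# Demonstration with a fake clock (hypothetical cache key format).
now = [0.0]
cache = TTLCache(default_ttl=600, clock=lambda: now[0])
cache.set("duckduckgo:python tutorial:1", ["cached result"])
hit = cache.get("duckduckgo:python tutorial:1")
now[0] = 601.0  # move past the TTL
miss = cache.get("duckduckgo:python tutorial:1")
print(hit, miss)
```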

### Rate Limiting

```python
from SearchEngine import BingSearch, RateLimiter

limiter = RateLimiter(
    requests_per_minute=10,
    min_delay=1.0,
    max_delay=60.0
)

bing = BingSearch(rate_limiter=limiter)
results = bing.search("web development")
```
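
The feature list mentions exponential backoff. One common way to compute such delays from `min_delay` and `max_delay` is sketched below. This is an assumption about the general technique, not the library's actual `RateLimiter` logic, and `backoff_delay` is a hypothetical name.

```python
def backoff_delay(attempt: int, min_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): doubles each time, capped at max_delay."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return min(max_delay, min_delay * (2 ** attempt))


delays = [backoff_delay(n) for n in range(8)]
print(delays)
```

With the values used above, the delay doubles from 1 second until it reaches the 60 second cap. Production implementations often add random jitter so that many clients do not retry at the same moment.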

### Using a Custom Proxy

```python
from SearchEngine import YahooSearch

yahoo = YahooSearch(proxy="http://proxy.example.com:8080")
results = yahoo.search("technology news")
```

### Filtering Results

```python
from SearchEngine import BraveSearch

brave = BraveSearch()
results = brave.search("tutorial programming", num_results=20)

# Filter by keyword
python_results = brave.filter_by_keyword("python")

# Filter by domain
github_results = brave.filter_by_domain("github.com")

# Limit the number of results
top_5 = brave.limit_results(5)
```
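
For readers who want to see what keyword and domain filtering might involve, the sketch below applies both to plain dictionaries. The matching rules (case-insensitive substring match on title and description, host equal to the domain or a subdomain of it) are assumptions; the library's own methods may behave differently. The sample data is invented.

```python
from typing import Dict, List
from urllib.parse import urlparse

Result = Dict[str, str]


def filter_by_keyword(results: List[Result], keyword: str) -> List[Result]:
    """Keep results whose title or description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        r for r in results
        if needle in r["title"].lower() or needle in r["description"].lower()
    ]


def filter_by_domain(results: List[Result], domain: str) -> List[Result]:
    """Keep results hosted on the domain or one of its subdomains."""
    domain = domain.lower()
    matched = []
    for r in results:
        host = (urlparse(r["url"]).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            matched.append(r)
    return matched


# Invented sample data for illustration only.
sample = [
    {"title": "CPython source", "url": "https://github.com/python/cpython",
     "description": "The Python programming language"},
    {"title": "Rust Book", "url": "https://doc.rust-lang.org/book/",
     "description": "Learn Rust"},
    {"title": "Example gist", "url": "https://gist.github.com/example",
     "description": "Snippets"},
]

print([r["title"] for r in filter_by_keyword(sample, "python")])
print([r["title"] for r in filter_by_domain(sample, "github.com")])
```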

### Exporting Results

```python
from SearchEngine import DuckDuckGoSearch

ddg = DuckDuckGoSearch()
results = ddg.search("data science")

# Export to a list of dictionaries
data = ddg.to_dict_list()

# Export to a JSON string
json_str = ddg.to_json(indent=2)

# Save to a file
with open("results.json", "w") as f:
    f.write(ddg.to_json())
```

### Searching All Engines at Once

```python
from SearchEngine import search_all_engines

# Search every engine (DuckDuckGo, Yahoo, Mojeek, Brave)
result = search_all_engines("Python tutorial")

# Iterate over the engines that succeeded
for engine, items in result.items():
    print(f"{engine}: {len(items)} results")

# Check for and handle per-engine errors
if result.has_errors():
    for engine, error in result.errors.items():
        print(f"{engine} failed: {error}")

# Select specific engines
result = search_all_engines("AI", engines=["duckduckgo", "brave"])

# Raise an exception if any engine fails
result = search_all_engines("AI", raise_on_error=True)
```
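
The pattern of collecting per-engine results and errors separately can be sketched without the library. Everything below is hypothetical: `search_many` and the stub engines exist only to illustrate the idea, and the real `search_all_engines` may be implemented differently.

```python
from typing import Callable, Dict, List, Tuple

SearchFn = Callable[[str], List[str]]


def search_many(query: str, engines: Dict[str, SearchFn]) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Run every engine; keep successes and failures in separate dictionaries."""
    results: Dict[str, List[str]] = {}
    errors: Dict[str, str] = {}
    for name, search in engines.items():
        try:
            results[name] = search(query)
        except Exception as exc:  # one failing engine must not stop the others
            errors[name] = str(exc)
    return results, errors


# Stub engines standing in for real search calls.
def working_engine(query: str) -> List[str]:
    return [f"result for {query}"]


def blocked_engine(query: str) -> List[str]:
    raise RuntimeError("captcha detected")


results, errors = search_many("AI", {"stub_ok": working_engine, "stub_blocked": blocked_engine})
print(results)
print(errors)
```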

### Listing Available Engines

```python
from SearchEngine import get_available_engines

engines = get_available_engines()
print(engines)  # ['google', 'bing', 'duckduckgo', 'yahoo', 'mojeek', 'brave']
```

### Visiting Search Result URLs

A new feature for fetching the content of URLs found in search results:

```python
from SearchEngine import quick_search, visit_url

# Option 1: From a search result
results = quick_search("Python tutorial")
page = results[0].visit()  # Visit the first URL

print(f"Title: {page.title}")
print(f"Content: {page.text[:500]}")  # First 500 characters

# Option 2: Directly from a URL
page = visit_url("https://python.org")
print(f"Title: {page.title}")
print(f"Status: {page.status_code}")

# Check for errors
if not page.success:
    print(f"Error: {page.error}")

# Preview the text at a given length
print(page.get_text_preview(200))  # 200 characters
```

**PageContent properties:**

| Property | Type | Description |
|----------|------|-------------|
| `url` | str | Page URL |
| `title` | str | Page title |
| `text` | str | Clean text (HTML removed) |
| `html` | str | Raw HTML |
| `status_code` | int | HTTP status code |
| `success` | bool | True if the request succeeded |
| `error` | str | Error message (on failure) |
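
The difference between the `html`, `text`, and `title` properties can be illustrated with the standard library alone. The sketch below is an assumption about how clean text might be derived from raw HTML; the library depends on BeautifulSoup and may do this quite differently. `TextExtractor` is a hypothetical name.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect the page title and visible text, skipping script and style blocks."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self._chunks: List[str] = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        stripped = data.strip()
        if self._in_title:
            self.title += stripped
        elif not self._skip_depth and stripped:
            self._chunks.append(stripped)

    @property
    def text(self) -> str:
        return " ".join(self._chunks)


raw_html = (
    "<html><head><title>Example Page</title><style>p{color:red}</style></head>"
    "<body><p>Hello <b>world</b></p><script>var x = 1;</script>"
    "<p>Second paragraph</p></body></html>"
)
parser = TextExtractor()
parser.feed(raw_html)
parser.close()
print(parser.title)
print(parser.text)
```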

### Error Handling

```python
from SearchEngine import (
    DuckDuckGoSearch,
    NetworkException,
    ParseException,
    BlockedException
)

ddg = DuckDuckGoSearch()

try:
    results = ddg.search("query")
except NetworkException as e:
    print(f"Network error: {e}")
except ParseException as e:
    print(f"Failed to parse results: {e}")
except BlockedException as e:
    print(f"Blocked by the search engine: {e}")
```
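
A related pattern is to fall back to another engine when one fails. The sketch below uses stub functions and a hypothetical `SearchFailed` exception so that it runs on its own. With the real library you would probably catch its specific exceptions rather than the broad `Exception` used here.

```python
from typing import Callable, List, Tuple

SearchFn = Callable[[str], List[str]]


class SearchFailed(Exception):
    """Hypothetical error raised when every engine in the chain has failed."""


def first_successful(query: str, engines: List[Tuple[str, SearchFn]]) -> Tuple[str, List[str]]:
    """Try engines in order and return the name and results of the first that works."""
    failures = []
    for name, search in engines:
        try:
            return name, search(query)
        except Exception as exc:  # narrow this to specific exceptions in real code
            failures.append(f"{name}: {exc}")
    raise SearchFailed("; ".join(failures))


# Stub engines standing in for real search calls.
def flaky_engine(query: str) -> List[str]:
    raise ConnectionError("connection error")


def stable_engine(query: str) -> List[str]:
    return [f"result for {query}"]


used, found = first_successful("query", [("flaky", flaky_engine), ("stable", stable_engine)])
print(used, found)
```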

## API Reference

### SearchEngine Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `user_agent` | str | Random | Custom user agent string |
| `proxy` | str | None | Proxy URL |
| `timeout` | int | 30 | Request timeout in seconds |
| `delay` | float | 1.0 | Delay between requests |
| `cache` | CacheInterface | None | Cache instance |
| `rate_limiter` | RateLimiter | None | Rate limiter instance |
| `scraper_api_key` | str | None | ScraperAPI API key |

### Search Method Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | str | Required | Search query |
| `page` | int | 1 | Page number |
| `num_results` | int | 10 | Results per page |
| `language` | str | None | Language code (e.g. 'id', 'en') |
| `country` | str | None | Country code (e.g. 'ID', 'US') |
| `safe_search` | bool | True | Enable safe search |
| `use_cache` | bool | True | Use cached results |

### SearchResult Properties

| Property | Type | Description |
|----------|------|-------------|
| `title` | str | Result title |
| `url` | str | Result URL |
| `description` | str | Result description/snippet |
| `position` | int | Position in the search results |
| `engine` | str | Search engine name |
| `extra` | dict | Additional metadata |
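
The fields in the table map naturally onto a dataclass, which also makes dictionary and JSON export straightforward. The sketch below is a stand-in, not the library's `SearchResult`: `ResultRecord` and the module-level `to_json` function are hypothetical, and the sample values are invented.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ResultRecord:
    """Illustrative record with the same fields as the table above."""
    title: str
    url: str
    description: str
    position: int
    engine: str
    extra: Dict[str, Any] = field(default_factory=dict)


def to_json(records: List[ResultRecord], indent: Optional[int] = None) -> str:
    """Serialize records to a JSON array, keeping non-ASCII characters readable."""
    return json.dumps([asdict(r) for r in records], indent=indent, ensure_ascii=False)


record = ResultRecord(
    title="Welcome to Python.org",
    url="https://www.python.org/",
    description="The official home of the Python Programming Language",
    position=1,
    engine="duckduckgo",
)
json_text = to_json([record], indent=2)
print(json_text)
```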

## Project Structure

```
SearchEngine/
├── SearchEngine/
│   ├── __init__.py
│   ├── base.py              # Base class and SearchResult
│   ├── cache.py             # FileCache and MemoryCache
│   ├── rate_limiter.py      # RateLimiter
│   ├── exceptions.py        # Custom exceptions
│   └── engines/
│       ├── __init__.py
│       ├── google.py
│       ├── bing.py
│       ├── duckduckgo.py
│       ├── yahoo.py
│       ├── mojeek.py
│       └── brave.py
├── tests/
├── .github/
│   └── workflows/
├── pyproject.toml
├── README.md
├── LICENSE
├── CHANGELOG.md
└── CONTRIBUTING.md
```

## Example Output

```python
>>> from SearchEngine import quick_search
>>> results = quick_search("Python", num_results=2)
>>> results[0].title
'Welcome to Python.org'
>>> results[0].url
'https://www.python.org/'
>>> results[0].description
'The official home of the Python Programming Language...'

>>> # Visit the page
>>> page = results[0].visit()
>>> page.success
True
>>> page.title
'Welcome to Python.org'
```

## Troubleshooting

| Problem | Cause | Solution |
|---------|-------|----------|
| `ModuleNotFoundError: No module named 'SearchEngine'` | Library not installed | Run `pip install xnoxs-engine` |
| `NetworkException: Connection error` | No internet connection | Check your internet connection |
| `BlockedException: Captcha detected` | IP blocked by the search engine | Use a proxy or ScraperAPI |
| Empty results from Google | Google is blocking the request | Use the `scraper_api_key` parameter |
| `TimeoutError` | Server too slow | Pass a larger timeout, e.g. `timeout=60` |
| `ParseException` | HTML format changed | Update the library to the latest version |

**Tips:**
- Start with DuckDuckGo or Brave (no proxy needed)
- Use caching to reduce the number of requests
- Add a delay between searches to avoid rate limits
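
The last tip, spacing out searches, can be sketched as a small throttle that enforces a minimum interval between calls. `Throttle` is a hypothetical helper, not part of the library. The clock and sleep functions are injectable so that the behaviour can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Block in wait() until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Short real-time demonstration: three calls spaced at least 0.05 s apart.
throttle = Throttle(min_interval=0.05)
for query in ["first", "second", "third"]:
    throttle.wait()
    print(f"searching for {query!r}")  # a real search call would go here
```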

## Contributing

Contributions are very welcome! Please read the [Contributing Guide](CONTRIBUTING.md) for details on submitting pull requests, reporting issues, and contributing to the project.

## Security

For security issues, please read the [Security Policy](SECURITY.md).

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for the list of changes.

## Acknowledgements

- Thanks to all contributors
- Inspired by the need for a uniform search interface
- Built with Python and love
