Metadata-Version: 2.4
Name: unsearch
Version: 0.2.1
Summary: UnSearch Python SDK - Open-source Tavily alternative for AI search
Author-email: UnSearch <hello@unsearch.dev>
License: Apache-2.0
Project-URL: Homepage, https://unsearch.dev
Project-URL: Documentation, https://docs.unsearch.dev
Project-URL: Repository, https://github.com/rakesh1002/unsearch-python
Project-URL: Changelog, https://github.com/rakesh1002/unsearch-python/blob/main/CHANGELOG.md
Keywords: ai,search,rag,langchain,tavily,llm,agents
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.24.0
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.1.0; extra == "langchain"
Provides-Extra: all
Requires-Dist: langchain-core>=0.1.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"

# UnSearch Python SDK

Python client for the UnSearch AI search API, covering search, extraction, research, RAG, topic monitoring, and fact verification. A drop-in Tavily replacement with self-hosting and zero-retention options.

## Features

- **Tavily-compatible** - Drop-in replacement for Tavily API
- **Exa-compatible** - Neural/semantic search endpoints
- **RAG-optimized** - Built for retrieval augmented generation
- **70+ search engines** - Aggregated results from multiple sources
- **Topic monitoring** - Real-time web monitoring
- **Fact verification** - AI-powered fact-checking
- **Zero-retention** - Privacy-first architecture

## Installation

```bash
pip install unsearch

# With LangChain support (quoted so shells such as zsh do not expand the brackets)
pip install "unsearch[langchain]"
```

## Quick Start

```python
from unsearch import UnSearchClient

client = UnSearchClient(api_key="your-api-key")

# Basic search
response = client.search("What is machine learning?")
for result in response.results:
    print(f"{result.title}: {result.url}")

# Search with AI answer
response = client.search(
    "What is RAG?",
    include_answer=True,
    max_results=5
)
print(response.answer)

# Q&A shortcut
answer = client.qna_search("What is the capital of France?")
print(answer)
```

## Migrate from Tavily

UnSearch is designed as a drop-in replacement for Tavily, so migration comes down to changing the import and the API key:

### Option 1: Direct replacement

```python
# Before (Tavily)
from tavily import TavilyClient
client = TavilyClient(api_key="tvly-...")

# After (UnSearch)
from unsearch import UnSearchClient
client = UnSearchClient(api_key="uns-...")
```

### Option 2: Alias (minimal code changes)

```python
# Add this import alias
from unsearch import UnSearchClient as TavilyClient

# Your existing code works unchanged
client = TavilyClient(api_key="uns-...")
response = client.search("query")
```

## LangChain Integration

```python
from unsearch.langchain import UnSearchResults

# Create tool
tool = UnSearchResults(
    api_key="your-api-key",
    max_results=5,
    include_answer=True
)

# Use in agent
results = tool.invoke("What is LangChain?")
```

### Migrate from TavilySearchResults

```python
# Before
from langchain_community.tools import TavilySearchResults
tool = TavilySearchResults(api_key="tvly-...")

# After (option 1)
from unsearch.langchain import UnSearchResults
tool = UnSearchResults(api_key="uns-...")

# After (option 2: alias)
from unsearch.langchain import UnSearchResults as TavilySearchResults
tool = TavilySearchResults(api_key="uns-...")
```

## Self-Hosted Instance

```python
from unsearch import UnSearchClient

# Point to your self-hosted UnSearch
client = UnSearchClient(
    api_key="your-api-key",
    base_url="https://your-unsearch-instance.com"
)

# Or use environment variable
# export UNSEARCH_BASE_URL=https://your-unsearch-instance.com
```
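The environment-variable route can also be made explicit in application code. The sketch below is illustrative only: `resolve_client_kwargs` is a hypothetical helper, not part of the SDK, and `UNSEARCH_API_KEY` is an assumed variable name (only `UNSEARCH_BASE_URL` appears above). It builds keyword arguments for `UnSearchClient` and omits `base_url` when none is configured, so the SDK default stays in place.

```python
import os
from typing import Mapping, Optional


def resolve_client_kwargs(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Build keyword arguments for UnSearchClient (hypothetical helper)."""
    env = os.environ if env is None else env

    # Explicit arguments take precedence over environment variables.
    # UNSEARCH_API_KEY is an assumed name, not documented in this README.
    key = api_key or env.get("UNSEARCH_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set UNSEARCH_API_KEY")

    kwargs = {"api_key": key}
    url = base_url or env.get("UNSEARCH_BASE_URL")
    if url:
        # Drop a trailing slash so the base URL has one canonical form.
        kwargs["base_url"] = url.rstrip("/")
    return kwargs
```

With the package installed, the result could be passed along as `UnSearchClient(**resolve_client_kwargs())`.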

## Zero-Retention Mode

For privacy-sensitive applications:

```python
from unsearch import UnSearchClient

# Enable zero-retention (no data stored on server)
client = UnSearchClient(
    api_key="your-api-key",
    zero_retention=True
)

# Or per-request via header
# X-Zero-Retention: true
```
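For callers that bypass the SDK, the per-request header can be attached to a raw HTTP request. The sketch below only constructs the request and does not send it. The `/search` path, the JSON body shape, and the `build_search_request` helper are assumptions for illustration; only the `X-Zero-Retention: true` header comes from the text above, and authentication is left out because its scheme is not specified here.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, query: str, zero_retention: bool = True
) -> urllib.request.Request:
    """Build (but do not send) a search request. Hypothetical helper."""
    headers = {"Content-Type": "application/json"}
    if zero_retention:
        # Header name and value as given in this README.
        headers["X-Zero-Retention"] = "true"

    # The body shape is an assumption, not a documented schema.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",  # hypothetical endpoint path
        data=body,
        headers=headers,
        method="POST",
    )


req = build_search_request("https://your-unsearch-instance.com", "What is RAG?")
print(req.header_items())
```

Setting `zero_retention=True` on the client, as shown earlier, remains the simpler route when the SDK is in use.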

## Async Support

```python
import asyncio

from unsearch import AsyncUnSearchClient

async def search():
    async with AsyncUnSearchClient(api_key="your-key") as client:
        response = await client.search("async search query")
        return response.results

# Run the coroutine from synchronous code
results = asyncio.run(search())
```
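An async client pays off most when several queries run concurrently. The following is a minimal sketch using `asyncio.gather`: `search_many` is a hypothetical helper that accepts any object with an awaitable `search(query)` method, and `StubClient` is a stand-in so the snippet runs without network access or an API key. In real use, an `AsyncUnSearchClient` instance would take its place.

```python
import asyncio
from typing import Any, List, Sequence


async def search_many(client: Any, queries: Sequence[str], limit: int = 5) -> List[Any]:
    """Run searches concurrently, at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def one(query: str) -> Any:
        async with semaphore:
            return await client.search(query)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(one(q) for q in queries))


class StubClient:
    """Stand-in for AsyncUnSearchClient so the sketch runs offline."""

    async def search(self, query: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"results for {query}"


if __name__ == "__main__":
    print(asyncio.run(search_many(StubClient(), ["rag", "llm agents"])))
```

The semaphore caps concurrency so a long query list does not open an unbounded number of simultaneous requests.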

## API Reference

### Agent API (Tavily-compatible)

| Method | Description |
|--------|-------------|
| `search(query, **options)` | AI-optimized web search |
| `extract(urls, **options)` | Extract content from URLs |
| `research(query, **options)` | Multi-step deep research |
| `qna_search(query)` | Quick Q&A shortcut |
| `get_search_context(query, max_results)` | Get RAG context |
| `list_models()` | List available AI models |

### Core Search API

| Method | Description |
|--------|-------------|
| `core_search(query, **options)` | Full-featured search |
| `batch_search(queries, **options)` | Search multiple queries |
| `list_engines()` | List search engines |

### RAG API

| Method | Description |
|--------|-------------|
| `rag_research(topic, **options)` | Deep RAG research |
| `rag_search(query, **options)` | Quick RAG search |
| `semantic_search(corpus_id, query)` | Semantic search over corpus |
| `image_search(query)` | Image search |
| `list_corpora()` | List research corpora |

### Neural API (Exa-compatible)

| Method | Description |
|--------|-------------|
| `neural_search(query, **options)` | Neural/semantic search |
| `find_similar(url=None, text=None)` | Find similar content |
| `extract_highlights(query, content)` | Extract key highlights |
| `predictive_search(context)` | Predict next search |

### Topic Monitoring API

| Method | Description |
|--------|-------------|
| `create_monitor(topic, **options)` | Create topic monitor |
| `list_monitors()` | List all monitors |
| `get_monitor(monitor_id)` | Get a monitor |
| `pause_monitor(monitor_id)` | Pause a monitor |
| `resume_monitor(monitor_id)` | Resume a monitor |
| `delete_monitor(monitor_id)` | Delete a monitor |
| `get_monitor_results(monitor_id)` | Get monitor results |

### Enhanced API (Advanced scraping)

| Method | Description |
|--------|-------------|
| `enhanced_search(query, **options)` | Advanced search with extraction |
| `enhanced_scrape(urls, **options)` | Direct URL scraping |
| `enhanced_features()` | List enhanced features |
| `extract_tables(html_content)` | Extract tables from HTML |
| `chunk_content(text, strategy)` | Chunk text for RAG |
| `discover_urls(base_url, source)` | Discover URLs from sitemaps |

### Knowledge Graph API

| Method | Description |
|--------|-------------|
| `knowledge_extract(text)` | Extract entities & relationships |
| `knowledge_search(query)` | Search knowledge graph |
| `knowledge_people(query)` | Search for people |
| `knowledge_get_entity(entity_id)` | Get specific entity |
| `knowledge_graph()` | Get graph structure |

### Agent Registration API

| Method | Description |
|--------|-------------|
| `register_agent(name)` | Register new AI agent (sandbox) |
| `agent_status()` | Get agent status |
| `resend_claim()` | Resend claim link |

### Verification API (Fact-checking)

| Method | Description |
|--------|-------------|
| `verify_claim(claim, **options)` | Verify a claim |
| `check_source_credibility(url)` | Check source credibility |
| `batch_verify(claims)` | Batch verify claims |

## Extended Examples

The examples below assume a `client` created as shown in Quick Start.

### Deep Research

```python
# Multi-step research with AI synthesis
research = client.research(
    query="Impact of AI on healthcare",
    depth="deep",  # quick, standard, deep, comprehensive
    max_sources=20,
    include_analysis=True,
    focus_areas=["diagnostics", "drug discovery"]
)

print(research.executive_summary)
print(research.key_findings)
```

### Neural Search (Exa-compatible)

```python
# Semantic search with auto-prompting
results = client.neural_search(
    query="innovations in renewable energy",
    num_results=10,
    use_autoprompt=True,  # AI expands query
    include_highlights=True,
    category="tech"
)

print("Expanded queries:", results.expanded_queries)
for r in results.results:
    print(r.title, r.highlights)
```

### Topic Monitoring

```python
# Create a real-time monitor
monitor = client.create_monitor(
    topic="artificial intelligence regulations",
    keywords=["AI", "regulation", "EU"],
    check_interval_minutes=60,
    webhook_url="https://your-app.com/webhook",
    deep_analysis=True
)

# Get results
results = client.get_monitor_results(monitor["id"], limit=50)
```

### Fact Verification

```python
# Verify a claim
verification = client.verify_claim(
    claim="The Earth is approximately 4.5 billion years old",
    depth="thorough"
)

print(f"Verdict: {verification.verdict}")
print(f"Confidence: {verification.confidence}%")
print(f"Summary: {verification.summary}")

# Check source credibility
credibility = client.check_source_credibility("https://example-news.com")
print(f"Score: {credibility.credibility_score}")
print(f"Bias: {credibility.bias_rating}")
```

## Why UnSearch over Tavily?

| Feature | Tavily | UnSearch |
|---------|--------|----------|
| Open Source | ❌ | ✅ |
| Self-Hostable | ❌ | ✅ |
| Zero Retention | ❌ | ✅ |
| Cost at Scale | ~$0.0075/query | $0.0003/query |
| Free Tier | 1,000/mo | 5,000/mo |
| Search Engines | Single | 70+ |
| Neural Search | ❌ | ✅ |
| Topic Monitoring | ❌ | ✅ |
| Fact Verification | ❌ | ✅ |

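To put the per-query rates in the table into monthly terms, a back-of-the-envelope calculation helps. The sketch below multiplies query volume by the listed rate. It assumes flat per-query pricing and ignores free tiers, volume discounts, and self-hosting infrastructure costs, none of which are specified here. `monthly_cost` is an illustrative helper, not part of the SDK.

```python
TAVILY_RATE = 0.0075    # USD per query, approximate, from the table above
UNSEARCH_RATE = 0.0003  # USD per query, from the table above


def monthly_cost(queries: int, rate: float) -> float:
    """Flat-rate estimate in USD, rounded to cents."""
    return round(queries * rate, 2)


queries = 1_000_000
print(f"Tavily:   ${monthly_cost(queries, TAVILY_RATE):,.2f}")
print(f"UnSearch: ${monthly_cost(queries, UNSEARCH_RATE):,.2f}")
```
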
## License

Apache 2.0
