Metadata-Version: 2.4
Name: opendatalabs-query-sdk
Version: 0.1.1
Summary: A Python SDK for interacting with the Query API service
Project-URL: Homepage, https://github.com/vana-com/query-sdk-python
Project-URL: Documentation, https://github.com/vana-com/query-sdk-python#readme
Project-URL: Issues, https://github.com/vana-com/query-sdk-python/issues
Author-email: OpenDataLabs <alex@opendatalabs.xyz>
License-Expression: MIT
Keywords: api,query,sdk
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.7
Requires-Dist: requests>=2.28.0
Requires-Dist: typing-extensions>=4.0.0; python_version < '3.8'
Provides-Extra: dev
Requires-Dist: black>=22.0.0; extra == 'dev'
Requires-Dist: build>=0.10.0; extra == 'dev'
Requires-Dist: isort>=5.0.0; extra == 'dev'
Requires-Dist: mypy>=0.990; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.0.290; extra == 'dev'
Requires-Dist: twine>=4.0.2; extra == 'dev'
Description-Content-Type: text/markdown

# Query API SDK

A Python SDK for Vana's Query API service. It lets you query and transform your AI-generated data, including posts, tweets, and other content produced by your AI models.

## Features

- 🐍 Full Python type hints with importable type definitions
- 🔒 Built-in API-key authentication
- 🔄 Asynchronous query workflow (submit a query, then poll or wait for results)
- 📊 Data transformation support
- 🔔 Webhook integration
- ⏱️ Polling utilities for long-running queries

## Installation

```bash
pip install opendatalabs-query-sdk
# or
poetry add opendatalabs-query-sdk
```

## Quick Start

```python
from query_api_sdk import create_client, QueryClientConfig

client = create_client(QueryClientConfig(
    api_key='your-api-key',
    base_url='https://api.vana.ai/query'
))

# Submit a query and wait for its results
def get_my_posts():
    try:
        query_id = client.submit_query({
            "query": "SELECT * FROM reddit_posts WHERE file_owner = 'my_user_id'"
        })

        results = client.wait_for_results(query_id)
        print(results)
    except Exception as error:
        print(f"Error: {error}")


get_my_posts()
```

## Available Data Schemas

The Query API provides access to various Vana-generated content types:

### Reddit Posts
```sql
reddit_posts {
  file_owner: string  -- User ID of the post owner
  post_id: integer    -- Unique identifier for the post
  title: string       -- Post title
  content: string     -- Post content
}
```

### Twitter Tweets
```sql
twitter_tweets {
  file_owner: string  -- User ID of the tweet owner
  tweet_id: integer   -- Unique identifier for the tweet
  text: string        -- Tweet content
}
```
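
For type-checked code, the two schemas above can be mirrored as `TypedDict` definitions. This is only a sketch: `RedditPostRow`, `TwitterTweetRow`, and `titles` are hypothetical names that are not part of the SDK, and it assumes result rows arrive as plain dictionaries keyed by column name.

```python
# TypedDict lives in typing on Python 3.8+; on 3.7 import it from typing_extensions.
from typing import List, TypedDict


class RedditPostRow(TypedDict):
    """Hypothetical row shape mirroring the reddit_posts schema."""
    file_owner: str
    post_id: int
    title: str
    content: str


class TwitterTweetRow(TypedDict):
    """Hypothetical row shape mirroring the twitter_tweets schema."""
    file_owner: str
    tweet_id: int
    text: str


def titles(rows: List[RedditPostRow]) -> List[str]:
    """Return the title of every post; a type checker validates the key name."""
    return [row["title"] for row in rows]


sample: List[RedditPostRow] = [
    {"file_owner": "my_user_id", "post_id": 1, "title": "Hello", "content": "First post"},
]
print(titles(sample))
```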

## Detailed Usage

### Configuration

```python
from query_api_sdk import create_client, QueryClientConfig

client = create_client(QueryClientConfig(
    api_key='your-api-key',
    base_url='https://api.vana.ai/query',
    timeout=30000  # Optional: default is 10000ms
))
```

### Getting Available Schemas

```python
schemas = client.get_schemas()
print(schemas)
```

### Submitting Queries

Basic query:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM reddit_posts LIMIT 10"
})
```

With data transformation:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM reddit_posts",
    "transform": """
    def transform(rows):
        return [{**row, "word_count": len(row["content"].split())} for row in rows]
    """
})
```
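
Before submitting a transform, it can help to run it locally against a few sample rows. The sketch below assumes the service calls a function named `transform` with a list of row dictionaries, as the example above suggests; how the service actually executes the string is not documented here. `run_transform_locally` is a hypothetical helper, and `textwrap.dedent` strips the leading indentation so the source compiles as top-level code.

```python
import textwrap

# Same transform as above, dedented so that `def` starts at column 0.
transform_source = textwrap.dedent("""
    def transform(rows):
        return [{**row, "word_count": len(row["content"].split())} for row in rows]
""")


def run_transform_locally(source, rows):
    """Execute the transform source in a scratch namespace and apply it to rows."""
    namespace = {}
    exec(source, namespace)  # only run transform code that you wrote yourself
    return namespace["transform"](rows)


sample_rows = [{"post_id": 1, "content": "hello query api"}]
print(run_transform_locally(transform_source, sample_rows))
```

The same `transform_source` string can then be passed as the `"transform"` value in `submit_query`.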

With webhook notification:
```python
query_id = client.submit_query({
    "query": "SELECT * FROM twitter_tweets",
    "webhook_url": "https://your-server.com/webhook"
})
```

### Checking Query Status

```python
status = client.get_query_status(query_id)
print(status["status"])  # 'queued' | 'processing' | 'completed' | 'failed'
```
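
If you need custom behavior between status checks, a manual polling loop is one option. The following is a minimal sketch, not the SDK's own `wait_for_results` implementation: `poll_until_done` is a hypothetical helper, and it assumes only what is shown above, namely that the status call returns a mapping with a `"status"` key and that `'completed'` and `'failed'` are the terminal states.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_done(get_status, query_id, timeout_s=300.0, poll_interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status(query_id) until a terminal status or the timeout is reached."""
    deadline = clock() + timeout_s
    while True:
        status = get_status(query_id)["status"]
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"query {query_id} still {status!r} after {timeout_s}s")
        sleep(poll_interval_s)


# Demo with a stand-in status function instead of a real client.
statuses = iter(["queued", "processing", "completed"])
final = poll_until_done(lambda _id: {"status": next(statuses)}, "demo-id",
                        sleep=lambda _seconds: None)
print(final)
```

With a real client, the call would look like `poll_until_done(client.get_query_status, query_id)`.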

### Getting Results

With pagination:
```python
results = client.get_query_results(
    query_id,
    limit=100,
    cursor="200"
)
```
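
To walk through every page, a cursor loop along the following lines may work. This is a sketch under stated assumptions: the shape of the results object, including how the next cursor is exposed, is not documented in this README, so the hypothetical `iter_all_rows` helper takes a `fetch_page` adapter that you write to call `client.get_query_results` and return `(rows, next_cursor)`. The demo uses an in-memory stand-in with offset-style cursors.

```python
def iter_all_rows(fetch_page, limit=100):
    """Yield rows page by page until fetch_page reports no further cursor.

    fetch_page(limit, cursor) must return (rows, next_cursor); next_cursor is
    None or empty when there are no more pages.
    """
    cursor = None
    while True:
        rows, next_cursor = fetch_page(limit, cursor)
        yield from rows
        if not next_cursor:
            break
        cursor = next_cursor


# Stand-in data source: 250 rows, cursor is the string offset of the next page.
data = list(range(250))


def fake_fetch_page(limit, cursor):
    start = int(cursor or 0)
    end = start + limit
    next_cursor = str(end) if end < len(data) else None
    return data[start:end], next_cursor


all_rows = list(iter_all_rows(fake_fetch_page, limit=100))
print(len(all_rows))
```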

Waiting for completion:
```python
results = client.wait_for_results(
    query_id,
    timeout=300000,      # Optional: max time to wait (default: 5 minutes)
    poll_interval=1000   # Optional: time between status checks (default: 1 second)
)
```

## Common Query Examples

### Getting Recent Posts

```python
query_id = client.submit_query({
    "query": """
        SELECT *
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
        ORDER BY post_id DESC
        LIMIT 10
    """
})
```

### Analyzing Content Length

```python
query_id = client.submit_query({
    "query": """
        SELECT *
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
    """,
    "transform": """
    def transform(rows):
        return [{
            **row,
            "content_length": len(row["content"]),
            "word_count": len(row["content"].split())
        } for row in rows]
    """
})
```

### Combining Data Sources

```python
query_id = client.submit_query({
    "query": """
        SELECT 
            'reddit' as source,
            title as content,
            post_id as id
        FROM reddit_posts
        WHERE file_owner = 'your_user_id'
        UNION ALL
        SELECT 
            'twitter' as source,
            text as content,
            tweet_id as id
        FROM twitter_tweets
        WHERE file_owner = 'your_user_id'
    """
})
```

## Error Handling

The SDK uses a custom `QueryAPIError` class for error handling:

```python
from query_api_sdk import QueryAPIError

try:
    results = client.get_query_results("invalid-id")
except QueryAPIError as error:
    print(f"API Error: {str(error)}")
    print(f"Status Code: {error.status_code}")
    print(f"Response: {error.response}")
```

## Webhook Integration

If you provide a webhook URL, your endpoint receives a POST request with a JSON body of the following shape:

```python
{
    "query_id": str,
    "status": str,  # 'completed' | 'failed'
    "error": Optional[str]
}
```

Example webhook handler (Flask):
```python
from flask import Flask, request

app = Flask(__name__)

@app.route('/webhook', methods=['POST'])
def webhook():
    data = request.json
    query_id = data["query_id"]
    status = data["status"]
    error = data.get("error")
    
    if status == "completed":
        # Handle completion
        pass
    elif status == "failed":
        # Handle failure
        pass
    
    return "", 200
```
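
Because a webhook endpoint is reachable from outside, it is worth validating the payload before acting on it. The sketch below checks only the three fields described above; `WebhookEvent` and `parse_webhook_payload` are hypothetical names, not part of the SDK, and any signature or authentication scheme the service may offer is outside its scope.

```python
from typing import Any, NamedTuple, Optional


class WebhookEvent(NamedTuple):
    query_id: str
    status: str
    error: Optional[str]


def parse_webhook_payload(data: Any) -> WebhookEvent:
    """Validate the decoded JSON body and return its fields as a WebhookEvent."""
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")
    query_id = data.get("query_id")
    status = data.get("status")
    if not isinstance(query_id, str):
        raise ValueError("payload is missing a string query_id")
    if status not in ("completed", "failed"):
        raise ValueError(f"unexpected status: {status!r}")
    error = data.get("error")
    return WebhookEvent(query_id, status, error if isinstance(error, str) else None)


event = parse_webhook_payload({"query_id": "abc123", "status": "failed", "error": "syntax error"})
print(event)
```

Inside the Flask handler, you could call `parse_webhook_payload(request.get_json(silent=True))` and return a 400 response when it raises `ValueError`.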

## Rate Limiting

The Query API enforces rate limits to ensure fair usage. When the API returns a rate-limit response, the SDK raises a `QueryAPIError` carrying the corresponding status code and message.
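
If you want to retry rate-limited calls, one common approach is exponential backoff. The sketch below is not part of the SDK: `call_with_backoff` is a hypothetical helper, and it assumes the rate-limit response uses HTTP status 429 (the standard "Too Many Requests" code), which this README does not confirm. The demo uses a stand-in exception rather than a real `QueryAPIError`.

```python
import time


def call_with_backoff(func, is_rate_limited, max_attempts=5, base_delay_s=1.0,
                      sleep=time.sleep):
    """Call func(), retrying with doubling delays while is_rate_limited(error) is true."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as error:
            # Re-raise anything that is not a rate limit, or the final failed attempt.
            if not is_rate_limited(error) or attempt == max_attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)


class FakeRateLimitError(Exception):
    """Stand-in for an error object that exposes a status_code attribute."""
    status_code = 429


attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeRateLimitError("slow down")
    return "ok"


delays = []
result = call_with_backoff(flaky, lambda e: getattr(e, "status_code", None) == 429,
                           sleep=delays.append)
print(result, delays)
```

With a real client, the predicate could be `lambda e: isinstance(e, QueryAPIError) and e.status_code == 429`, again assuming 429 is the code the service returns.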

## Type Hints Support

The SDK is fully type-hinted and exports its type definitions, which you can import directly:

```python
from query_api_sdk import (
    QueryStatusType,
    Schema,
    QueryRequest,
    QueryResults
)
```

## Development

For development, clone the repository and install dependencies:

```bash
git clone https://github.com/vana-com/query-sdk-python.git
cd query-sdk-python
pip install -e ".[dev]"
```

Run tests:
```bash
pytest
```

## License

MIT License - see [LICENSE](LICENSE) for details.