Metadata-Version: 2.3
Name: komodo
Version: 3.0.4
Summary: Komodo Health's Python SDK
Requires-Dist: auth0-python>=4.7.1
Requires-Dist: pydantic-settings>=2.7.1
Requires-Dist: pyperclip>=1.9.0
Requires-Dist: questionary>=2.1.0
Requires-Dist: rich>=14.0.0
Requires-Dist: click>=8.2.1
Requires-Dist: typer>=0.16.1
Requires-Dist: httpx==0.28.1
Requires-Dist: httpx-retries>=0.4.0
Requires-Dist: pandas>=2.2.2
Requires-Dist: pyarrow<19.0.0
Requires-Dist: pydantic>=2
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: snowflake-connector-python>=3.15.0,<3.17
Requires-Dist: typing-extensions>=4.7.1
Requires-Dist: urllib3>=2.2.2
Requires-Dist: sqlalchemy>=2.0.44
Requires-Python: >=3.11, <3.14
Description-Content-Type: text/markdown

# Komodo Connector SDK

The Komodo Connector SDK is a Python library that provides programmatic access to the Komodo Health platform. It includes both a CLI for authentication management and a Python API for interacting with Komodo services and executing Snowflake queries through the Komodo platform.

## Features

- **OAuth 2.0 Device Authorization Flow**: Browser-based authentication for easy credential management
- **CLI Tools**: Simple commands for login, JWT management, and account selection
- **Snowflake Integration**: Execute SQL queries against Komodo's Snowflake data warehouse via proxy
- **Synchronous and Asynchronous Query Execution**: Support for both blocking and async query patterns
- **Automatic Token Refresh**: Credentials are automatically refreshed when they expire
- **Multi-Environment Support**: Switch between integration and production environments

## Installation

This library is designed to be used within the connector directory using `uv` for dependency management.

### Prerequisites

- Python 3.11, 3.12, or 3.13 (the package metadata requires `>=3.11, <3.14`)
- `uv` package manager installed

### Setup

From within the `sdks/python/connector` directory:

```bash
# Install dependencies
uv sync

# Verify installation
uv run komodo --help
```

## Authentication

The SDK supports multiple authentication methods, with the CLI-based OAuth flow being the recommended approach for interactive use.

### CLI Authentication (Recommended)

The easiest way to get started is to use the CLI commands for authentication:

#### 1. Login

The `login` command initiates an OAuth 2.0 Device Authorization flow that opens your browser for authentication:

```bash
uv run komodo login
```

**What happens:**
1. The CLI will display a URL and automatically open it in your browser
2. Authenticate in the browser using your Komodo credentials
3. Once authenticated, your credentials (JWT access token, refresh token, and expiration) are securely saved to `~/.komodo/credentials`

**Note:** The environment defaults to production. To use the integration environment, pass `--environment` (or its short form, `-E`):
```bash
uv run komodo login --environment integration
uv run komodo login -E integration
```

**Credentials File Structure:**

The credentials are stored in INI format at `~/.komodo/credentials`. The CLI writes to the `[default]` profile:

```ini
[default]
token = eyJ0eXAiOiJKV1QiLCJhbGc...
token_expiration = 1234567890
account_id = 123e4567-e89b-12d3-a456-426614174000
account_slug = my-organization
```

Account information is added after running `komodo account set`.

**Named Profiles:**

You can define multiple credential profiles in the credentials file to support multiple service principals. Named profiles only support service principal credentials (client_id/client_secret), not JWT tokens:

```ini
[default]
token = token_1234
token_expiration = 1234567890
account_id = 123456

[profile-1]
client_id = client_id_123456
client_secret = client_secret_123456
account_id = 123456789

[profile-2]
client_id = client_id_234567
client_secret = client_secret_234567
account_id = 234567898
```

Both kinds of profile use lowercase keys. The `[default]` profile holds JWT credentials (`token`, `token_expiration`, `account_id`, `account_slug`) and is used automatically when no profile is specified. Named profiles hold service principal credentials only (`client_id`, `client_secret`, `account_id`, `account_slug`).
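
To make the distinction between the two profile types concrete, the sketch below classifies each section of a credentials file by the keys it contains. It is an illustration only: `classify_profiles` is a hypothetical helper, and the labels it returns are not SDK terminology.

```python
import configparser


def classify_profiles(text: str) -> dict[str, str]:
    """Map each profile name to the kind of credentials it holds."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    kinds: dict[str, str] = {}
    for name in parser.sections():
        keys = set(parser[name])
        if {"client_id", "client_secret"} <= keys:
            kinds[name] = "service_principal"
        elif "token" in keys:
            kinds[name] = "jwt"
        else:
            kinds[name] = "unknown"
    return kinds


EXAMPLE = """
[default]
token = token_1234
token_expiration = 1234567890
account_id = 123456

[profile-1]
client_id = client_id_123456
client_secret = client_secret_123456
account_id = 123456789
"""

print(classify_profiles(EXAMPLE))
```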

#### 2. Set Account

After logging in, you must select which Komodo account to use:

```bash
uv run komodo account set
```

**What happens:**
1. The CLI fetches all accounts you have access to from the Komodo platform
2. You'll see an interactive list of accounts with their slugs and IDs
3. Select your desired account
4. The account ID and slug are saved to your credentials file

**Note:** Make sure you're connected to Twingate (VPN) when running this command, as it needs to reach internal Komodo APIs.

You can specify the environment:

```bash
uv run komodo account set --environment integration
uv run komodo account set -E production
```

#### 3. Get Current Account

To view and copy your currently set account ID:

```bash
uv run komodo account get
```

This displays your account information and copies the account ID to your clipboard.

#### 4. Get JWT Token

To retrieve your JWT token (useful for debugging or manual API calls):

```bash
uv run komodo jwt
```

This prints your access token and copies it to your clipboard. You can specify the environment:

```bash
uv run komodo jwt --environment integration
uv run komodo jwt -E production
```

### Environment Configuration

The SDK uses the `KOMODO_ENVIRONMENT` environment variable to determine which environment to use. You can set this in your shell:

```bash
# Set environment for your session
export KOMODO_ENVIRONMENT=integration  # or "production"

# Now all SDK operations will use integration
uv run komodo jwt
uv run python your_script.py
```

**Default Behavior:**
- If `KOMODO_ENVIRONMENT` is not set, the SDK defaults to **production**
- The `--environment` flag on a CLI command overrides the environment variable
- Available environments: `integration`, `production`

### Authentication Methods in Code

When using the SDK in your Python code, you have several options for authentication:

#### Option 1: Use Stored Credentials (Recommended)

After running `komodo login` and `komodo account set`, simply create a client:

```python
from komodo.client import Client

# Uses credentials from ~/.komodo/credentials
# Environment determined by KOMODO_ENVIRONMENT env var (defaults to production)
client = Client()
```

#### Option 2: Pass JWT and Account ID Explicitly

```python
from komodo.client import Client
from komodo.auth.Session import Session

jwt_token = "eyJ0eXAiOiJKV1QiLCJhbGc..."
account_id = "123e4567-e89b-12d3-a456-426614174000"

# Create session with explicit token
auth_session = Session(access_token=jwt_token)
client = Client(account_id=account_id, auth_session=auth_session)
```

#### Option 3: Use Environment-Specific Session

```python
from komodo.client import Client
from komodo.auth.Session import Session

# Create session for specific environment
auth_session = Session(environment="integration")
client = Client(auth_session=auth_session)
```

#### Option 4: Service Principal (Machine-to-Machine)

For automated processes, you can use service principal credentials:

```python
from komodo.auth.Session import Session
from komodo.client import Client

# Option A: From credentials file
# Add to ~/.komodo/credentials:
# [service-principal]
# komodo_client_id = your_client_id
# komodo_client_secret = your_client_secret

auth_session = Session(environment="integration")
client = Client(auth_session=auth_session)

# Option B: Pass credentials explicitly
auth_session = Session(
    client_id="client_id",
    client_secret="client_secret",
    environment="integration"
)
client = Client(auth_session=auth_session)
```

## Snowflake Integration

The SDK provides seamless integration with Snowflake through the `get_snowflake_connection()` function, which returns a DB-API 2.0 compliant connection object.

### Basic Usage

```python
from komodo.extensions.snowflake import get_snowflake_connection

# Get connection using stored credentials
conn = get_snowflake_connection()

# Get a cursor
cursor = conn.cursor()

# View the available databases
cursor.execute("SHOW DATABASES")
databases = cursor.fetchall()
print(databases)

# View the available schemas
cursor.execute("SHOW SCHEMAS")
schemas = cursor.fetchall()
print(schemas)

# Choose the database to use
cursor.execute("USE DATABASE DATA")

# Execute a query
query = "SELECT column_name, table_name FROM INFORMATION_SCHEMA.COLUMNS LIMIT 10;"
cursor.execute(query)

# Fetch results
rows = cursor.fetchall()
for row in rows:
    print(row)

# Close cursor and connection
cursor.close()
conn.close()
```
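
In the example above, the cursor and connection stay open if a query raises before the `close()` calls run. One way to guarantee cleanup is `contextlib.closing`, which works with any object that has a `close()` method. The sketch below uses an in-memory `sqlite3` database as a stand-in for the Komodo connection so that it runs without credentials; `fetch_all` is a hypothetical helper.

```python
import sqlite3
from contextlib import closing


def fetch_all(connect, query):
    """Run one query, closing the cursor and connection even on error."""
    with closing(connect()) as conn, closing(conn.cursor()) as cursor:
        cursor.execute(query)
        return cursor.fetchall()


# Stand-in connection factory; any DB-API 2.0 connect callable fits here.
rows = fetch_all(lambda: sqlite3.connect(":memory:"), "SELECT 1 AS answer")
print(rows)  # [(1,)]
```

With the SDK, the same helper would presumably be called as `fetch_all(get_snowflake_connection, "SHOW DATABASES")`.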

### Using with Pandas

```python
from komodo.extensions.snowflake import get_snowflake_connection
import pandas as pd

conn = get_snowflake_connection()

# A cursor is needed for the SHOW and USE statements below
cursor = conn.cursor()

# View the available databases
cursor.execute("SHOW DATABASES")
databases = cursor.fetchall()
print(databases)

# View the available schemas
cursor.execute("SHOW SCHEMAS")
schemas = cursor.fetchall()
print(schemas)

# Choose the database to use
cursor.execute("USE DATABASE DATA")

# Use pandas to read query results directly
query = "SELECT * FROM INFORMATION_SCHEMA.TABLES LIMIT 100;"
df = pd.read_sql(query, conn)

print(df.head())
conn.close()
```

### Connection Configuration Options

The `get_snowflake_connection()` function accepts optional parameters:

```python
from komodo.extensions.snowflake import get_snowflake_connection

# Option 1: Use stored credentials from [default] profile (after komodo login and account set)
conn = get_snowflake_connection()

# Option 2: Use a named profile from credentials file
conn = get_snowflake_connection(profile="profile-1")

# Option 3: Provide JWT and account_id explicitly
conn = get_snowflake_connection(
    jwt="eyJ0eXAiOiJKV1QiLCJhbGc...",
    account_id="123e4567-e89b-12d3-a456-426614174000"
)

# Option 4: Provide service principal credentials
conn = get_snowflake_connection(
    client_id="client_id",
    client_secret="client_secret",
    account_id="123e4567-e89b-12d3-a456-426614174000"
)
```

**Default Behavior:**
- Query tags are automatically added for tracking and debugging
- All queries are proxied through Komodo's infrastructure

### Synchronous Query Execution

Standard cursor operations follow the DB-API 2.0 specification:

```python
from komodo.extensions.snowflake import get_snowflake_connection

conn = get_snowflake_connection()
cursor = conn.cursor()

# Execute with parameters (parameterized queries)
query = "SELECT * FROM my_table WHERE date >= ? AND category = ?"
cursor.execute(query, ('2024-01-01', 'health'))

# Fetch single row
row = cursor.fetchone()
print(row)

# Fetch multiple rows
cursor.execute("SELECT * FROM my_table LIMIT 100")
rows = cursor.fetchmany(size=10)  # Fetch 10 rows
print(f"Fetched {len(rows)} rows")

# Fetch all remaining rows
all_rows = cursor.fetchall()

# Get column descriptions
print(cursor.description)

# Get row count (for DML operations)
cursor.execute("UPDATE my_table SET status = 'processed' WHERE id = 123")
print(f"Rows affected: {cursor.rowcount}")

cursor.close()
conn.close()
```
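
When a result set is too large to hold in memory, `fetchmany()` can be called repeatedly until it returns an empty list. The sketch below wraps that loop in a generator. It relies only on the DB-API 2.0 `fetchmany()` behavior and uses an in-memory `sqlite3` table as a stand-in for a Komodo cursor; `iter_batches` is a hypothetical helper.

```python
import sqlite3
from collections.abc import Iterator


def iter_batches(cursor, size: int = 1000) -> Iterator[list]:
    """Yield lists of at most `size` rows until the result set is exhausted."""
    while True:
        batch = cursor.fetchmany(size)
        if not batch:
            return
        yield batch


# Stand-in data: 25 rows in an in-memory table.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE my_table (id INTEGER)")
cursor.executemany("INSERT INTO my_table VALUES (?)", [(i,) for i in range(25)])
cursor.execute("SELECT id FROM my_table ORDER BY id")

sizes = [len(batch) for batch in iter_batches(cursor, size=10)]
print(sizes)  # [10, 10, 5]

cursor.close()
conn.close()
```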

### Asynchronous Query Execution

For long-running queries, use the `execute_query_async()` function to avoid blocking:

```python
from komodo.extensions.snowflake import get_snowflake_connection, execute_query_async
import asyncio

async def run_async_query():
    # Get connection (sync operation)
    conn = get_snowflake_connection()
    cursor = conn.cursor()
    
    # Select the database
    cursor.execute("USE DATABASE DATA")
    
    # Execute a long-running query asynchronously
    query = """
        SELECT 
            date,
            COUNT(*) as record_count,
            SUM(value) as total_value
        FROM large_table
        GROUP BY date
        ORDER BY date
    """
    
    # Submits the query, then polls until it completes before returning the cursor
    cursor = await execute_query_async(cursor=cursor, query=query)
    
    # Fetch results after query completes
    rows = cursor.fetchall()
    print(f"Query returned {len(rows)} rows")
    
    cursor.close()
    conn.close()

# Run the async query
asyncio.run(run_async_query())
```

**Async Query with Parameters:**

```python
from komodo.extensions.snowflake import get_snowflake_connection, execute_query_async
import asyncio

async def run_parameterized_async_query():
    conn = get_snowflake_connection()
    cursor = conn.cursor()
    
    cursor.execute("USE DATABASE DATA")
    
    query = "SELECT * FROM my_table WHERE date >= ? AND status = ?"
    params = ['2024-01-01', 'active']
    
    # Execute with parameters
    cursor = await execute_query_async(
        cursor=cursor,
        query=query,
        query_params=params
    )
    
    rows = cursor.fetchall()
    print(f"Found {len(rows)} matching rows")
    
    cursor.close()
    conn.close()

asyncio.run(run_parameterized_async_query())
```

**Important Notes on Async Execution:**

1. **Not True Async I/O**: As noted in the function's docstring, the underlying I/O is still blocking, so this is not truly asynchronous. In FastAPI applications, queries will be automatically run in a thread pool. Outside of FastAPI, consider running the blocking `cursor.execute()` call in a worker thread with `asyncio.to_thread()`:

```python
import asyncio
from komodo.extensions.snowflake import get_snowflake_connection

async def run_in_thread():
    conn = get_snowflake_connection()
    cursor = conn.cursor()
    cursor.execute("USE DATABASE DATA")
    
    query = "SELECT * FROM large_table"
    
    # asyncio.to_thread() expects a regular (blocking) callable, so pass the
    # synchronous cursor.execute rather than the execute_query_async coroutine.
    # The event loop stays free while the worker thread waits on the query.
    await asyncio.to_thread(cursor.execute, query)
    
    rows = cursor.fetchall()
    cursor.close()
    conn.close()
    return rows
```

2. **Polling Mechanism**: The function polls Snowflake every second to check if the query has completed. This is useful for queries that take several seconds or minutes.

3. **Use Case**: Best for:
   - Long-running analytical queries
   - Queries that process large datasets
   - Situations where you want to avoid blocking the main thread

4. **Synchronous Alternative**: For short queries (< 1 second), regular `cursor.execute()` is more efficient.


## Testing

Create the .env file:
```shell
cd sdks/python/connector
just setup-env
```

Run tests:

```bash
just test

# Run only integration tests (requires valid .env file and VPN connection)
just test-integration

# Run only unit tests (do not need .env file or VPN connection)
just test-unit

# Run specific test file
uv run pytest test/test_integration_snowflake.py

# Run specific test
uv run pytest test/ -k test_client_credentials_refresh_jwt
```

## Troubleshooting

### "No access token found" Error

**Solution:** Run the login flow:
```bash
uv run komodo login
uv run komodo account set
```

### "account_id must be provided" Error

**Solution:** Make sure you've set an account:
```bash
uv run komodo account set
```

### "nodename nor servname provided" Error

**Solution:** Connect to Twingate VPN. The Komodo platform APIs are only accessible through the VPN.

### Token Expiration

The SDK automatically refreshes expired tokens. If you encounter persistent authentication issues:
```bash
# Re-authenticate
uv run komodo login
```

### Environment Issues

Make sure you're using the correct environment:
```bash
# Check current environment
echo $KOMODO_ENVIRONMENT

# Set environment explicitly
export KOMODO_ENVIRONMENT=integration

# Or use CLI flags
uv run komodo jwt --environment production
```


## API Reference

### CLI Commands

- `komodo login [--environment ENV]` - Authenticate and save credentials
- `komodo jwt [--environment ENV]` - Display and copy JWT token
- `komodo account set [--environment ENV]` - Select active account
- `komodo account get [--environment ENV]` - Display current account
- `komodo account list [--environment ENV]` - List all accounts

### Key Classes and Functions

- `Client(account_id, auth_session)` - Main SDK client for API access
- `Session(access_token, client_id, client_secret, environment)` - Manages authentication state
- `get_snowflake_connection(profile, jwt, client_id, client_secret, account_id)` - Returns a DB-API 2.0 compliant Snowflake connection
- `execute_query_async(cursor, query, query_params)` - Async query execution

### Environment Variables

- `KOMODO_ENVIRONMENT` - Set to `integration` or `production` (default: `production`)