Metadata-Version: 2.4
Name: core-plainid
Version: 1.0.0
Summary: Core library for integrating PlainID authorization into AI applications
Author: PlainID
License-Expression: MIT
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: <=3.12,>=3.10
Requires-Dist: azure-health-deidentification<2.0.0,>=1.0.0
Requires-Dist: azure-identity<2.0.0,>=1.25.2
Requires-Dist: httpx<1.0.0,>=0.27.0
Requires-Dist: litellm<2.0.0,>=1.81.16
Requires-Dist: presidio-analyzer<3.0.0,>=2.2.361
Requires-Dist: presidio-anonymizer<3.0.0,>=2.2.361
Requires-Dist: pydantic<3.0.0,>=2.12.5
Requires-Dist: transformers<6.0.0,>=5.2.0
Description-Content-Type: text/markdown

# core-plainid

Core library for integrating [PlainID](https://www.plainid.com/) authorization into your AI applications. Provides text anonymization, prompt categorization, SQL query authorization, and low-level PlainID API clients.

All components fully support both **synchronous** and **asynchronous** execution. The examples below use the async API; for synchronous usage, drop the `await` keyword and call the sync counterpart of each method (e.g. `aanonymize` → `anonymize`, `acategorize` → `categorize`, `aget_permissions` → `get_permissions`).
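The async examples in this README use a bare `await`, which only works inside a coroutine (or an async-aware REPL or notebook). The following is a minimal sketch of how such a call can be driven from a plain script with `asyncio.run`. The function `fake_aanonymize` is a hypothetical stand-in so the snippet runs without PlainID credentials; it is not part of the library.

```python
import asyncio


async def fake_aanonymize(text: str) -> str:
    """Hypothetical stand-in for an async library call such as `aanonymize`."""
    await asyncio.sleep(0)  # yield control once, as a real network call would
    return text.replace("John Smith", "***")


async def main() -> str:
    # Inside a coroutine, `await` works as in the examples in this README.
    return await fake_aanonymize("John Smith lives in New York")


# asyncio.run creates an event loop, runs main() to completion, then closes the loop.
result = asyncio.run(main())
print(result)  # *** lives in New York
```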

## Installation

```bash
pip install core-plainid
```

## Setup with PlainID

Once you have installed the library, you can set up PlainID access.

1. Retrieve your PlainID credentials to access the platform — **client ID** and **client secret**.
2. Find your PlainID base URL. For the production platform you can use `https://platform-product.us1.plainid.io`. Note the URL starts with `platform-product`.

These three values (client ID, client secret, and base URL) are the parameters you need to use with the library.

> **Note:** Never share your credentials or store them in your code. Use environment variables or a secret management tool to store them securely.
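
One possible way to follow the note above is to read the three values from environment variables at startup. This is a minimal sketch; the variable names `PLAINID_BASE_URL`, `PLAINID_CLIENT_ID`, and `PLAINID_CLIENT_SECRET` are assumptions chosen for illustration and are not read by the library itself.

```python
import os
from typing import Mapping


def load_plainid_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read the three connection parameters from environment variables.

    The variable names below are hypothetical; use whatever fits your deployment.
    """
    names = {
        "base_url": "PLAINID_BASE_URL",
        "client_id": "PLAINID_CLIENT_ID",
        "client_secret": "PLAINID_CLIENT_SECRET",
    }
    # Fail early with a clear message instead of passing empty credentials along.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {key: env[var] for key, var in names.items()}
```

The returned dictionary uses the keys `base_url`, `client_id`, and `client_secret`, matching the keyword arguments shown in the constructor examples in this README.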

## Permissions Provider

The `PlainIDPermissionsProvider` connects to PlainID and retrieves the permissions assigned to a given identity. It is used internally by the anonymizer and categorizer, but can also be used directly to retrieve categories, entities, and tools permissions.

The provider supports **multiple identities** through the `additional_identities` field in `RequestContext`. This is designed for agentic scenarios where a primary identity (e.g. a User) and an additional agentic identity (e.g. an AI Agent) are both required when resolving permissions.

```python
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import AdditionalIdentity, RequestContext

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_id_type="your_entity_type",
    additional_identities=[
        AdditionalIdentity(
            entity_id="your_additional_entity_id",
            entity_type_id="your_additional_entity_type",
        ),
    ],
)

permissions = await permissions_provider.aget_permissions(request_context)

print(permissions.categories)  # allowed category names
print(permissions.entities)    # anonymization entity actions
print(permissions.tools)       # allowed tool names
```

You can also provide `request_context` at construction time to eagerly load permissions once and reuse them across multiple calls without passing the context each time:

```python
permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
    request_context=RequestContext(
        entity_id="your_entity_id",
        entity_id_type="your_entity_type",
    ),
)

permissions = permissions_provider.get_permissions()
```

## Category Filtering

The `Categorizer` classifies user prompts into categories and verifies that the classified categories are allowed by PlainID policies. If the prompt's categories are not permitted, a `PlainIDCategorizerException` is raised. Multiple identities are supported through `RequestContext`.

### PlainID Setup

To use category filtering, configure a ruleset in PlainID using the `Prompt_Control` template and set up the available categories as assets (e.g. `contract`, `HR`, `finance`):

```
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: All
ruleset(asset, identity, requestParams, action) if {
    asset.template == "Prompt_Control"
}
```

### Usage

```python
from core_plainid.categorization.categorizer import Categorizer
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import RequestContext

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

categorizer = Categorizer(
    classifier_provider=classifier,  # see Category Classifiers below
    permissions_provider=permissions_provider,
    all_categories=["contract", "HR", "finance"],
)

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_id_type="your_entity_type",
)

result = await categorizer.acategorize(
    "I'd like to know the weather forecast for today",
    request_context=request_context,
)
```

The `all_categories` parameter specifies the full list of possible categories for classification. The categorizer classifies the prompt using the provided classifier, then verifies the result against the categories allowed by PlainID. If the classified categories are not a subset of the allowed categories, a `PlainIDCategorizerException` is raised.
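
The subset rule described above can be illustrated with plain Python sets. This is only a sketch of the idea, not the library's implementation; the helper name `disallowed_categories` is hypothetical, and exact, case-sensitive name matching is an assumption.

```python
def disallowed_categories(classified: list[str], allowed: list[str]) -> set[str]:
    """Return the classified categories that are not in the allowed list."""
    return set(classified) - set(allowed)


allowed = ["contract", "finance"]  # what the policies permit in this example

# Every classified category is allowed, so nothing is left over: the prompt passes.
print(disallowed_categories(["finance"], allowed))  # set()

# "HR" is not allowed, so the classified set is not a subset: the prompt is rejected.
print(disallowed_categories(["finance", "HR"], allowed))  # {'HR'}
```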

### Category Classifiers

Two built-in classifiers are available:

#### LLMCategoryClassifierProvider

Uses an LLM to classify prompts. Model calls are powered by [LiteLLM](https://docs.litellm.ai/docs/providers), which supports 100+ LLM providers (OpenAI, Anthropic, Azure, Bedrock, Ollama, etc.) through a unified interface.

Set the appropriate environment variables for your chosen provider:

```bash
# OpenAI
export OPENAI_API_KEY="your_openai_api_key"

# Anthropic
export ANTHROPIC_API_KEY="your_anthropic_api_key"

# Azure OpenAI
export AZURE_API_KEY="your_azure_api_key"
export AZURE_API_BASE="https://your-resource.openai.azure.com"
export AZURE_API_VERSION="2024-02-01"
```

For the full list of supported providers and their environment variables, see the [LiteLLM Providers documentation](https://docs.litellm.ai/docs/providers).

```python
from core_plainid.categorization.llm_category_classifier_provider import LLMCategoryClassifierProvider

llm_classifier = LLMCategoryClassifierProvider(model="openai/gpt-4o")
```

The `model` parameter follows LiteLLM's `provider/model` naming convention (e.g. `openai/gpt-4o`, `anthropic/claude-sonnet-4-20250514`, `ollama/llama3`).

Use with caution: classification quality depends on the LLM you choose. Some base models may return poor or incorrect results, so prefer larger models (for example, those offered by OpenAI or Anthropic) or models trained for classification tasks.

#### ZeroShotCategoryClassifierProvider

Uses a Hugging Face zero-shot classification model. The model is downloaded automatically on first use.

```python
from core_plainid.categorization.zeroshot_category_classifier_provider import ZeroShotCategoryClassifierProvider

zeroshot_classifier = ZeroShotCategoryClassifierProvider(
    model_name="facebook/bart-large-mnli",
    threshold=0.5,
)
```

The `threshold` parameter (default `0.5`) controls the minimum confidence score for a category to be included. Use this classifier if you want classification to run locally without relying on an external LLM API; note that it requires disk space for the downloaded model.
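
To make the effect of `threshold` concrete, here is a minimal sketch of score filtering. The scores are made up for illustration, the helper `select_categories` is hypothetical, and the inclusive comparison (`>=`) is an assumption about how the cutoff is applied.

```python
def select_categories(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Keep labels whose score meets the threshold, highest score first."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in kept]


# Illustrative scores only; real values depend on the model and the prompt.
scores = {"contract": 0.81, "HR": 0.12, "finance": 0.55}

print(select_categories(scores))       # ['contract', 'finance']
print(select_categories(scores, 0.9))  # []
```

A higher threshold yields fewer categories per prompt, while a lower one lets weaker matches through.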

## Anonymization

The `PresidioAnonymizer` detects and anonymizes PII (Personally Identifiable Information) in text using [Microsoft Presidio](https://microsoft.github.io/presidio/). It supports two actions: **MASK** (replaces PII with `***`) and **ENCRYPT** (encrypts the detected PII using a provided key). Multiple identities are supported through `RequestContext`.

The list of supported PII entities is based on [Presidio's supported entities](https://microsoft.github.io/presidio/supported_entities/).

### PlainID Setup

To use anonymization, configure rulesets in PlainID using the `Output_Control` template. Each entity type you want to detect needs its own ruleset:

```
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: PERSON
ruleset(asset, identity, requestParams, action) if {
    asset.template == "Output_Control"
    asset["path"] == "PERSON"
    action.id in ["MASK"]
}
```

#### Custom Regex Entities

You can define custom entities using regex patterns. Configure them in PlainID with a `REGEX` path prefix and a `regexValue` attribute containing the pattern:

```
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: REGEX_EMAIL
ruleset(asset, identity, requestParams, action) if {
    asset.template == "Output_Control"
    asset["path"] == "REGEX_EMAIL"
    asset["regexValue"] == "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}"
    action.id in ["MASK"]
}
```
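
Before configuring a pattern in PlainID, it can help to try it locally. The sketch below applies the same email pattern with Python's `re` module and replaces matches with `***`. It assumes the doubled backslash in the ruleset is string escaping for a single `\.`; the helper `mask_matches` is hypothetical and only mimics the `MASK` action for testing purposes.

```python
import re

# Same pattern as in the ruleset above, written as a Python raw string.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def mask_matches(text: str, pattern: re.Pattern[str], mask: str = "***") -> str:
    """Replace every match of `pattern` in `text` with `mask`."""
    return pattern.sub(mask, text)


sample = "Contact alice@example.com or bob.smith@mail.example.org today"
print(mask_matches(sample, EMAIL_PATTERN))  # Contact *** or *** today
```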

#### Azure Health De-identification (AHDS) Entities

For medical/health data, you can enable Azure Health Data Services de-identification to detect health-specific PHI entities (e.g. `DOCTOR`, `PATIENT`, `AGE`). Configure them in PlainID the same way:

```
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: DOCTOR
ruleset(asset, identity, requestParams, action) if {
    asset.template == "Output_Control"
    asset["path"] == "DOCTOR"
    action.id in ["MASK"]
}
```

### Usage

```python
from core_plainid.anonymization.presidio_anonymizer import PresidioAnonymizer
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import RequestContext

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

anonymizer = PresidioAnonymizer(
    permissions_provider=permissions_provider,
    encrypt_key="your_16_char_key",  # exactly 16 characters (128-bit key)
)

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_id_type="your_entity_type",
)

result = await anonymizer.aanonymize(
    "John Smith lives in New York",
    request_context=request_context,
)
print(result)  # "*** lives in ***"
```

The `encrypt_key` parameter is optional and only required if you use the `ENCRYPT` action. The key is used for AES encryption and must be 128, 192, or 256 bits long (16, 24, or 32 characters).
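
Because an off-by-one key length is easy to miss, a quick check before constructing the anonymizer can be useful. This is a minimal sketch; the helper `check_encrypt_key` is hypothetical, and measuring the key as UTF-8 bytes is an assumption (for ASCII keys, bytes and characters coincide).

```python
VALID_AES_KEY_BYTES = (16, 24, 32)  # 128, 192, and 256 bits


def check_encrypt_key(key: str) -> int:
    """Return the key size in bits, or raise ValueError if the length is invalid."""
    size = len(key.encode("utf-8"))
    if size not in VALID_AES_KEY_BYTES:
        raise ValueError(
            f"encrypt_key is {size} bytes; expected one of {VALID_AES_KEY_BYTES}"
        )
    return size * 8


print(check_encrypt_key("0123456789abcdef"))  # 128
```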

### Enabling Azure Health De-identification (AHDS)

To detect health-specific PHI entities, enable AHDS by passing `enable_ahds=True` and setting the following environment variables:

```bash
export AHDS_ENDPOINT="https://your-deid-service.api.deid.azure.com"
export AZURE_TENANT_ID="your_azure_tenant_id"
export AZURE_CLIENT_ID="your_azure_client_id"
export AZURE_CLIENT_SECRET="your_azure_client_secret"
```

```python
anonymizer = PresidioAnonymizer(
    permissions_provider=permissions_provider,
    encrypt_key="your_16_char_key",  # exactly 16 characters (128-bit key)
    enable_ahds=True,
)
```

## Tools Authorization

The `PlainIDPermissionsProvider` can retrieve the list of tools a user is authorized to use. This is useful for filtering available tools in your AI agent based on PlainID policies. Multiple identities are supported through `RequestContext`.

### PlainID Setup

Configure a ruleset in PlainID using the `Tools` template and set up the available tools as assets (e.g. `search_tool`, `calculator`, `email_sender`):

```
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: All
ruleset(asset, identity, requestParams, action) if {
    asset.template == "Tools"
}
```

### Usage

```python
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import RequestContext

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_id_type="your_entity_type",
)

permissions = await permissions_provider.aget_permissions(request_context)
allowed_tools = permissions.tools  # e.g. ["search_tool", "calculator"]
```
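
Once the allowed tool names are known, filtering an agent's tool set is a simple lookup. The sketch below assumes a hypothetical registry that maps tool names to callables; the registry, its functions, and the helper `filter_tools` are illustrative and not part of the library.

```python
from typing import Callable


def filter_tools(
    registry: dict[str, Callable[..., str]], allowed_tools: list[str]
) -> dict[str, Callable[..., str]]:
    """Keep only the registered tools whose names appear in `allowed_tools`."""
    allowed = set(allowed_tools)
    return {name: tool for name, tool in registry.items() if name in allowed}


def search_tool(query: str) -> str:
    return f"searching for {query}"


def calculator(expression: str) -> str:
    return f"evaluating {expression}"


def email_sender(recipient: str) -> str:
    return f"sending email to {recipient}"


# Hypothetical registry of everything the agent could use.
registry = {
    "search_tool": search_tool,
    "calculator": calculator,
    "email_sender": email_sender,
}

# Stand-in for `permissions.tools` returned by the permissions provider.
allowed_tools = ["search_tool", "calculator"]

available = filter_tools(registry, allowed_tools)
print(sorted(available))  # ['calculator', 'search_tool']
```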

## SQL Database Authorizer

The `PlainIDSQLAuthorizerClient` dynamically modifies SQL queries based on PlainID authorization policies, enforcing Row-Level Security (RLS) and Column-Level Security (CLS) at query time.

> **Note:** The SQL Authorizer does not currently support multiple identities. Only a single identity context (`entity_id` / `entity_type_id`) can be provided per request.

### Authentication

The client supports two authentication modes:

- **Client credentials** — provide `client_id` and `client_secret` at construction time.
- **JWT token** — provide an `auth_token` per request. The `Bearer` prefix is added automatically if missing.
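
The prefix handling for JWT tokens can be pictured with the sketch below. It is not the client's actual code; the helper name `with_bearer_prefix` and the case-insensitive prefix check are assumptions made for illustration.

```python
def with_bearer_prefix(auth_token: str) -> str:
    """Return the token with a `Bearer ` prefix, adding it only when absent."""
    token = auth_token.strip()
    if token.lower().startswith("bearer "):
        return token
    return f"Bearer {token}"


print(with_bearer_prefix("abc.def.ghi"))         # Bearer abc.def.ghi
print(with_bearer_prefix("Bearer abc.def.ghi"))  # Bearer abc.def.ghi
```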

### Usage

```python
from core_plainid.clients.plainid_sql_authorizer_client import PlainIDSQLAuthorizerClient
from core_plainid.models.request.sql_authorizer_request import (
    SQLAuthorizerRequest,
    SQLAuthorizerFlags,
    PoliciesJoinOperation,
)

sql_authorizer = PlainIDSQLAuthorizerClient(
    base_url="https://your-sql-authz.plainid.cloud",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

request = SQLAuthorizerRequest(
    sql="SELECT * FROM accounts WHERE country = 'US'",
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
    flags=SQLAuthorizerFlags(
        empty_rls_treat_as_denied=True,
        empty_cls_treat_as_permitted=True,
        expand_star_column=True,
        policies_join_operation=PoliciesJoinOperation.OR,
    ),
)

response = sql_authorizer.authorize_sql(request)
print(response.sql)           # the modified SQL query
print(response.was_modified)  # True if policies were applied
```

Using a JWT instead of a client secret:

```python
sql_authorizer = PlainIDSQLAuthorizerClient(
    base_url="https://your-sql-authz.plainid.cloud",
    client_id="your_client_id",
)

response = sql_authorizer.authorize_sql(request, auth_token="your_jwt_token")
```

## PlainID Auth Client

The `PlainIDAuthClient` is the low-level client used internally by `PlainIDPermissionsProvider` to communicate with the PlainID API. It provides `get_token` / `aget_token` for retrieving user access tokens and `get_resolution` / `aget_resolution` for fetching resolution data. In most cases you should use `PlainIDPermissionsProvider` instead of interacting with this client directly.

```python
from core_plainid.clients.plainid_auth_client import PlainIDAuthClient

client = PlainIDAuthClient(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

token_response = await client.aget_token(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
)
```

## Exceptions

The library provides specific exceptions to help identify which component caused an error. All exceptions inherit from `PlainIDException` and include the error `message` and the `original_exception` (if applicable).

| Exception | Description |
|---|---|
| `PlainIDException` | Base exception for all PlainID errors |
| `PlainIDAuthClientException` | Errors in the PlainID Auth client |
| `PlainIDPermissionsException` | Errors in permissions processing |
| `PlainIDCategorizerException` | Errors in the categorizer component |
| `PlainIDAnonymizerException` | Errors in the anonymizer component |
| `PlainIDSQLAuthorizerClientException` | Errors in the SQL Authorizer client |
| `PlainIDFilterException` | Errors in filter processing |
| `PlainIDRetrieverException` | Errors in the retriever components |
| `LlmResponseException` | Malformed or unexpected LLM responses |

```python
from core_plainid.exceptions.plainid_exceptions import PlainIDAnonymizerException

try:
    result = await anonymizer.aanonymize(query, request_context=request_context)
except PlainIDAnonymizerException as e:
    print(f"Anonymization error: {e.message}")
```
