Metadata-Version: 2.4
Name: genosis
Version: 1.0.1
Summary: Official Python SDK for the Genosis LLM cost optimization platform
Project-URL: Homepage, https://usegenosis.ai
Project-URL: Documentation, https://usegenosis.ai/docs
Project-URL: Repository, https://github.com/Genosis-Limited/genosis-python
License:                                  Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
        
           1. Definitions.
        
              "License" shall mean the terms and conditions for use, reproduction,
              and distribution as defined by Sections 1 through 9 of this document.
        
              "Licensor" shall mean the copyright owner or entity authorized by
              the copyright owner that is granting the License.
        
              "Legal Entity" shall mean the union of the acting entity and all
              other entities that control, are controlled by, or are under common
              control with that entity. For the purposes of this definition,
              "control" means (i) the power, direct or indirect, to cause the
              direction or management of such entity, whether by contract or
              otherwise, or (ii) ownership of fifty percent (50%) or more of the
              outstanding shares, or (iii) beneficial ownership of such entity.
        
              "You" (or "Your") shall mean an individual or Legal Entity
              exercising permissions granted by this License.
        
              "Source" form shall mean the preferred form for making modifications,
              including but not limited to software source code, documentation
              source, and configuration files.
        
              "Object" form shall mean any form resulting from mechanical
              transformation or translation of a Source form, including but
              not limited to compiled object code, generated documentation,
              and conversions to other media types.
        
              "Work" shall mean the work of authorship made available under
              the License, as indicated by a copyright notice that is included in
              or attached to the work (an example is provided in the Appendix below).
        
              "Derivative Works" shall mean any work, whether in Source or Object
              form, that is based on (or derived from) the Work and for which the
              editorial revisions, annotations, elaborations, or other transformations
              represent, as a whole, an original work of authorship. For the purposes
              of this License, Derivative Works shall not include works that remain
              separable from, or merely link (or bind by name) to the interfaces of,
              the Work and its Derivative Works thereof.
        
              "Contribution" shall mean, as submitted to the Licensor for inclusion
              in the Work by You, whether in Source form or as an Object form.
        
              "Contributor" shall mean Licensor and any Legal Entity on behalf of
              whom a Contribution has been received by the Licensor and included
              within the Work.
        
           2. Grant of Copyright License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              copyright license to reproduce, prepare Derivative Works of,
              publicly display, publicly perform, sublicense, and distribute the
              Work and such Derivative Works in Source or Object form.
        
           3. Grant of Patent License. Subject to the terms and conditions of
              this License, each Contributor hereby grants to You a perpetual,
              worldwide, non-exclusive, no-charge, royalty-free, irrevocable
              (except as stated in this section) patent license to make, have made,
              use, offer to sell, sell, import, and otherwise transfer the Work,
              where such license applies only to those patent claims licensable
              by such Contributor that are necessarily infringed by their
              Contribution(s) alone or by the combination of their Contribution(s)
              with the Work to which such Contribution(s) was submitted. If You
              institute patent litigation against any entity (including a cross-claim
              or counterclaim in a lawsuit) alleging that the Work or any
              Contribution embodied within the Work constitutes direct or
              contributory patent infringement, then any patent licenses granted to
              You under this License for that Work shall terminate as of the date
              such litigation is filed.
        
           4. Redistribution. You may reproduce and distribute copies of the Work
              or Derivative Works thereof in any medium, with or without
              modifications, and in Source or Object form, provided that You meet
              the following conditions:
        
              (a) You must give any other recipients of the Work or Derivative Works
                  a copy of this License; and
        
              (b) You must cause any modified files to carry prominent notices
                  stating that You changed the files; and
        
              (c) You must retain, in the Source form of any Derivative Works that
                  You distribute, all copyright, patent, trademark, and attribution
                  notices from the Source form of the Work, excluding those notices
                  that do not pertain to any part of the Derivative Works; and
        
              (d) If the Work includes a "NOTICE" text file as part of its
                  distribution, You must include a readable copy of the attribution
                  notices contained within such NOTICE file, in at least one of the
                  following places: within a NOTICE text file distributed as part of
                  the Derivative Works; within the Source form or documentation, if
                  provided along with the Derivative Works; or, within a display
                  generated by the Derivative Works, if and wherever such third-party
                  notices normally appear. The contents of the NOTICE file are for
                  informational purposes only and do not modify the License. You may
                  add Your own attribution notices within Derivative Works that You
                  distribute, alongside or as an addendum to the NOTICE text from the
                  Work, provided that such additional attribution notices cannot be
                  construed as modifying the License.
        
              You may add Your own license statement for Your modifications and may
              provide additional grant of rights to use, copy, modify, merge, publish,
              distribute, sublicense, and/or sell copies of the Work, and to permit
              persons to whom the Work is furnished to do so.
        
           5. Submission of Contributions. Unless You explicitly state otherwise,
              any Contribution intentionally submitted for inclusion in the Work by
              You to the Licensor shall be under the terms and conditions of this
              License, without any additional terms or conditions.
        
           6. Trademarks. This License does not grant permission to use the trade
              names, trademarks, service marks, or product names of the Licensor,
              except as required for reasonable and customary use in describing the
              origin of the Work.
        
           7. Disclaimer of Warranty. Unless required by applicable law or agreed
              to in writing, Licensor provides the Work (and each Contributor
              provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES
              OR CONDITIONS OF ANY KIND, either express or implied, including,
              without limitation, any warranties or conditions of TITLE,
              NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR
              PURPOSE. You are solely responsible for determining the appropriateness
              of using or reproducing the Work and assume any risks associated with
              Your exercise of permissions under this License.
        
           8. Limitation of Liability. In no event and under no legal theory,
              whether in tort (including negligence), contract, or otherwise,
              unless required by applicable law (such as deliberate and grossly
              negligent acts) or agreed to in writing, shall any Contributor be
              liable to You for damages, including any direct, indirect, special,
              incidental, or exemplary damages of any character arising as a result
              of this License or out of the use or inability to use the Work.
        
           9. Accepting Warranty or Liability. While redistributing the Work or
              Derivative Works thereof, You may offer, and charge a fee for,
              acceptance of support, warranty, indemnity, or other liability
              obligations and/or rights consistent with this License. However, in
              accepting such obligations, You may offer such conditions only on Your
              own behalf and on Your sole responsibility.
        
           END OF TERMS AND CONDITIONS
        
           Copyright 2026 Genosis Limited
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
        
           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
License-File: LICENSE
License-File: NOTICE
Keywords: ai,cache,genosis,llm,optimization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27.0
Description-Content-Type: text/markdown

# genosis

Genosis reduces LLM inference costs by up to 75% through server-optimized prompt caching. The SDK wraps your existing API calls with one method — `g.call()` — and applies optimization transparently.

```python
import anthropic
from genosis import Genosis

client = anthropic.Anthropic()
g = Genosis(api_key="gns_live_...")

result = g.call(
    {
        "model": "claude-sonnet-4-6",
        "system": [
            {"type": "text", "text": system_context},
            {"type": "text", "text": product_catalog},
        ],
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 1024,
    },
    lambda params: client.messages.create(**params),
)

print(result.response)   # the Anthropic response object, unmodified
print(result.memoized)   # True if served from local cache
```

No schema changes. No new concepts. Your existing LLM code stays intact.

## Installation

The package will be published to PyPI. Until then, install directly from GitHub:

```bash
pip install git+https://github.com/Genosis-Limited/genosis-python.git
```

Requires Python 3.10+. The only runtime dependency is `httpx`.

Once published to PyPI:

```bash
pip install genosis
```

## Provider Examples

### Anthropic

```python
import anthropic
from genosis import Genosis

client = anthropic.Anthropic()
g = Genosis(api_key="gns_live_...")

result = g.call(
    {
        "model": "claude-sonnet-4-6",
        "system": [
            {"type": "text", "text": system_context},
            {"type": "text", "text": product_catalog},
        ],
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 512,
    },
    lambda params: client.messages.create(**params),
)
```

Genosis adds `cache_control` breakpoints to your system blocks automatically. You do not need to add them yourself — any existing breakpoints you placed are replaced with the server-optimized set.
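
As a rough illustration of what "adding breakpoints" means, the sketch below marks system blocks whose SHA-256 digest appears in a set of hashes. The helper name `apply_breakpoints` and the `cacheable_hashes` argument are hypothetical and are not part of the SDK; only the `{"type": "ephemeral"}` block shape comes from Anthropic's API.

```python
import copy
import hashlib


def apply_breakpoints(params: dict, cacheable_hashes: set[str]) -> dict:
    """Return a copy of params with cache_control set on matching system blocks.

    Hypothetical helper for illustration; the SDK's real logic may differ.
    """
    out = copy.deepcopy(params)  # never mutate the caller's params
    system = out.get("system")
    if not isinstance(system, list):
        return out  # plain-string system prompts are left untouched in this sketch
    for block in system:
        # Drop any breakpoint the caller placed; the computed set replaces it.
        block.pop("cache_control", None)
        digest = hashlib.sha256(block.get("text", "").encode("utf-8")).hexdigest()
        if digest in cacheable_hashes:
            block["cache_control"] = {"type": "ephemeral"}
    return out
```

Anthropic limits how many breakpoints a single request may carry, so a real implementation would also have to choose which matching blocks to mark. This sketch marks all of them.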

### OpenAI

```python
import openai
from genosis import Genosis

client = openai.OpenAI()
g = Genosis(api_key="gns_live_...")

result = g.call(
    {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "system",
                "content": [
                    {"type": "text", "text": system_context},
                    {"type": "text", "text": product_catalog},
                ],
            },
            {"role": "user", "content": question},
        ],
        "max_tokens": 512,
    },
    lambda params: client.chat.completions.create(**params),
)
```

For OpenAI, Genosis reorders system content blocks to maximize prefix cache hits. No `cache_control` markers — OpenAI's prompt caching is automatic.
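
The reordering idea can be sketched as a stable sort: blocks with a known position move to the front in planned order, and everything else keeps its original relative order behind them. `reorder_system_blocks` and `position_by_hash` are hypothetical names used for illustration, not SDK API.

```python
import copy
import hashlib


def reorder_system_blocks(params: dict, position_by_hash: dict[str, int]) -> dict:
    """Return a copy of params with system content blocks in a cache-friendly order.

    Hypothetical helper for illustration; the SDK's real logic may differ.
    """
    out = copy.deepcopy(params)
    for message in out.get("messages", []):
        content = message.get("content")
        if message.get("role") != "system" or not isinstance(content, list):
            continue

        def sort_key(indexed_block):
            index, block = indexed_block
            digest = hashlib.sha256(block.get("text", "").encode("utf-8")).hexdigest()
            if digest in position_by_hash:
                # Known blocks come first, ordered by their planned position.
                return (0, position_by_hash[digest], index)
            # Unknown blocks follow, keeping their original order.
            return (1, index, index)

        message["content"] = [b for _, b in sorted(enumerate(content), key=sort_key)]
    return out
```

Putting stable blocks first matters because prefix caching can only reuse the leading portion of a prompt that matches an earlier request.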

### AWS Bedrock

```python
import boto3
import json
from genosis import Genosis

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
g = Genosis(api_key="gns_live_...")

# Bedrock ARNs are normalized automatically — the manifest lookup uses
# the canonical model name (e.g., claude-sonnet-4-6-20250514)
result = g.call(
    {
        "model": "anthropic.claude-sonnet-4-6-20250514-v1:0",
        "system": system_prompt,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 512,
        "anthropic_version": "bedrock-2023-05-31",
    },
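    # "model" selects the endpoint via modelId. Bedrock's Anthropic request body
    # does not take a "model" field, so it is left out of the JSON payload.
    lambda params: json.loads(
        bedrock.invoke_model(
            modelId=params["model"],
            body=json.dumps({k: v for k, v in params.items() if k != "model"}),
        )["body"].read()
    ),
)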
```

Cross-region inference ARNs (`us.anthropic.claude-*`) are also handled.

### Azure OpenAI

```python
import os

import openai
from genosis import Genosis

azure = openai.AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)
g = Genosis(api_key="gns_live_...")

result = g.call(
    {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "max_tokens": 512,
    },
    lambda params: azure.chat.completions.create(**params),
)
```

## Supported Providers and Models

| Provider | Models | Also works via |
|----------|--------|----------------|
| Anthropic | claude-opus-4, claude-sonnet-4-6, claude-haiku-4-5 | AWS Bedrock |
| OpenAI | gpt-4.1, gpt-4.1-mini, gpt-4o, gpt-4o-mini, o1, o3, o4-mini | Azure OpenAI |
| Google *(coming soon)* | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite | Vertex AI |

Provider detection is automatic from the model name. Bedrock ARNs (`anthropic.claude-*`, `us.anthropic.claude-*`) are recognized and normalized to canonical model IDs for manifest lookup.

## How It Works

On each `g.call()`:

1. The SDK detects your provider from the model field.
2. It checks whether a server-optimized manifest exists for that provider/model. On the first call, the manifest is fetched in the background — your call goes through normally.
3. When a manifest is available, the SDK applies it: for Anthropic, it inserts `cache_control` breakpoints on high-value system blocks; for OpenAI, it reorders system content blocks to maximize prefix cache hits.
4. Your function is called with the (possibly modified) params.
5. Usage data is hashed and queued for telemetry. The background worker flushes it to `api.usegenosis.ai` — no synchronous network call on the hot path.
6. Manifests refresh every 5 minutes in the background. A stale manifest is better than no optimization.

If anything in the Genosis layer throws, your function is called with the original unmodified params and the error is reported silently. `g.call()` cannot break your LLM calls.
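
The fail-open behavior described above follows a common pattern, sketched here with a hypothetical `optimize` callable standing in for the Genosis layer. This is not the SDK's source, only an illustration of the control flow.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("genosis")


def call_fail_open(
    params: dict,
    fn: Callable[[dict], Any],
    optimize: Callable[[dict], dict],
) -> Any:
    """Call fn with optimized params, falling back to the originals on failure."""
    try:
        optimized = optimize(params)
    except Exception:
        # Optimization failed: record it quietly and use the caller's params.
        logger.debug("optimization failed; using original params", exc_info=True)
        optimized = params
    # fn runs outside the try block, so provider errors reach the caller.
    return fn(optimized)
```

Keeping `fn` outside the `try` block is the important design choice: failures in the optimization step are absorbed, while errors from the LLM provider still propagate.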

## Configuration

```python
from genosis import Genosis

g = Genosis(
    # Required
    api_key="gns_live_...",            # or "gns_test_..." for test keys

    # Optional — shown with defaults
    base_url="https://api.usegenosis.ai",
    max_retries=2,                     # retries on 429/5xx (exponential backoff)
    timeout=60.0,                      # per-request timeout in seconds
    manifest_refresh_interval=300,     # seconds between manifest refreshes; 0 = disabled
    memoization_enabled=True,          # see Memoization below
    memoization_max_entries=1000,      # max entries in the in-process LRU cache
    memo_storage=None,                 # plug in Redis, etc. (see Memoization below)
    buffer_path=None,                  # SQLite buffer path; default: ~/.genosis/buffer_<prefix>.db
    buffer_max_size=10_000,            # max buffered events before oldest are dropped
)
```

## Memoization

Memoization serves identical requests from a local cache without calling the LLM at all. The server identifies which request patterns are worth memoizing based on your telemetry — the SDK just applies the decision.

When a memoized response is served, `result.memoized is True` and no LLM call is made.

The default storage is an in-process LRU map. For multi-process deployments (multiple workers, serverless), plug in a shared store:

```python
import json
from typing import Any, Optional

from genosis import Genosis, MemoStorage


class RedisMemoStorage(MemoStorage):
    """Shared memo store backed by Redis.

    Assumes responses are JSON-serializable (for example, plain dicts).
    Provider SDK response objects would need a custom serializer.
    """

    def __init__(self, redis_client):
        self._redis = redis_client

    def get(self, fingerprint: str) -> Optional[Any]:
        val = self._redis.get(f"genosis:memo:{fingerprint}")
        return json.loads(val) if val else None

    def set(self, fingerprint: str, response: Any, ttl_seconds: int) -> None:
        # SETEX stores the value and its expiry in one step.
        self._redis.setex(
            f"genosis:memo:{fingerprint}",
            ttl_seconds,
            json.dumps(response),
        )

g = Genosis(api_key="gns_live_...", memo_storage=RedisMemoStorage(redis_client))
```

`MemoStorage` is a two-method abstract base class — any implementation works.
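
To make the two-method contract concrete, here is a minimal in-memory store with LRU eviction and per-entry TTL. It is a standalone sketch, not the SDK's built-in cache. The class name and the injectable `clock` parameter are assumptions made so the example runs without `genosis` installed; a real backend would subclass `MemoStorage`.

```python
import time
from collections import OrderedDict
from typing import Any, Optional


class InMemoryMemoStore:
    """LRU store with per-entry TTL, exposing the same get/set shape as MemoStorage."""

    def __init__(self, max_entries: int = 1000, clock=time.monotonic):
        self._max = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: OrderedDict[str, tuple[float, Any]] = OrderedDict()

    def get(self, fingerprint: str) -> Optional[Any]:
        entry = self._items.get(fingerprint)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._items[fingerprint]  # expired: evict and report a miss
            return None
        self._items.move_to_end(fingerprint)  # mark as most recently used
        return response

    def set(self, fingerprint: str, response: Any, ttl_seconds: int) -> None:
        self._items[fingerprint] = (self._clock() + ttl_seconds, response)
        self._items.move_to_end(fingerprint)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # drop the least recently used entry
```

Returning `None` for both a missing and an expired entry keeps the caller's logic simple: any `None` means "call the LLM".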

To disable memoization entirely:

```python
g = Genosis(api_key="gns_live_...", memoization_enabled=False)
```

## Serverless and Batch Jobs

The background worker flushes telemetry continuously in long-running processes. In serverless functions or batch jobs that exit after each invocation, call `flush()` before the process ends:

```python
# At the end of your handler / job
remaining = g.flush(timeout=30.0)  # wait up to 30s for buffer to drain
```

`flush()` returns the number of events still in the buffer when the timeout is reached. A return value of `0` means the buffer was fully drained.

Example AWS Lambda handler:

```python
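import anthropic
from genosis import Genosis

# Created once per container so warm invocations reuse them.
client = anthropic.Anthropic()
g = Genosis(api_key="gns_live_...")

def handler(event, context):
    params = {
        "model": "claude-sonnet-4-6",
        # "question" is an illustrative event field; adapt it to your payload.
        "messages": [{"role": "user", "content": event["question"]}],
        "max_tokens": 512,
    }
    try:
        result = g.call(params, lambda p: client.messages.create(**p))
        return {"answer": result.response.content[0].text}
    finally:
        g.flush(timeout=10.0)  # drain telemetry before the invocation ends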
```

## Background Worker

The worker starts automatically when you construct `Genosis`. It handles:

- Telemetry batching and upload
- Manifest acknowledgement
- Error reporting

Telemetry is written to a local SQLite file first (`~/.genosis/buffer_<keyprefix>.db`). If the network is unavailable, events are held in the buffer and retried on the next worker cycle. Nothing is lost on transient failures.

Each API key prefix gets its own buffer file, so multiple apps on the same machine do not share state.
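
The buffering behavior can be pictured as a small durable queue: events are inserted first, read in batches, and deleted only after a successful upload. The sketch below uses Python's built-in `sqlite3` module. The class, table, and method names are hypothetical and do not reflect the SDK's actual schema.

```python
import json
import sqlite3


class EventBuffer:
    """Bounded FIFO of JSON events stored in SQLite (illustrative only)."""

    def __init__(self, path: str = ":memory:", max_size: int = 10_000):
        self._db = sqlite3.connect(path)
        self._max = max_size
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )

    def put(self, event: dict) -> None:
        with self._db:  # commits on success, rolls back on error
            self._db.execute("INSERT INTO events (body) VALUES (?)", (json.dumps(event),))
            # Keep only the newest max_size rows; the oldest are dropped.
            self._db.execute(
                "DELETE FROM events WHERE id NOT IN "
                "(SELECT id FROM events ORDER BY id DESC LIMIT ?)",
                (self._max,),
            )

    def peek(self, limit: int) -> list[tuple[int, dict]]:
        """Return the oldest events without removing them."""
        rows = self._db.execute(
            "SELECT id, body FROM events ORDER BY id LIMIT ?", (limit,)
        ).fetchall()
        return [(row_id, json.loads(body)) for row_id, body in rows]

    def ack(self, ids: list[int]) -> None:
        """Delete events once their upload has succeeded."""
        with self._db:
            self._db.executemany("DELETE FROM events WHERE id = ?", [(i,) for i in ids])

    def __len__(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Because `peek` does not delete, a failed upload leaves the batch in place for the next cycle; only `ack` removes rows.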

The worker thread is a daemon — it does not prevent the Python process from exiting. On clean process exit, `atexit` and `SIGTERM` handlers flush remaining events automatically. For hard kills or serverless functions, call `g.flush()` explicitly.

## Content-Blind Security Model

Genosis never sees your prompts, responses, user data, or LLM provider API keys.

What leaves the SDK:

- SHA-256 hashes of content blocks (one-way, irreversible)
- Token counts
- Usage numbers from the LLM response (`input_tokens`, `output_tokens`, `cache_read_input_tokens`, etc.)
- Provider and model name

What stays local:

- All prompt text
- All LLM responses
- The memoization cache

The hashing is done in the SDK before any network call. You can verify this in [`genosis/client.py`](./genosis/client.py) — search for `sha256`. Error messages are also sanitized before logging: API keys and long base64 strings are redacted automatically.
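
A minimal sketch of what a content-blind event could look like, assuming the fields listed above. The function name and the event layout are hypothetical; check `genosis/client.py` for the real payload.

```python
import hashlib


def build_telemetry_event(provider: str, model: str, blocks: list[str], usage: dict) -> dict:
    """Build an event that carries digests and counters but no prompt text."""
    return {
        "provider": provider,
        "model": model,
        # Only digests are recorded; the block text never enters the event.
        "block_hashes": [hashlib.sha256(b.encode("utf-8")).hexdigest() for b in blocks],
        # Copy integer counters only, so stray text fields cannot leak.
        "usage": {
            k: v
            for k, v in usage.items()
            if isinstance(v, int) and not isinstance(v, bool)
        },
    }
```

Filtering `usage` down to integer counters is a defensive choice in this sketch: if a provider ever adds a free-text field to its usage object, it is dropped rather than forwarded.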

## Error Handling

Management API calls (`g.account`, `g.manifest`, etc.) raise typed exceptions on failure:

```python
from genosis import (
    GenosisError,
    AuthenticationError,
    RateLimitError,
    NotFoundError,
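    ConnectionError,  # note: shadows Python's built-in ConnectionError in this module
    TimeoutError,     # note: shadows Python's built-in TimeoutError in this module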
)

try:
    g.optimization.trigger("anthropic", "claude-sonnet-4-6")
except AuthenticationError:
    pass  # Invalid or revoked API key (HTTP 401)
except RateLimitError:
    pass  # Too many requests (HTTP 429) — back off
except ConnectionError:
    pass  # Network failure — no HTTP response received
except TimeoutError:
    pass  # Request exceeded the configured timeout
except GenosisError as e:
    print(e.status, e.code, str(e))
```

All typed errors extend `GenosisError` and expose `status` (HTTP status code) and `code` (machine-readable string).
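
If a `RateLimitError` still surfaces after the SDK's own retries, a script can back off and try again. The helper below is a generic sketch using only the standard library. `with_backoff` and its parameters are hypothetical, and the delay values are arbitrary assumptions rather than SDK defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    fn: Callable[[], T],
    retry_on: tuple[type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with jittered exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time between 0 and base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")
```

For example, `with_backoff(lambda: g.optimization.trigger("anthropic", "claude-sonnet-4-6"), retry_on=(RateLimitError,))` would retry only rate-limit failures and let every other error propagate immediately.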

**`g.call()` does not raise Genosis errors.** If the optimization layer fails for any reason, `fn` is called with the original unmodified params. LLM errors (rate limits, network failures, etc.) propagate normally — Genosis does not swallow them.

### Error classes

| Class | Status | Default code |
|-------|--------|--------------|
| `BadRequestError` | 400 | `BAD_REQUEST` |
| `AuthenticationError` | 401 | `UNAUTHORIZED` |
| `PermissionDeniedError` | 403 | `FORBIDDEN` |
| `NotFoundError` | 404 | `NOT_FOUND` |
| `ConflictError` | 409 | `CONFLICT` |
| `UnprocessableEntityError` | 422 | `UNPROCESSABLE_ENTITY` |
| `RateLimitError` | 429 | `RATE_LIMITED` |
| `InternalServerError` | 500+ | `INTERNAL_SERVER_ERROR` |
| `ConnectionError` | — | `CONNECTION_ERROR` |
| `TimeoutError` | — | `TIMEOUT` |

## Management API

Use these for dashboards, scripts, and setup tooling — not in the hot path.

```python
# Account
account = g.account.get()
usage = g.account.get_usage()
keys = g.account.list_api_keys()
new_key = g.account.create_api_key("worker-prod", ["ingest", "manifest:read"])
g.account.revoke_api_key(key_id)

# Manifests
manifest_resp = g.manifest.get("anthropic", "claude-sonnet-4-6")  # manifest_resp["data"]
all_manifests = g.manifest.list_all()
history = g.manifest.get_history("anthropic", "claude-sonnet-4-6")

# Optimization (runs server-side)
run = g.optimization.trigger("anthropic", "claude-sonnet-4-6")
status = g.optimization.get_status("anthropic", "claude-sonnet-4-6")
results = g.optimization.get_results("anthropic", "claude-sonnet-4-6")

# Telemetry
summary = g.telemetry.get_summary(days=7)
costs = g.telemetry.get_cost_breakdown(days=30, provider="anthropic")
blocks = g.telemetry.get_block_frequencies(days=7, provider="anthropic", model="claude-sonnet-4-6")
```

## Types

Most public types are `TypedDict` subclasses from `genosis.types`; `CallResult` is a generic class and `MemoStorage` is an abstract base class. Import them for type checking:

```python
from genosis.types import (
    CallResult,
    Account,
    AccountUsage,
    ApiKey,
    CreatedApiKey,
    CacheManifest,
    CacheTrainEntry,
    ManifestResponse,
    ManifestHistoryEntry,
    ManifestsListEntry,
    OptimizationResult,
    OptimizationStatusResponse,
    OptimizationResultsResponse,
    TelemetrySummary,
    CostBreakdownEntry,
    BlockFrequency,
    MemoizationCandidate,
)
```

Key types:

**`CallResult`** — returned by `g.call()`
```python
class CallResult(Generic[R]):
    response: R      # the LLM response object, unmodified
    memoized: bool   # True if served from local memo cache
```

**`CacheManifest`** — the server-generated optimization manifest
```python
class CacheManifest(TypedDict, total=False):
    manifest_version: str
    provider: str
    model: str
    cache_train: list[CacheTrainEntry]   # blocks to cache, by hash
    memoization: MemoizationSection
    provider_hints: ProviderHints
```

**`CacheTrainEntry`** — one block in the cache plan
```python
class CacheTrainEntry(TypedDict):
    hash: str        # SHA-256 of the content block
    tokens: int
    priority: float
    position: int
```

**`MemoizationCandidate`** — a request pattern flagged for memoization
```python
class MemoizationCandidate(TypedDict):
    fingerprint: str
    ttl_seconds: int
    block_hashes: list[str]
    estimated_savings_per_hit: float
```

**`MemoStorage`** — abstract base class for custom memo backends
```python
class MemoStorage(ABC):
    def get(self, fingerprint: str) -> Optional[Any]: ...
    def set(self, fingerprint: str, response: Any, ttl_seconds: int) -> None: ...
```

## Logging

The SDK logs to the `genosis` logger at `DEBUG` level. Errors in the optimization layer are also logged there. To see SDK diagnostics:

```python
import logging

logging.basicConfig()  # attach a handler; without one, DEBUG records are not shown
logging.getLogger("genosis").setLevel(logging.DEBUG)
```

If you leave logging unconfigured, Python's default effective level is `WARNING`, so the SDK produces no output unless something goes wrong.
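
If you want SDK diagnostics without changing your application's root logging setup, one option is to attach a dedicated handler to the `genosis` logger. The helper name `enable_genosis_debug` and the format string are illustrative choices, not part of the SDK.

```python
import logging
import sys


def enable_genosis_debug(stream=sys.stderr) -> logging.Handler:
    """Send DEBUG output from the "genosis" logger to a dedicated stream handler."""
    logger = logging.getLogger("genosis")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    # Avoid duplicate lines if the root logger also has handlers.
    logger.propagate = False
    return handler
```

Keeping the returned handler lets you undo the change later with `logging.getLogger("genosis").removeHandler(handler)`.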

## License

Apache 2.0 — see [LICENSE](./LICENSE) and [NOTICE](./NOTICE).

Patent pending. All patent inquiries: legal@usegenosis.ai
