Metadata-Version: 2.4
Name: veculo
Version: 0.2.5
Summary: Python SDK for Veculo — AI-native multi-modal graph+vector database
Project-URL: Homepage, https://veculo.com
Project-URL: Documentation, https://docs.veculo.com
Project-URL: Repository, https://github.com/sentrius/veculo-python
Project-URL: Issues, https://github.com/sentrius/veculo-python/issues
Author-email: Sentrius LLC <support@sentrius.ai>
License-Expression: Apache-2.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: httpx>=0.24
Requires-Dist: pydantic>=2.0
Provides-Extra: all
Requires-Dist: google-cloud-aiplatform>=1.38; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Requires-Dist: sentence-transformers>=2.2; extra == 'all'
Provides-Extra: local
Requires-Dist: sentence-transformers>=2.2; extra == 'local'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: vertexai
Requires-Dist: google-cloud-aiplatform>=1.38; extra == 'vertexai'
Description-Content-Type: text/markdown

<!--

    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

      https://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.

-->
# Veculo Python SDK

Python client for [Veculo](https://veculo.com) — a managed graph+vector database built on Apache Accumulo.

## Installation

```bash
pip install veculo
```

## Quick Start

### With auto-generated embeddings (easiest)

```python
from veculo import VeculoClient

client = VeculoClient(api_key="vk-...", cluster_id="cl-a7f3b2")

# Insert vertices — Veculo generates embeddings from text automatically
client.put_vertex_with_text(
    id="doc-1",
    text="Q1 revenue exceeded expectations with 40% YoY growth driven by enterprise expansion",
    label="document",
    properties={"author": "Alice", "quarter": "Q1"},
    embed_server_side=True,
)

client.put_vertex_with_text(
    id="doc-2",
    text="Project Plan for Q2 focuses on APAC market entry and partner channel development",
    label="document",
    properties={"author": "Bob", "quarter": "Q2"},
    embed_server_side=True,
)

# Create edges
client.put_edge(source="doc-1", target="doc-2", edge_type="references")

# Ask questions in natural language — answers grounded in your graph
answer = client.rag_query(
    question="What drove Q1 growth and what's planned for Q2?",
    context_hops=2,
)
print(answer["answer"])    # LLM-synthesized answer with citations
print(answer["sources"])   # ["doc-1", "doc-2"]
```

### With your own embeddings

```python
from veculo import VeculoClient

client = VeculoClient(api_key="vk-...", cluster_id="cl-a7f3b2")

# Insert vertices with pre-computed embedding vectors
client.put_vertex(
    id="doc-1",
    label="document",
    properties={"title": "Quarterly Report", "author": "Alice"},
    embedding=[0.12, 0.45, 0.78, 0.33, 0.21, 0.56, 0.89, 0.12],
    visibility="INTERNAL",
)

client.put_vertex(
    id="doc-2",
    label="document",
    properties={"title": "Project Plan", "author": "Bob"},
    embedding=[0.11, 0.44, 0.80, 0.31, 0.19, 0.58, 0.87, 0.14],
)

# Create edges
client.put_edge(
    source="doc-1",
    target="doc-2",
    edge_type="references",
    properties={"section": "appendix"},
)

# Hybrid query: vector similarity + graph traversal
results = client.query(
    embedding=[0.12, 0.44, 0.79, 0.32, 0.15, 0.67, 0.23, 0.91],
    top_k=5,
    edge_type="references",
    depth=2,
    authorizations="INTERNAL",
)

for match in results["results"]:
    print(f"{match['vertex_id']}: {match['score']:.3f}")
```

## Environment Variables

Instead of passing credentials to the constructor, you can set these environment variables:

| Variable | Description |
|---|---|
| `VECULO_API_KEY` | API key for authentication |
| `VECULO_ENDPOINT` | API endpoint (default: `https://api.veculo.com`) |
| `VECULO_CLUSTER_ID` | Target cluster ID |

```python
from veculo import VeculoClient

# With the environment variables set, no arguments are needed:
client = VeculoClient()
```
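
To make the fallback order concrete, here is a minimal sketch of how such settings could be resolved. It is not the SDK's actual code: `resolve_settings` is a hypothetical helper, and both the precedence (explicit argument, then environment variable, then default) and the choice to treat the API key and cluster ID as required are assumptions.

```python
import os

DEFAULT_ENDPOINT = "https://api.veculo.com"  # default listed in the table above


def resolve_settings(api_key=None, endpoint=None, cluster_id=None, env=None):
    """Hypothetical helper: explicit argument first, then env var, then default."""
    env = os.environ if env is None else env
    settings = {
        "api_key": api_key or env.get("VECULO_API_KEY"),
        "endpoint": endpoint or env.get("VECULO_ENDPOINT") or DEFAULT_ENDPOINT,
        "cluster_id": cluster_id or env.get("VECULO_CLUSTER_ID"),
    }
    # Assumption: the API key and cluster ID have no usable default.
    missing = [name for name in ("api_key", "cluster_id") if not settings[name]]
    if missing:
        raise ValueError(f"missing Veculo settings: {', '.join(missing)}")
    return settings
```

Passing a plain dictionary as `env` makes the lookup easy to exercise in tests without touching the real process environment.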

## CLI

The SDK includes a command-line interface:

```bash
# Save connection configuration
veculo connect --endpoint https://api.veculo.com --api-key vk-... --cluster-id cl-a7f3b2

# Check cluster status
veculo status

# Insert a vertex
veculo put-vertex --id alice --label person --property name=Alice --property role=engineer

# Retrieve a vertex
veculo get-vertex --id alice

# Create an edge
veculo put-edge --source alice --target bob --type knows

# Run a hybrid query
veculo query --embedding "0.1,0.2,0.3,0.4" --top-k 10
```

Configuration is stored in `~/.veculo/config.json`.

## Error Handling

```python
from veculo import VeculoClient, VeculoError, NotFoundError, AuthenticationError

client = VeculoClient(api_key="vk-...", cluster_id="cl-a7f3b2")

try:
    vertex = client.get_vertex(id="nonexistent")
except NotFoundError:
    print("Vertex does not exist")
except AuthenticationError:
    print("Invalid or expired API key")
except VeculoError as e:
    print(f"API error {e.status_code}: {e.message}")
```
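
For transient failures, a generic retry wrapper with exponential backoff can sit around any client call. The sketch below is independent of the SDK: `call_with_retries` is a hypothetical helper, and which exceptions are worth retrying is an assumption you would need to check against the errors your cluster actually returns.

```python
import random
import time


def call_with_retries(fn, retry_on=(Exception,), attempts=4, base_delay=0.5,
                      sleep=time.sleep):
    """Call fn(); on a retryable exception, back off exponentially and try again."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            # Delays grow 0.5s, 1s, 2s, ... plus up to 10% jitter so that
            # many clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call such as `call_with_retries(lambda: client.get_vertex(id="doc-1"), retry_on=(ConnectionError,))` would retry only connection failures. Errors like `NotFoundError` or `AuthenticationError` will not succeed on a second attempt, so they are best left out of `retry_on`.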

## Visibility Labels

Veculo supports Accumulo-style cell-level security via visibility expressions:

```python
# Write with visibility
client.put_vertex(
    id="doc:internal-report",
    label="document",
    properties={"title": "Q1 Revenue Analysis"},
    visibility="finance&internal",
)

# Read with authorizations
vertex = client.get_vertex(
    id="doc:internal-report",
    authorizations="finance,internal",
)
```
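
In Accumulo-style expressions, `&` requires every listed label, `|` requires at least one, and parentheses group sub-expressions. The following is a rough client-side sketch of those semantics for reasoning about which authorizations can read a cell. It is not Veculo's server-side check; `is_visible` is a hypothetical helper, and the accepted label characters are an assumption.

```python
import re

_TOKEN = re.compile(r"\s*([A-Za-z0-9_\-:./]+|[&|()])")


def _tokenize(expression):
    tokens, pos = [], 0
    expression = expression.strip()
    while pos < len(expression):
        match = _TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"bad character at position {pos}: {expression[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expression, authorizations):
    """Return True if the authorization set satisfies the visibility expression."""
    auths = set(authorizations)
    tokens = _tokenize(expression)
    if not tokens:
        return True  # an empty expression is readable by everyone
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in {"&", "|", ")"}:
            raise ValueError(f"unexpected {token!r}")
        return token in auths  # a bare label is true when the reader holds it

    def parse_expr():
        nonlocal pos
        value = parse_term()
        operator = None
        while pos < len(tokens) and tokens[pos] in {"&", "|"}:
            if operator is None:
                operator = tokens[pos]
            elif tokens[pos] != operator:
                raise ValueError("mixing & and | requires parentheses")
            pos += 1
            right = parse_term()
            value = (value and right) if operator == "&" else (value or right)
        return value

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected {tokens[pos]!r}")
    return result
```

With the example above, `is_visible("finance&internal", "finance,internal".split(","))` holds, while a reader with only `finance` would not see the vertex.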

## Embeddings

Veculo supports multiple ways to generate vector embeddings:

### Client-side (bring your own API key)

```python
from veculo import VeculoClient
from veculo.embeddings import OpenAIEmbeddings

client = VeculoClient(api_key="vk-...", cluster_id="cl-a7f3b2")
client.set_embedder(OpenAIEmbeddings(api_key="sk-..."))

# Automatically generates embedding from text
client.put_vertex_with_text(
    id="doc:report-q1",
    text="Q1 revenue exceeded expectations with 40% YoY growth",
    label="document",
    properties={"quarter": "Q1", "year": "2026"},
)
```

Other providers:

```python
from veculo.embeddings import VertexAIEmbeddings, SentenceTransformerEmbeddings

# Google Vertex AI
client.set_embedder(VertexAIEmbeddings(project="my-project"))

# Local (no API key needed)
client.set_embedder(SentenceTransformerEmbeddings())
```

Install extras: `pip install 'veculo[openai]'`, `pip install 'veculo[vertexai]'`, or `pip install 'veculo[local]'`. To install all three providers at once, use `pip install 'veculo[all]'`.

### Server-side (Veculo-managed, billed separately)

```python
# Veculo generates the embedding for you via Vertex AI
client.put_vertex_with_text(
    id="doc:report-q1",
    text="Q1 revenue exceeded expectations",
    label="document",
    embed_server_side=True,  # billed per request
)
```

## Multi-Modal Knowledge Graphs

Upload any file — Veculo automatically extracts text, generates embeddings, discovers entities, and builds a knowledge subgraph.

### Supported file types

| Type | What Veculo extracts |
|------|---------------------|
| PDF | Text, citations, entities, embeddings |
| Images | Visual description, objects, entities, embeddings |
| Audio | Transcript, entities, embeddings |
| Video | Audio transcript, entities, embeddings |
| Code | Functions, classes, imports, embeddings |

### Upload a file

```python
# Upload a PDF — Veculo does the rest
client.put_vertex_with_file(
    id="paper:arxiv-2401",
    file_path="attention-is-all-you-need.pdf",
    label="paper",
    properties={"source": "arxiv"},
)

# Upload an image
client.put_vertex_with_file(
    id="img:brain-scan-001",
    file_path="brain-scan.png",
    label="medical-image",
)

# Upload source code
client.put_vertex_with_file(
    id="code:transformer",
    file_path="transformer.py",
    label="code",
)
```

### Check extraction status

```python
jobs = client.list_jobs()
for job in jobs["jobs"]:
    print(f"{job['vertex_id']}: {job['status']}")
```
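
Because extraction runs asynchronously, a script may want to block until every job has finished. The sketch below polls `list_jobs()` and relies only on the response shape shown above. `wait_for_jobs` is a hypothetical helper, and the terminal status strings `"completed"` and `"failed"` are assumptions, so check the values your cluster actually reports.

```python
import time


def wait_for_jobs(client, done_states=("completed", "failed"), poll_interval=5.0,
                  timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll client.list_jobs() until every job reports a terminal status."""
    deadline = clock() + timeout
    while True:
        jobs = client.list_jobs()["jobs"]
        # Assumption: any status outside done_states means "still working".
        pending = [job for job in jobs if job["status"] not in done_states]
        if not pending:
            return jobs
        if clock() >= deadline:
            ids = ", ".join(job["vertex_id"] for job in pending)
            raise TimeoutError(f"jobs still running after {timeout}s: {ids}")
        sleep(poll_interval)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real waiting.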

### CLI

```bash
veculo upload --id paper-1 --file paper.pdf --label paper
veculo jobs
veculo get-vertex --id paper-1
```

## AI-Native Queries

### Natural Language Query

Ask questions in plain English. The SDK translates them into graph queries via an LLM:

```python
result = client.nl_query(
    question="Which documents reference the Q1 report?",
    authorizations="internal",
)

print(result["query_plan"]["explanation"])
for step_result in result["results"]:
    print(step_result)
```

### Graph-Augmented RAG

Retrieval-Augmented Generation that combines vector search with graph context:

```python
answer = client.rag_query(
    question="What were the key findings in the Q1 analysis?",
    context_hops=2,          # expand graph 2 hops for richer context
    model="claude-sonnet-4-20250514",  # optional model override
    top_k=10,
)

print(answer["answer"])
print("Sources:", answer["sources"])  # vertex IDs cited
```
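
To illustrate what `context_hops` controls, the sketch below expands a set of seed vertices (for example, the top vector matches) by a bounded breadth-first search. This is only a local model of the idea, not the server's retrieval code: `expand_context` is a hypothetical function, and treating edges as undirected is an assumption.

```python
from collections import deque


def expand_context(seeds, edges, hops):
    """Map every vertex within `hops` edges of a seed to its hop distance."""
    neighbors = {}
    for source, target in edges:
        # Assumption: context gathering follows edges in both directions.
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        vertex = queue.popleft()
        if distance[vertex] == hops:
            continue  # do not expand past the hop limit
        for neighbor in neighbors.get(vertex, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)
    return distance
```

Larger hop counts pull in more surrounding vertices, which gives the model more context at the price of a longer prompt.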

## Bulk Operations

Insert many vertices or edges in a single batch:

```python
client.put_vertices_bulk([
    {"id": "doc:1", "label": "document", "properties": {"title": "Report A"}},
    {"id": "doc:2", "label": "document", "properties": {"title": "Report B"}},
    {"id": "doc:3", "label": "document", "properties": {"title": "Report C"}},
])

client.put_edges_bulk([
    {"source": "doc:1", "target": "doc:2", "edge_type": "references"},
    {"source": "doc:2", "target": "doc:3", "edge_type": "references"},
])
```
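
For very large loads it can help to split the input into fixed-size batches instead of sending one huge request. The sketch below wraps `put_vertices_bulk` as shown above. `chunked` and `put_vertices_in_batches` are hypothetical helpers, and the default batch size of 500 is an arbitrary assumption, not a documented limit.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_vertices_in_batches(client, vertices, batch_size=500):
    """Send vertices through client.put_vertices_bulk in fixed-size batches."""
    sent = 0
    for batch in chunked(vertices, batch_size):
        client.put_vertices_bulk(batch)
        sent += len(batch)
    return sent  # total number of vertices submitted
```

The same pattern applies to `put_edges_bulk`.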

## Hibernate / Resume

Stop compute costs while preserving all data in GCS:

```python
# Hibernate — flushes tables, snapshots metadata, tears down compute
client.hibernate()
# GCS storage continues at ~$0.02/GB/month, compute costs stop immediately

# Later — resume with all data intact
client.resume()
```

Data, tablet metadata, embeddings, and edges are all preserved. Only compute is stopped.
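
As a rough illustration of what a hibernated cluster costs, the arithmetic is simply stored gigabytes times months times the storage rate. The sketch uses the approximate $0.02/GB/month figure quoted above; `hibernation_storage_cost` is a hypothetical helper, and actual billing may differ.

```python
STORAGE_RATE_USD_PER_GB_MONTH = 0.02  # approximate figure quoted above


def hibernation_storage_cost(stored_gb, months, rate=STORAGE_RATE_USD_PER_GB_MONTH):
    """Estimate the storage bill, in USD, for a cluster left hibernated."""
    if stored_gb < 0 or months < 0:
        raise ValueError("stored_gb and months must be non-negative")
    return round(stored_gb * months * rate, 2)
```

For example, 500 GB hibernated for three months comes to about 500 × 3 × 0.02 = $30 in storage.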

## Configuration

### Auto-Embed

Enable automatic embedding generation for new text vertices:

```python
client.configure_auto_embed(
    model="text-embedding-005",
    provider="vertex-ai",
    text_properties=["description", "content"],
)
```

### Semantic Edges

Enable automatic creation of similarity edges during compaction:

```python
client.configure_semantic_edges(
    similarity_threshold=0.85,
    max_edges_per_vertex=10,
)
```
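
The two parameters can be read as a filter and a cap: only pairs at or above `similarity_threshold` become edges, and each vertex keeps at most `max_edges_per_vertex` of its best matches. The sketch below models that locally. It is not the server's compaction logic, `semantic_edges` is a hypothetical function, and the use of cosine similarity is an assumption.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def semantic_edges(embeddings, similarity_threshold=0.85, max_edges_per_vertex=10):
    """Return (source, target, score) edges to each vertex's most similar neighbors."""
    edges = []
    for source, vector in embeddings.items():
        scored = [
            (cosine(vector, other), target)
            for target, other in embeddings.items()
            if target != source
        ]
        # Keep neighbors at or above the threshold, best first, capped per vertex.
        keep = sorted((s for s in scored if s[0] >= similarity_threshold), reverse=True)
        edges.extend((source, target, score) for score, target in keep[:max_edges_per_vertex])
    return edges
```

Raising the threshold yields a sparser graph of only very close matches, while the per-vertex cap keeps dense clusters from producing a flood of edges.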

## Insights

Query AI-derived analytics:

```python
# Anomalous vertices (outliers by embedding distance)
anomalies = client.get_anomalies(authorizations="internal")

# Top vertices by PageRank
ranks = client.get_top_ranked()

# Pending processing queue status
status = client.get_processing_status()
print(f"Embeddings pending: {status['auto_embed']}")
```

## License

Apache License 2.0
