Metadata-Version: 2.4
Name: kafkagraph
Version: 0.1.4
Summary: Licensed Kafka to Neo4j Graph Ingestion Engine for turn-key graph building
Author: Siddhappa Birajdar
License-Expression: LicenseRef-Proprietary
Project-URL: Homepage, https://www.linkedin.com/in/siddhappabirajdar
Project-URL: Source, https://github.com/SIDDHAPPA/kafkagraph
Project-URL: Documentation, https://pypi.org/project/kafkagraph
Keywords: kafka,neo4j,graph,ingestion,etl,data-engineering,python-sdk
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: kafka-python
Requires-Dist: neo4j
Requires-Dist: pyyaml
Dynamic: license-file

# KafkaGraph

Licensed Kafka to Neo4j ingestion SDK.

## Overview

KafkaGraph is a production-grade SDK that ingests events from Kafka and materializes them into a Neo4j graph. It authenticates with either a signed license file or a single API key, supports pluggable mapping modes, and is suitable for embedding into customer systems.

## Problem It Solves

- Converts high-volume Kafka event streams into navigable Neo4j graph structures without bespoke pipelines.
- Provides a consistent ingestion engine with batching, error safety, and transactional commits.
- Enforces enterprise-grade licensing, rate limits, and partition caps to align with commercial agreements.
- Offers simple-to-extend mapping modes to transform JSON events to nodes and relationships rapidly.

## Features

- License enforcement with signed license files and machine fingerprint binding.
- API key mode for testing with default deterministic keys.
- Feature gating: `simple`, `autograph`, `sequence` mappers.
- Batching for efficient writes and offset commits.
- Partition monitoring to enforce per-topic and total caps.
- Extensible dispatcher for new mapping modes and features.

## Install

From the root of a source checkout:

```bash
pip install .
```

## Usage

```python
from kafkagraph import KafkaGraph

kg = KafkaGraph(
    license_file="license.json",
    kafka_config={"brokers": ["localhost:9092"], "group_id": "kafkagraph"},
    neo4j_config={"uri": "bolt://localhost:7687", "user": "neo4j", "password": "pass"},
    topics_config_path="topics.yaml",
    batch_size=500
)

kg.start()
```

## Topics Configuration

Provide a `topics.yaml` that declares how events should be mapped:

```yaml
orders:
  mode: simple
  nodes:
    order:
      label: Order
      id: orderId
    customer:
      label: Customer
      id: customerId
  relationships:
    - type: PLACED_BY
      from: order
      to: customer
      properties: [createdAt, source]

profiles:
  mode: sequence
  base:
    label: User
    id: userId
  sequences:
    - field: devices
      label: Device
      id_field: deviceId
      type: OWNS
      properties: [model, os]
```
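
To make the `simple` mode concrete, here is a minimal sketch of how the `orders` entry above might turn one JSON event into node and relationship records. The function `map_simple`, the `ORDERS_CFG` dictionary, and the shape of the returned records are illustrative assumptions, not the SDK's actual internals.

```python
from typing import Any, Dict

# Hypothetical in-memory form of the `orders` entry from topics.yaml.
ORDERS_CFG: Dict[str, Any] = {
    "mode": "simple",
    "nodes": {
        "order": {"label": "Order", "id": "orderId"},
        "customer": {"label": "Customer", "id": "customerId"},
    },
    "relationships": [
        {"type": "PLACED_BY", "from": "order", "to": "customer",
         "properties": ["createdAt", "source"]},
    ],
}


def map_simple(event: Dict[str, Any], cfg: Dict[str, Any]) -> Dict[str, list]:
    """Assumed interpretation: each node's `id` names the event field holding its key."""
    nodes: Dict[str, Dict[str, Any]] = {}
    for alias, spec in cfg["nodes"].items():
        node_id = event.get(spec["id"])
        if node_id is not None:  # skip nodes whose id field is absent from the event
            nodes[alias] = {"label": spec["label"], "id": node_id}

    relationships = []
    for rel in cfg.get("relationships", []):
        src, dst = nodes.get(rel["from"]), nodes.get(rel["to"])
        if src is None or dst is None:
            continue  # both endpoints are needed to build a relationship
        # Copy only the listed properties that are present in the event.
        props = {k: event[k] for k in rel.get("properties", []) if k in event}
        relationships.append(
            {"type": rel["type"], "from": src, "to": dst, "properties": props}
        )
    return {"nodes": list(nodes.values()), "relationships": relationships}
```

Under these assumptions, an event such as `{"orderId": "o-1", "customerId": "c-9", "createdAt": "2024-01-01"}` would yield an `Order` node, a `Customer` node, and one `PLACED_BY` relationship carrying `createdAt`.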

## API Key Authentication

Set a single API key through the `KAFKAGRAPH_API_KEY` environment variable:
```bash
export KAFKAGRAPH_API_KEY="your_api_key"
```

Generate a key via CLI:
```bash
kafkagraph-keygen              # prints a new key
kafkagraph-keygen ./apikey.txt # writes key to file and prints it
```

Use with API key:
```python
from kafkagraph import KafkaGraph

kg = KafkaGraph(
    api_key="your_api_key",  # must match KAFKAGRAPH_API_KEY
    kafka_config={"brokers": ["localhost:9092"], "group_id": "kafkagraph"},
    neo4j_config={"uri": "bolt://localhost:7687", "user": "neo4j", "password": "pass"},
    topics_config_path="topics.yaml",
    batch_size=500
)

kg.start()
```

If the environment variable is not set or the provided key does not match, an error is raised.
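
The check described above could look roughly like the following sketch. The function name `check_api_key` and the specific exception types are assumptions made for illustration; the SDK may raise different errors.

```python
import hmac
import os


def check_api_key(provided: str, env_var: str = "KAFKAGRAPH_API_KEY") -> None:
    """Raise if the environment variable is unset or the keys differ (hypothetical helper)."""
    expected = os.environ.get(env_var)
    if not expected:
        raise RuntimeError(f"{env_var} is not set")
    # Constant-time comparison avoids leaking how many leading characters matched.
    if not hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8")):
        raise PermissionError("provided API key does not match the configured key")
```

Comparing with `hmac.compare_digest` rather than `==` is a common precaution for secrets, since it does not return early on the first mismatching byte.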

## Neo4j Write Semantics

- Nodes are merged by `label` and `id`.
- Relationships are merged by `type` and endpoints, then properties are set from each event batch.
- Each time `batch_size` events have accumulated, the batch is written to Neo4j and the consumer offsets are then committed.
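
The semantics above can be sketched as Cypher `MERGE` statements plus a size-based batcher that writes before it commits. Everything here is an illustrative assumption: the helper names, the `id` property name, the row shape (`from_id`, `to_id`, `properties`), and the `write`/`commit` callbacks are not taken from the SDK.

```python
import re
from typing import Any, Callable, List

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _safe(name: str) -> str:
    # Labels and relationship types cannot be passed as Cypher parameters,
    # so they are validated before being interpolated into the query text.
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def node_merge_query(label: str) -> str:
    """Merge one node per row, keyed by label and id."""
    return f"UNWIND $rows AS row MERGE (n:{_safe(label)} {{id: row.id}})"


def rel_merge_query(rel_type: str, from_label: str, to_label: str) -> str:
    """Merge a relationship by type and endpoints, then set its properties."""
    return (
        "UNWIND $rows AS row "
        f"MATCH (a:{_safe(from_label)} {{id: row.from_id}}) "
        f"MATCH (b:{_safe(to_label)} {{id: row.to_id}}) "
        f"MERGE (a)-[r:{_safe(rel_type)}]->(b) "
        "SET r += row.properties"
    )


class SizeBatcher:
    """Buffer records and flush when `batch_size` is reached (hypothetical class)."""

    def __init__(self, batch_size: int,
                 write: Callable[[List[Any]], None],
                 commit: Callable[[], None]) -> None:
        self.batch_size = batch_size
        self._write = write
        self._commit = commit
        self._buf: List[Any] = []

    def add(self, record: Any) -> None:
        self._buf.append(record)
        if len(self._buf) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self._buf:
            return
        # Write first; if it raises, offsets are not committed and the buffer is kept.
        self._write(list(self._buf))
        self._commit()
        self._buf.clear()
```

Committing offsets only after a successful write gives at-least-once delivery; because the writes use `MERGE`, replaying a batch after a failure should not create duplicate nodes or relationships.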

## Extensibility

- Add new mapping modes under `kafkagraph/mappers/` and extend `core/dispatcher.py`.
- Adjust license limits and feature flags per enterprise tier in license manager classes.
- Swap batching strategy (size/time) by modifying `core/batcher.py`.
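
As one possible shape for such an extension point, the sketch below uses a registry that maps a mode name to a mapper function. The `register` decorator, the `dispatch` function, and the `passthrough` mode are hypothetical and only illustrate the pattern; the real `core/dispatcher.py` may be organized differently.

```python
from typing import Any, Callable, Dict

Mapper = Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]

# Hypothetical registry: mode name -> mapper function.
_MAPPERS: Dict[str, Mapper] = {}


def register(mode: str) -> Callable[[Mapper], Mapper]:
    """Decorator that makes a mapper available under the given mode name."""
    def decorator(fn: Mapper) -> Mapper:
        _MAPPERS[mode] = fn
        return fn
    return decorator


def dispatch(event: Dict[str, Any], topic_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Route an event to the mapper named by the topic's `mode`."""
    mode = topic_cfg.get("mode")
    mapper = _MAPPERS.get(mode)
    if mapper is None:
        raise ValueError(f"unknown mapping mode: {mode!r}")
    return mapper(event, topic_cfg)


@register("passthrough")  # illustrative mode, not one of the shipped mappers
def passthrough(event: Dict[str, Any], topic_cfg: Dict[str, Any]) -> Dict[str, Any]:
    node = {"label": topic_cfg["label"], "id": event[topic_cfg["id"]]}
    return {"nodes": [node], "relationships": []}
```

With a registry like this, adding a mode means writing one function and decorating it, without editing the dispatch logic itself.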

## Security

- Do not hardcode private keys. `PUBLIC_KEY_B64` must be provided securely in deployments.
- API keys can be loaded via environment or file; rotate keys as needed.
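
A minimal sketch of loading the key from the environment or from a file, so that it never appears in source code. The helper name `load_api_key` and the precedence (environment first, then file) are assumptions; check the SDK's behavior before relying on this order.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(path: Optional[str] = None,
                 env_var: str = "KAFKAGRAPH_API_KEY") -> str:
    """Return the API key from the environment, falling back to a file (hypothetical helper)."""
    key = os.environ.get(env_var, "").strip()
    if not key and path:
        # For example, a file written by `kafkagraph-keygen ./apikey.txt`.
        key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise RuntimeError(f"no API key found in {env_var} or key file")
    return key
```

Keeping the key in a file or secret store that the process reads at startup also makes rotation simpler: replace the stored value and restart the consumer.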

## Author

Siddhappa Birajdar  
Software Developer @FirstCry.com | Ex-Founder & CEO @Punarvspace Inc | Entrepreneurship | Python, Django, Flask, FastAPI, AI/ML, Generative AI, Quantum Computing, Blockchain | AWS Certified Solution Architect Associate  
LinkedIn: https://www.linkedin.com/in/siddhappabirajdar
