Metadata-Version: 2.4
Name: dataxid
Version: 0.1.0
Summary: The Synthetic Data API — privacy-preserving synthetic data generation
Project-URL: Homepage, https://dataxid.com
Project-URL: Documentation, https://docs.dataxid.com
Project-URL: Repository, https://github.com/dataxid/dataxid-python
Project-URL: Issues, https://github.com/dataxid/dataxid-python/issues
Project-URL: Changelog, https://github.com/dataxid/dataxid-python/blob/main/CHANGELOG.md
Author-email: Dataxid <dev@dataxid.com>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: differential-privacy,machine-learning,privacy,privacy-by-architecture,synthetic-data
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27.0
Requires-Dist: numpy>=2.0.0
Requires-Dist: pandas>=2.2.0
Requires-Dist: pyarrow>=13.0.0
Requires-Dist: torch<2.10.0,>=2.9.0
Description-Content-Type: text/markdown

# Dataxid Python SDK

[![PyPI version](https://img.shields.io/pypi/v/dataxid)](https://pypi.org/project/dataxid/)
[![Python versions](https://img.shields.io/pypi/pyversions/dataxid)](https://pypi.org/project/dataxid/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue)](https://github.com/dataxid/dataxid-python/blob/main/LICENSE)

Dataxid generates privacy-preserving synthetic data and is built on a privacy-by-architecture principle: your raw data never leaves your machine, and only abstract embeddings are shared with the API.

## Installation

```bash
pip install dataxid
```

## Quick Start

```python
import dataxid
import pandas as pd

dataxid.api_key = "dx_..."

df = pd.read_csv("data.csv")
synthetic = dataxid.synthesize(data=df, n_samples=1000)
```

## Full Control

```python
import dataxid
import pandas as pd

dataxid.api_key = "dx_..."

df = pd.read_csv("data.csv")

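# Create a model from the local DataFrame
model = dataxid.Model.create(data=df)
# Generate 1000 synthetic rows from that model
synthetic = model.generate(n_samples=1000)
# Delete the model once it is no longer needed
model.delete()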
```

## Error Handling

```python
import dataxid
import pandas as pd

dataxid.api_key = "dx_..."
df = pd.read_csv("data.csv")

try:
    synthetic = dataxid.synthesize(data=df)
except dataxid.AuthenticationError:
    print("Invalid API key")
except dataxid.QuotaExceededError as e:
    print(f"Quota exceeded. Upgrade: {e.upgrade_url}")
except dataxid.RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except dataxid.DataxidError as e:
    print(f"Error: {e}")
```
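
Rate-limit errors are often transient, so a small retry wrapper can be useful. The following is a minimal sketch rather than part of the SDK: `call_with_retry` and its parameters are hypothetical names, and it assumes only that the exception may carry a `retry_after` attribute in seconds, as in the example above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: type[Exception],
    max_attempts: int = 3,
    default_wait: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying when it raises `retry_on`.

    Hypothetical helper, not part of the dataxid SDK. Waits for the
    exception's `retry_after` attribute (seconds) if it is set,
    otherwise for `default_wait`.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            wait = getattr(exc, "retry_after", None)
            sleep(default_wait if wait is None else float(wait))
    # Only reached when the loop body never ran.
    raise ValueError("max_attempts must be at least 1")
```

With the SDK, this could be called as `call_with_retry(lambda: dataxid.synthesize(data=df), dataxid.RateLimitError)`. The `sleep` parameter is injectable so the helper can be tested without real delays.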

## How It Works

Dataxid is built on a **privacy-by-architecture** principle. Data encoding and decoding happen entirely on your machine; only abstract embeddings are shared with the API for model training. Raw data never leaves your environment.
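
To make the idea concrete, here is a toy illustration of local encoding and decoding. It is not Dataxid's encoder: the random embedding table, the `payload` variable, and the example column are all assumptions made for the sketch. It only shows how numeric vectors can be derived locally while the raw values and the lookup needed to decode them stay on the machine.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Raw data: stays on the local machine in this sketch.
df = pd.DataFrame({"city": ["Paris", "Oslo", "Paris", "Lima"]})

# Local encoding: map each category to an integer code.
# `vocab` is kept locally and is what makes decoding possible.
codes, vocab = pd.factorize(df["city"])

# Toy embedding table (random here; a real encoder would be learned).
embedding_dim = 4
table = rng.normal(size=(len(vocab), embedding_dim))

# The only thing a client would send in this sketch: float vectors,
# with no raw strings and no vocabulary.
payload = table[codes]

# Local decoding: codes map back to the original values via `vocab`.
decoded = vocab[codes]
```

Without `vocab` and `table`, which never leave the machine in this sketch, the vectors in `payload` carry no direct mapping back to the city names.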

## Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `embedding_dim` | `64` | Embedding size (larger = more expressive) |
| `model_size` | `"medium"` | Model capacity: `"small"`, `"medium"`, `"large"` |
| `max_epochs` | `100` | Maximum training epochs |
| `batch_size` | `256` | Training batch size |
| `privacy_enabled` | `False` | Add noise to embeddings for privacy |
| `privacy_noise` | `0.1` | Noise scale (Gaussian std) |

```python
model = dataxid.Model.create(
    data=df,
    config=dataxid.ModelConfig(
        embedding_dim=128,
        model_size="large",
        max_epochs=50,
    ),
)
```

A plain dict also works for quick experiments:

```python
model = dataxid.Model.create(
    data=df,
    config={"embedding_dim": 128, "max_epochs": 50},
)
```
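
The `privacy_noise` parameter is described as a Gaussian standard deviation. As a rough illustration of what that means, the NumPy sketch below adds zero-mean Gaussian noise to an array of embeddings. The function `add_gaussian_noise` is a hypothetical name, and this is an assumption about the general technique, not the SDK's actual implementation.

```python
import numpy as np


def add_gaussian_noise(embeddings, noise_std=0.1, seed=None):
    """Return a copy of `embeddings` with zero-mean Gaussian noise added.

    Hypothetical helper for illustration; `noise_std` plays the role
    that the table assigns to `privacy_noise`.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=noise_std, size=embeddings.shape)
    return embeddings + noise


# 64 columns matches the default `embedding_dim` from the table.
embeddings = np.zeros((10_000, 64))
noisy = add_gaussian_noise(embeddings, noise_std=0.1, seed=0)

# The empirical spread of the added noise should be close to noise_std.
print(noisy.std())
```

A larger `noise_std` hides more of each individual embedding but also distorts the signal available for training, so the value is a trade-off between privacy and fidelity.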

## Links

- [Documentation](https://docs.dataxid.com)
- [API Reference](https://docs.dataxid.com/api)
- [GitHub](https://github.com/dataxid/dataxid-python)
- [Examples](https://github.com/dataxid/dataxid-python/blob/main/examples/quickstart.py)
