Metadata-Version: 2.4
Name: datasynth-py
Version: 0.2.1
Summary: Python wrapper for DataSynth synthetic data generation
Author-email: EY ASU RnD <michael.ivertowski@ch.ey.com>
Maintainer-email: EY ASU RnD <michael.ivertowski@ch.ey.com>
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/ey-asu-rnd/SyntheticData
Project-URL: Documentation, https://ey-asu-rnd.github.io/SyntheticData/
Project-URL: Repository, https://github.com/ey-asu-rnd/SyntheticData
Project-URL: Changelog, https://github.com/ey-asu-rnd/SyntheticData/blob/main/CHANGELOG.md
Project-URL: Issues, https://github.com/ey-asu-rnd/SyntheticData/issues
Keywords: synthetic-data,data-generation,testing,machine-learning,financial-data,accounting,journal-entries,fraud-detection
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing :: Mocking
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Provides-Extra: cli
Requires-Dist: PyYAML>=6.0; extra == "cli"
Provides-Extra: memory
Requires-Dist: pandas>=2.0; extra == "memory"
Provides-Extra: streaming
Requires-Dist: websockets>=12.0; extra == "streaming"
Provides-Extra: all
Requires-Dist: PyYAML>=6.0; extra == "all"
Requires-Dist: pandas>=2.0; extra == "all"
Requires-Dist: websockets>=12.0; extra == "all"
Provides-Extra: dev
Requires-Dist: PyYAML>=6.0; extra == "dev"
Requires-Dist: pandas>=2.0; extra == "dev"
Requires-Dist: websockets>=12.0; extra == "dev"
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"

# datasynth-py

Python wrapper for the DataSynth synthetic data generator.

## Installation

### From PyPI

```bash
pip install "datasynth-py[all]"
```

Or install specific extras:

```bash
pip install datasynth-py                # Core only (no dependencies)
pip install "datasynth-py[cli]"         # CLI generation (PyYAML)
pip install "datasynth-py[memory]"      # In-memory tables (pandas)
pip install "datasynth-py[streaming]"   # Streaming (websockets)
pip install "datasynth-py[all]"         # All optional dependencies
```
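To check which extras are usable in the current environment, a small probe along these lines may help. This is a minimal sketch and not part of `datasynth_py`: the helper name `available_extras` is hypothetical, and the extra-to-module mapping is inferred from the dependencies listed above.

```python
from importlib.util import find_spec

# Extra name -> top-level module its dependency provides.
# Inferred from the extras listed above; not read from the package itself.
EXTRA_MODULES = {
    "cli": "yaml",              # PyYAML installs the `yaml` module
    "memory": "pandas",
    "streaming": "websockets",
}


def available_extras(modules=EXTRA_MODULES):
    """Return the sorted names of extras whose backing module is importable."""
    return sorted(
        name for name, module in modules.items() if find_spec(module) is not None
    )


if __name__ == "__main__":
    print("Usable extras:", available_extras())
```

`find_spec` only locates the module and does not import it, so the probe stays cheap even when pandas is installed.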

### From Source

```bash
cd python   # from the repository root
pip install -e ".[all]"
```

## Quick Start

```python
from datasynth_py import DataSynth, CompanyConfig, Config, GlobalSettings, ChartOfAccountsSettings

config = Config(
    global_settings=GlobalSettings(
        industry="retail",
        start_date="2024-01-01",
        period_months=12,
    ),
    companies=[
        CompanyConfig(code="C001", name="Retail Corp", currency="USD", country="US"),
    ],
    chart_of_accounts=ChartOfAccountsSettings(complexity="small"),
)

synth = DataSynth()
result = synth.generate(config=config, output={"format": "csv", "sink": "temp_dir"})
print(result.output_dir)
```
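The Quick Start prints `result.output_dir` but does not show what ends up inside it. One way to inspect a finished run is sketched below. The helper `list_outputs` is hypothetical and uses only the standard library; the directory layout produced by the generator is not described here, so the sketch simply walks the tree and makes no assumption about file names.

```python
from pathlib import Path


def list_outputs(output_dir, suffix=".csv"):
    """Return paths of files ending in `suffix` under `output_dir`.

    Paths are relative to `output_dir`, use forward slashes, and are sorted.
    """
    root = Path(output_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob(f"*{suffix}"))
```

After a run, `list_outputs(result.output_dir)` would list the generated CSV files; pass `suffix=".parquet"` when the output format is Parquet.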

## Using Blueprints

```python
from datasynth_py import DataSynth
from datasynth_py.config import blueprints

config = blueprints.retail_small(companies=4, transactions=10000)
synth = DataSynth()
result = synth.generate(config=config, output={"format": "parquet", "sink": "path", "path": "./output"})
```

## Requirements

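The wrapper shells out to the `datasynth-data` CLI binary, which must be built separately. Build it with Cargo and point the `DATASYNTH_BINARY` environment variable at the result: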

```bash
cargo build --release
export DATASYNTH_BINARY=target/release/datasynth-data
```

Or pass `binary_path` when creating the client:

```python
synth = DataSynth(binary_path="/path/to/datasynth-data")
```
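A missing binary is easier to diagnose when it is detected before generation starts. The sketch below resolves a path up front. The function name `find_datasynth_binary` is hypothetical, and the lookup order (explicit argument, then `DATASYNTH_BINARY`, then `PATH`) is an assumption made for illustration; the wrapper's own resolution logic may differ.

```python
import os
import shutil
from pathlib import Path


def find_datasynth_binary(explicit=None, env=None):
    """Return an absolute path to the datasynth-data binary or raise.

    Assumed lookup order (illustrative only): explicit argument,
    DATASYNTH_BINARY environment variable, then a search of PATH.
    """
    env = os.environ if env is None else env
    candidate = (
        explicit
        or env.get("DATASYNTH_BINARY")
        or shutil.which("datasynth-data")
    )
    if candidate is None:
        raise FileNotFoundError(
            "datasynth-data not found; set DATASYNTH_BINARY or pass a path"
        )
    path = Path(candidate).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"datasynth-data binary not found at {path}")
    return str(path)
```

The result can then be handed to the client as `DataSynth(binary_path=find_datasynth_binary())`. Resolving to an absolute path also avoids surprises when `DATASYNTH_BINARY` holds a relative path such as `target/release/datasynth-data` and the working directory changes.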

## Documentation

See the [Python Wrapper Guide](../docs/src/user-guide/python-wrapper.md) for complete documentation. The hosted documentation is available at https://ey-asu-rnd.github.io/SyntheticData/.

## License

Licensed under Apache-2.0. See the LICENSE file in the main project.
