Metadata-Version: 2.1
Name: deltacat
Version: 0.2.2
Summary: A scalable, fast, ACID-compliant Data Catalog powered by Ray.
Home-page: https://github.com/ray-project/deltacat
Author: Ray Team
License: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: boto3 ~=1.20
Requires-Dist: numpy ==1.21.5
Requires-Dist: pandas ==1.3.5
Requires-Dist: pyarrow ==12.0.1
Requires-Dist: pydantic ==1.10.4
Requires-Dist: ray[default] ~=2.0
Requires-Dist: s3fs ==2022.2.0
Requires-Dist: tenacity ==8.1.0
Requires-Dist: typing-extensions ==4.4.0
Requires-Dist: pymemcache ==4.0.0
Requires-Dist: redis ==4.6.0
Requires-Dist: getdaft ==0.1.17
Requires-Dist: schedule ==1.2.0

# DeltaCAT

DeltaCAT is a Pythonic Data Catalog powered by Ray.

Its data storage model lets you define and manage fast, scalable,
ACID-compliant data catalogs through git-like stage/commit APIs. It has been
used to host exabyte-scale enterprise data lakes.
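
To make the git-like stage/commit idea concrete, here is a minimal, purely illustrative sketch. `ToyCatalog` and its `stage`, `commit`, and `read` methods are hypothetical names invented for this example. They are not DeltaCAT's API, and the sketch ignores storage, concurrency, and Ray entirely. It only shows the general pattern: staged changes stay invisible to readers until a commit publishes them.

```python
from typing import Dict, List


class ToyCatalog:
    """Hypothetical in-memory catalog; NOT part of the DeltaCAT API."""

    def __init__(self) -> None:
        self._committed: Dict[str, List[dict]] = {}
        self._staged: Dict[str, List[dict]] = {}

    def stage(self, table: str, rows: List[dict]) -> None:
        # Staged rows are buffered and stay invisible to readers.
        self._staged.setdefault(table, []).extend(rows)

    def commit(self, table: str) -> int:
        # Publish every staged row for the table in a single step.
        rows = self._staged.pop(table, [])
        self._committed[table] = self._committed.get(table, []) + rows
        return len(rows)

    def read(self, table: str) -> List[dict]:
        # Readers only ever see committed rows.
        return list(self._committed.get(table, []))


catalog = ToyCatalog()
catalog.stage("events", [{"id": 1}, {"id": 2}])
before = catalog.read("events")       # nothing has been committed yet
committed = catalog.commit("events")  # number of rows published
after = catalog.read("events")        # both rows are now visible
```

A real catalog would also need durable storage and isolation between concurrent writers, which this toy deliberately leaves out.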

DeltaCAT uses the Ray distributed compute framework together with Apache Arrow
for common table management tasks, including petabyte-scale
change-data-capture, data consistency checks, and table repair.

## Getting Started

### Install

```
pip install deltacat
```
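
One way to confirm that the install succeeded is to look up the installed distribution's version from Python. The sketch below uses only the standard library's `importlib.metadata` and so assumes Python 3.8 or newer. `installed_version` is a helper name made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("deltacat")
print(version if version else "deltacat is not installed in this environment")
```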

### Running Tests

```
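# Create and activate an isolated virtual environment
pip3 install virtualenv
virtualenv test_env
source test_env/bin/activate

# Install the dependencies listed in requirements.txt
pip3 install -r requirements.txt

# Run the test suite
pytest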
```


