Metadata-Version: 2.4
Name: pydataio
Version: 1.0.2
Summary: A scalable framework for data input and output operations in Spark applications
Home-page: https://github.com/AmadeusITGroup/PyDataIO
Author: Guillaume LECLERC, Simone DE SANTIS
Author-email: Simone DE SANTIS <simone.desantis@amadeus.com>, Guillaume LECLERC <guillaume.leclerc@amadeus.com>
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: adlfs>=2024.7.0
Requires-Dist: fsspec>=2024.10.0
Requires-Dist: pyyaml==6.0.3
Dynamic: license-file

# PyData I/O

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Spark](https://img.shields.io/badge/Spark-3.5.2-blue)](https://spark.apache.org/releases/spark-release-3-5-2.html)
[![Python](https://img.shields.io/badge/python-3.11-red)](https://www.python.org/)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)][contributing]

PyData I/O is an open-source project that provides a flexible and scalable framework for data input and output operations in Spark applications. It offers tools and abstractions that simplify building data processing pipelines.

## Features

- Easy-to-use API for defining data processors and transformations
- Seamless integration with popular data storage systems and formats
- Support for batch and streaming data processing
- Extensible architecture for custom data processors and pipelines
- Scalable and fault-tolerant processing using Apache Spark
- Able to use models from the Python ML ecosystem (scikit-learn, XGBoost, PyTorch, ...)

## Getting Started
To get started with PyData I/O, please refer to the [documentation][gettingstarted] for installation instructions, usage examples, and API references.
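
Before following the installation instructions, it can help to confirm that the local environment matches the requirements declared in the package metadata (Python 3.11 or later, plus `adlfs`, `fsspec`, and `pyyaml`). The snippet below is a minimal sketch that uses only the Python standard library. The helper names `installed_versions` and `python_is_supported` are hypothetical and are not part of the PyData I/O API. It only reports what is installed and does not check version constraints.

```python
import sys
from importlib import metadata

# Values copied from the package metadata (Requires-Python, Requires-Dist).
MIN_PYTHON = (3, 11)
REQUIRED_DISTS = ("pydataio", "adlfs", "fsspec", "pyyaml")


def installed_versions(names=REQUIRED_DISTS):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running the script prints one line per distribution, so a missing dependency shows up as `not installed` before any Spark job is launched.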

## Issues and Support
If you encounter any issues or require support, please create a new issue on the [GitHub repository][issues].

## Contribution
Contributions to PyData I/O are welcome! To contribute, please follow the guidelines in [our contribution guide][contributing].

## License
This project is licensed under the Apache License 2.0. See the [LICENSE][license] file for more information.

[gettingstarted]: https://amadeusitgroup.github.io/PyDataIO/getting-started.html
[documentation]: https://amadeusitgroup.github.io/PyDataIO/
[contributing]: CONTRIBUTING.md
[codeofconduct]: CODE_OF_CONDUCT.md
[license]: LICENSE
[repository]: https://github.com/AmadeusITGroup/PyDataIO
[issues]: https://github.com/AmadeusITGroup/PyDataIO/issues
