Metadata-Version: 2.1
Name: chunkr
Version: 0.1.0
Summary: A library for chunking different types of data files.
Home-page: https://github.com/1b5d/chunkr
License: MIT
Author: 1b5d
Author-email: 8110504+1b5d@users.noreply.github.com
Requires-Python: >=3.7.1,<4.0.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: fsspec (>=2022.7.1,<2023.0.0)
Requires-Dist: pandas (>=1.3.5,<2.0.0)
Requires-Dist: paramiko (>=2.11.0,<3.0.0)
Requires-Dist: pyarrow (>=8.0.0,<9.0.0)
Project-URL: Repository, https://github.com/1b5d/chunkr
Description-Content-Type: text/markdown

# chunkr
[![PyPI version][pypi-image]][pypi-url]
<!-- [![Build status][build-image]][build-url] -->
<!-- [![Code coverage][coverage-image]][coverage-url] -->
[![GitHub stars][stars-image]][stars-url]
[![Supported Python versions][versions-image]][versions-url]


A library for chunking different types of data files.

## Getting started

```bash
pip install chunkr
```

## Usage

To split a CSV file of 1 million records into 10 chunks of 100,000 records each, you can do the following:

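```py
from chunkr import create_chunks_dir
import pandas as pd

with create_chunks_dir(
    'csv',
    'csv_test',
    'path/to/file',  # input file
    'temp/output',   # output location for the chunks
    100_000,         # records per chunk: 1,000,000 / 100,000 = 10 chunks
    None,
    None,
    quote_char='"',
    delimiter=',',
    escape_char='\\',
) as chunks_dir:
    # The chunks are read back as Parquet files; together they hold every record.
    assert 1_000_000 == sum(
        len(pd.read_parquet(file)) for file in chunks_dir.iterdir()
    )
```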
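
For readers who want a feel for what fixed-size chunking involves, here is a minimal standard-library sketch of the general idea. It is not chunkr's implementation: `split_csv`, its parameters, and the `chunk_NNNN.csv` naming are hypothetical, and it writes CSV chunks rather than the Parquet files read back in the example above.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path
from typing import List


def split_csv(source: Path, output_dir: Path, chunk_size: int) -> List[Path]:
    """Hypothetical helper: write `source` as CSV files of at most `chunk_size` records."""
    output_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths: List[Path] = []
    with source.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:  # empty input: nothing to chunk
            return chunk_paths
        while True:
            # Pull up to chunk_size rows; an empty batch means the input is exhausted.
            batch = list(islice(reader, chunk_size))
            if not batch:
                break
            chunk_path = output_dir / f"chunk_{len(chunk_paths):04d}.csv"
            with chunk_path.open("w", newline="") as out:
                writer = csv.writer(out)
                writer.writerow(header)  # repeat the header in every chunk
                writer.writerows(batch)
            chunk_paths.append(chunk_path)
    return chunk_paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "data.csv"
        with source.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", "value"])
            writer.writerows([i, i * 2] for i in range(25))

        chunks = split_csv(source, Path(tmp) / "chunks", chunk_size=10)
        # Count data rows per chunk, excluding the repeated header line.
        print([len(p.read_text().splitlines()) - 1 for p in chunks])
```

Only one batch of rows is held in memory at a time, which is the main reason to chunk a large file instead of loading it whole. The last chunk may be smaller than `chunk_size` when the record count is not an exact multiple.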


<!-- Badges -->

[pypi-image]: https://img.shields.io/pypi/v/chunkr
[pypi-url]: https://pypi.org/project/chunkr/
[build-image]: https://github.com/1b5d/chunkr/actions/workflows/build.yaml/badge.svg
[build-url]: https://github.com/1b5d/chunkr/actions/workflows/build.yaml
[coverage-image]: https://codecov.io/gh/1b5d/chunkr/branch/main/graph/badge.svg
[coverage-url]: https://codecov.io/gh/1b5d/chunkr/
[stars-image]: https://img.shields.io/github/stars/1b5d/chunkr
[stars-url]: https://github.com/1b5d/chunkr
[versions-image]: https://img.shields.io/pypi/pyversions/chunkr
[versions-url]: https://pypi.org/project/chunkr/

