Metadata-Version: 2.1
Name: parquet-tools
Version: 0.2.1
Summary: Easy install parquet-tools
Home-page: https://github.com/ktrueda/parquet-tools
License: MIT
Keywords: parquet-tools,parquet
Author: Kentaro Ueda
Author-email: kentaro.ueda.kentaro@gmail.com
Requires-Python: >=3.8,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Requires-Dist: boto3 (>=1.13.25,<2.0.0)
Requires-Dist: halo (>=0.0.29,<0.0.30)
Requires-Dist: pandas (>=1.0.4,<2.0.0)
Requires-Dist: pyarrow (>=0.17.1,<0.18.0)
Requires-Dist: tabulate (>=0.8.7,<0.9.0)
Requires-Dist: twine (>=3.1.1,<4.0.0)
Requires-Dist: wheel (>=0.34.2,<0.35.0)
Project-URL: Repository, https://github.com/ktrueda/parquet-tools
Description-Content-Type: text/markdown

# parquet-tools

![Run Unittest](https://github.com/ktrueda/parquet-tools/workflows/Run%20Unittest/badge.svg)
![Run CLI test](https://github.com/ktrueda/parquet-tools/workflows/Run%20CLI%20test/badge.svg)

This is a pip-installable [parquet-tools](https://github.com/apache/parquet-mr).
In other words, it is a CLI tool built on [Apache Arrow](https://github.com/apache/arrow) that serves as an easy-to-install alternative to [parquet-tools](https://github.com/apache/parquet-mr).
It can show the content or schema of Parquet files stored on local disk or on Amazon S3.

## Features

- Read Parquet data (local file or file on S3)
- Read Parquet metadata/schema (local file or file on S3)

## Installation

```bash
$ pip install parquet-tools
```

## Usage

```bash
$ parquet-tools --help
usage: parquet-tools [-h] {show,csv,inspect} ...

parquet CLI tools

positional arguments:
  {show,csv,inspect}
    show              Show human readble format. see `show -h`
    csv               Cat csv style. see `csv -h`
    inspect           Inspect parquet file. see `inspect -h`

optional arguments:
  -h, --help          show this help message and exit
```

## Usage Examples

#### Show local parquet file

```bash
$ parquet-tools show test.parquet
+-------+-------+---------+
|   one | two   | three   |
|-------+-------+---------|
|  -1   | foo   | True    |
| nan   | bar   | False   |
|   2.5 | baz   | True    |
+-------+-------+---------+
```

#### Show parquet file on S3

```bash
$ parquet-tools show s3://bucket-name/prefix/*
+-------+-------+---------+
|   one | two   | three   |
|-------+-------+---------|
|  -1   | foo   | True    |
| nan   | bar   | False   |
|   2.5 | baz   | True    |
+-------+-------+---------+
```
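
The `s3://bucket-name/prefix/*` argument combines a bucket name with a key pattern. As a rough illustration of how such a URI could be split and matched against object keys, here is a minimal sketch. The helper `split_s3_uri` and the key list are hypothetical and are not taken from the parquet-tools source; the tool's actual matching rules may differ.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key-pattern' into (bucket, key_pattern).

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the network location; the key pattern is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, pattern = split_s3_uri("s3://bucket-name/prefix/*")

# Made-up object keys standing in for a real bucket listing.
keys = ["prefix/a.parquet", "prefix/b.parquet", "other/c.parquet"]

# Shell-style wildcard matching of each key against the pattern.
matched = [key for key in keys if fnmatch(key, pattern)]
print(bucket, pattern, matched)
```

In an interactive shell it is usually safer to quote the URI (`'s3://bucket-name/prefix/*'`) so that the shell does not try to expand the `*` itself.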

#### Output Parquet as CSV and transform it with [csvq](https://github.com/mithrandie/csvq)

```bash
$ parquet-tools csv s3://bucket-name/test.parquet | csvq "select one, three where three"
+-------+-------+
|  one  | three |
+-------+-------+
| -1.0  | True  |
| 2.5   | True  |
+-------+-------+
```
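
If csvq is not available, a similar filter can be approximated with Python's standard `csv` module. This is a minimal sketch under two assumptions inferred from the sample output above and not verified against the tool: the CSV output starts with a header row naming the columns `one`, `two`, and `three`, and booleans are rendered as `True`/`False`. The function name `filter_rows` is hypothetical.

```python
import csv
import io


def filter_rows(csv_text):
    """Return (one, three) pairs for rows whose 'three' column is 'True'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["one"], row["three"]) for row in reader if row["three"] == "True"]


# Sample text standing in for the output of `parquet-tools csv test.parquet`
# (assumed layout, see the note above).
sample = "one,two,three\n-1.0,foo,True\n,bar,False\n2.5,baz,True\n"

for one, three in filter_rows(sample):
    print(one, three)
```

To use it in a pipeline, the sample string can be replaced with `sys.stdin.read()` and the script placed after the `|`.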

