Metadata-Version: 2.1
Name: shapeshifter
Version: 1.1.1
Summary: A tool for managing large datasets
Home-page: https://github.com/srp33/ShapeShifter
Author: Piccolo Lab
Author-email: stephen_piccolo@byu.edu
License: MIT
Platform: UNKNOWN
Requires-Dist: pandas
Requires-Dist: pyarrow
Requires-Dist: numpy (==1.15.4)
Requires-Dist: sqlalchemy
Requires-Dist: xlsxwriter
Requires-Dist: tables
Requires-Dist: xlrd
Requires-Dist: nbformat

# shapeshifter Python Module
The official repository for the shapeshifter Python module, which allows for:
* Transforming tabular data sets from one format to another.
* Querying large data sets to filter out useful data.
* Selecting additional columns/features to include in the resulting data set.
* Merging data sets of various formats into a single file.
* Gzipping resulting data sets, as well as the ability to read gzipped files.

For command-line use, see the [shapeshifter command-line tool](https://github.com/srp33/ShapeShifter-CLI), which combines
the features of shapeshifter with the ease and speed of the command line.

Basic use is described below, but see the full documentation on [Read the Docs](https://shapeshifter.readthedocs.io/en/latest/).  
## Install
`pip3 install shapeshifter`

## Basic Use
After installing, import the ShapeShifter class with `from shapeshifter import ShapeShifter`. A ShapeShifter object
represents the file to be transformed. Call its `export_filter_results` method to perform the transformation. Here is a
simple example in which a file called `input_file.tsv` is transformed into an HDF5 file called `output_file.h5`, while
filtering the data on sex and age:
```python
from shapeshifter import ShapeShifter

my_shapeshifter = ShapeShifter("input_file.tsv")
my_shapeshifter.export_filter_results("output_file.h5", filters="Sex == 'M' and Age > 40")
```
Note that the input and output file types were not stated explicitly; shapeshifter inferred them from the file
extensions provided. If necessary, `input_file_type` and `output_file_type` can be named explicitly.
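
To make the idea of extension-based inference concrete, here is a minimal standard-library sketch. It is not shapeshifter's actual implementation: the `infer_file_type` helper, the `EXTENSION_TO_FORMAT` table, and the specific extensions listed in it are assumptions made for illustration only.

```python
from pathlib import Path

# Hypothetical extension-to-format table, for illustration only.
# The extensions that shapeshifter itself recognizes may differ.
EXTENSION_TO_FORMAT = {
    ".csv": "CSV",
    ".tsv": "TSV",
    ".json": "JSON",
    ".xlsx": "Excel",
    ".h5": "HDF5",
}


def infer_file_type(path, explicit_type=None):
    """Return explicit_type if given; otherwise guess the format from the extension."""
    if explicit_type is not None:
        return explicit_type

    suffixes = [suffix.lower() for suffix in Path(path).suffixes]

    # Ignore a trailing .gz so that "data.tsv.gz" is treated as TSV.
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]

    if not suffixes or suffixes[-1] not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Cannot infer the file type of {path!r}; name it explicitly")

    return EXTENSION_TO_FORMAT[suffixes[-1]]


print(infer_file_type("input_file.tsv"))
print(infer_file_type("output_file.h5"))
print(infer_file_type("input_file.tsv.gz"))
print(infer_file_type("measurements.dat", explicit_type="TSV"))
```

The last call shows why an explicit type is useful: when the extension is ambiguous or nonstandard, inference has nothing reliable to go on, so the caller states the format directly.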


## Contributing
We welcome contributions that help expand shapeshifter to be compatible with additional file formats. If you are 
interested in contributing, please follow the instructions [here](https://github.com/srp33/ShapeShifter/wiki).
## Currently Supported Formats
#### Input Formats:
* CSV
* TSV
* JSON
* Excel
* HDF5
* Parquet
* MsgPack
* Stata
* Pickle
* SQLite
* ARFF
* GCT
* Kallisto
* GEO

#### Output Formats:
* CSV 
* TSV
* JSON
* Excel
* HDF5
* Parquet
* MsgPack
* Stata 
* Pickle
* SQLite 
* ARFF 
* GCT 
* RMarkdown 
* JupyterNotebook

## Future Formats to Support
We are working hard to expand ShapeShifter to work with even more file formats! Expect the following formats to be 
included in future releases:
* Fixed-width files (fwf)
* Genomic Data Commons clinical XML


