Metadata-Version: 2.1
Name: dataliner
Version: 1.2.0
Summary: A data processing package for data preprocessing and feature engineering.
Home-page: https://github.com/shallowdf20/dataliner
Author: Shallowdf
Author-email: shallowdf20@gmail.com
License: UNKNOWN
Project-URL: Bug Reports, https://github.com/shallowdf20/dataliner/issues
Project-URL: Source, https://github.com/shallowdf20/dataliner
Keywords: data processing machine learning preprocess feature engineering
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.5
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scikit-learn

## DataLiner - Data processing package for Python 
A data processing package for data preprocessing and feature engineering.<br>
Please feel free to send pull requests for bug fixes, improvements, or new preprocessing methods!

## Installation
```
pip install dataliner
```

## Documentation
https://shallowdf20.github.io/dataliner/preprocessing.html

## Quick Start
This example uses the training data (`train.csv`) from the Kaggle Titanic competition: https://www.kaggle.com/c/titanic/data

```python
import pandas as pd
from sklearn.pipeline import make_pipeline
import dataliner as dl

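# Load the Titanic training data and separate the features from the target
df = pd.read_csv('train.csv')
target_col = 'Survived'
X = df.drop(target_col, axis=1)
y = df[target_col]

# Chain the preprocessing steps; each one runs on the output of the previous step
process = make_pipeline(
    dl.DropNoVariance(),        # drop columns with no variance
    dl.DropHighCardinality(),   # drop high-cardinality columns
    dl.BinarizeNaN(),           # flag missing values with binary columns
    dl.ImputeNaN(),             # impute missing values
    dl.TargetMeanEncoding(),    # encode categories by target mean
    dl.DropHighCorrelation(),   # drop highly correlated features
    dl.StandardScaling(),       # standardize features
    dl.DropLowAUC(),            # drop features with low AUC
)

# Fit every step on the training data and return the transformed data
process.fit_transform(X, y)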
```
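To illustrate how a single step in such a pipeline behaves, here is a minimal sketch of a pandas-aware transformer that drops constant columns, built on scikit-learn's `BaseEstimator` and `TransformerMixin`. The class `DropConstantColumns` and the toy DataFrames are hypothetical and are not part of DataLiner; the library's own `DropNoVariance` may be implemented differently. The sketch only shows that the columns to drop are chosen during `fit` and that the same choice is reapplied by `transform` to unseen data.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import make_pipeline


class DropConstantColumns(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in (not DataLiner code): drops single-valued columns."""

    def fit(self, X, y=None):
        # Remember which columns hold only one distinct value in the training data.
        self.drop_columns_ = [
            col for col in X.columns if X[col].nunique(dropna=False) <= 1
        ]
        return self

    def transform(self, X):
        # Reapply the decision made during fit, even if the new data varies.
        return X.drop(columns=self.drop_columns_)


# Toy data standing in for a train/test split (assumed values, not Titanic data).
train = pd.DataFrame({"Pclass": [1, 3, 2], "Const": [0, 0, 0]})
test = pd.DataFrame({"Pclass": [2, 1], "Const": [0, 5]})
y_train = pd.Series([1, 0, 1])

process = make_pipeline(DropConstantColumns())
train_out = process.fit_transform(train, y_train)  # learns that "Const" is constant
test_out = process.transform(test)                 # drops "Const" here as well

print(list(train_out.columns))
print(list(test_out.columns))
```

Because the Quick Start pipeline is a regular scikit-learn `Pipeline`, the same pattern should apply there: fit on the training frame once, then call `process.transform` on the test frame so that both go through identical preprocessing.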


