Metadata-Version: 2.1
Name: gpam_training
Version: 0.0.7
Summary: Module to facilitate the integration of a sklearn training pipeline into a deploy and retraining system
Home-page: https://github.com/joaorobson/gpam_training
License: UNKNOWN
Author: João Robson
Description-Content-Type: text/x-rst
Classifier: License :: OSI Approved :: Apache Software License
Requires-Dist: numpy==1.17.4
Requires-Dist: pandas==0.24.2
Requires-Dist: scikit-learn==0.22


gpam_training
=============

A module that makes it easier to integrate a scikit-learn training pipeline into a deployment and retraining system.

Install
-------

.. code-block::

   pip install gpam_training

Usage
-----

Multilabel training
^^^^^^^^^^^^^^^^^^^

First, load the training data into memory as a pandas DataFrame.
The source CSV must be in the following format:

.. code-block::

   process_id,page_text_extract,tema
   1,Lorem ipsum dolor sit amet,1
   2,Lorem ipsum dolor sit amet,2
   2,Lorem ipsum dolor sit amet,3
   42,Lorem ipsum dolor sit amet,2
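
Before training, it can help to confirm that the data really has this layout. The snippet below is a minimal sketch that uses only pandas: it loads the sample rows shown above, checks for the three expected columns, and groups the ``tema`` labels by ``process_id`` to show that one process can carry several labels. The names ``SAMPLE_CSV``, ``expected_columns`` and ``labels_per_process`` are illustrative and not part of ``gpam_training``, and the grouping is only for inspection, not a description of how the library handles the data internally.

.. code-block:: python

   import io

   import pandas as pd

   # Sample rows copied from the format description above.
   SAMPLE_CSV = """process_id,page_text_extract,tema
   1,Lorem ipsum dolor sit amet,1
   2,Lorem ipsum dolor sit amet,2
   2,Lorem ipsum dolor sit amet,3
   42,Lorem ipsum dolor sit amet,2
   """

   # Read the CSV text into a DataFrame (a file path works the same way).
   df = pd.read_csv(io.StringIO(SAMPLE_CSV))

   # Fail early if a required column is missing.
   expected_columns = ["process_id", "page_text_extract", "tema"]
   missing = [col for col in expected_columns if col not in df.columns]
   if missing:
       raise ValueError(f"Missing columns: {missing}")

   # Collect every label attached to each process (inspection only).
   labels_per_process = df.groupby("process_id")["tema"].apply(list).to_dict()
   print(labels_per_process)

Process ``2`` appears on two rows with different ``tema`` values, which is what makes the dataset multilabel.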

To train the model, do as shown below:

.. code-block::

   from gpam_training import MultilabelTraining
   import pandas as pd

   df = pd.read_csv('example.csv')
   model = MultilabelTraining(df)
   model.train()

To dump a pickle file with the trained model, do the following:

.. code-block::

   model_pickle = model.get_pickle()
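
This README does not state what ``get_pickle()`` returns. The sketch below assumes it returns the serialized model as ``bytes`` (an assumption, not confirmed here) and shows one way to write that payload to disk and load it back with the standard library. ``stand_in_pickle`` and ``save_pickle_bytes`` are hypothetical names used for illustration; in real use, ``stand_in_pickle`` would be replaced by the value of ``model.get_pickle()``.

.. code-block:: python

   import pickle
   import tempfile
   from pathlib import Path

   # Stand-in for ``model.get_pickle()``; ASSUMED to be a bytes object.
   stand_in_pickle = pickle.dumps({"note": "placeholder for a trained model"})


   def save_pickle_bytes(payload, path):
       """Write already-pickled bytes to ``path`` (hypothetical helper)."""
       Path(path).write_bytes(payload)


   with tempfile.TemporaryDirectory() as tmp_dir:
       model_path = Path(tmp_dir) / "model.pkl"
       save_pickle_bytes(stand_in_pickle, model_path)

       # Load the object back; only unpickle files from a trusted source.
       restored = pickle.loads(model_path.read_bytes())

If ``get_pickle()`` returns something other than bytes, the write step needs to be adapted accordingly.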

