Metadata-Version: 2.1
Name: file-stream
Version: 0.0.3
Summary: Process data as a stream.
Home-page: https://github.com/nutalk/file_stream
Author: Nutalk
Author-email: ht2005112@hotmail.com
License: BSD
Keywords: file
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Requires-Dist: mysql-connector

=========================
Streaming Data Processing
=========================

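Read every file in a directory, read every row from CSV files, run computations on the data, and write the results to CSV files, a database, and other destinations.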

How It Works
=============

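The implementation relies mainly on generators: each class pulls data from its upstream with a ``for`` loop and hands data to its downstream with ``yield``. The components are chained together by overloading the ``|`` (or) operator.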
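
The mechanism described above can be illustrated with a small, self-contained sketch. It does not reproduce the classes of this package: ``Stage``, ``ListSource``, ``Double`` and ``Collect`` are hypothetical names chosen for the illustration, and only the general pattern (generators plus an overloaded ``|``) is taken from the description.

::

    class Stage:
        """Hypothetical base class: a pipeline node with an optional upstream."""

        def __init__(self):
            self.upstream = None

        def __or__(self, other):
            # `a | b` registers a as b's upstream and returns b,
            # so a chain reads from left to right.
            other.upstream = self
            return other

        def __iter__(self):
            raise NotImplementedError


    class ListSource(Stage):
        """Yields items from an in-memory list."""

        def __init__(self, rows):
            super().__init__()
            self.rows = rows

        def __iter__(self):
            for row in self.rows:
                yield row


    class Double(Stage):
        """Pulls values from upstream and yields each one doubled."""

        def __iter__(self):
            for value in self.upstream:
                yield value * 2


    class Collect(Stage):
        """Terminal stage: drains the upstream into a list."""

        def output(self):
            return list(self.upstream)


    pipeline = ListSource([1, 2, 3]) | Double() | Collect()
    result = pipeline.output()
    print(result)  # [2, 4, 6]

Nothing is computed while the chain is being built. Values are pulled through the stages one at a time only when ``output()`` iterates over the upstream, which is what lets a pipeline of this shape handle inputs larger than memory.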

Reference Project
=================

The overall design is based mainly on the python-stream project: https://github.com/sandabuliu/python-stream

Installation
============

::

    pip install file-stream


Usage
========
Write data to a database.

::

    from file_stream.executor.source import Memory
    from file_stream.executor.writer import MysqlWriter

    # MySQL connection settings; fill these in for your own server.
    office_base_config = {
        'host': "",
        'user': "",
        'passwd': '',
        'database': '',
        'charset': '',
    }

    # Rows to write, one dict per row.
    datas = [{'f_cuid': 'id2', 'f_sentence_no': 1, 'f_pos_no': 1, 'f_neg_no': 0, 'f_nu_no': 0},
             {'f_cuid': 'id3', 'f_sentence_no': 3, 'f_pos_no': 2, 'f_neg_no': 1, 'f_nu_no': 0},
             {'f_cuid': 'id1', 'f_sentence_no': 1, 'f_pos_no': 1, 'f_neg_no': 0, 'f_nu_no': 0},
             {'f_cuid': 'id4', 'f_sentence_no': 1, 'f_pos_no': 1, 'f_neg_no': 0, 'f_nu_no': 0}]
    # Use the in-memory list as the pipeline source.
    reader = Memory(datas)
    # Chain the source to a MySQL writer with the | operator.
    p = reader | MysqlWriter(office_base_config, 't_report_info')
    # Start the pipeline so the rows flow from the source to the writer.
    p.output()
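
How ``MysqlWriter`` builds its SQL is not documented here. As a rough illustration only, the sketch below shows one way a dict row like those above could be mapped to a parameterized ``INSERT`` statement. ``build_insert`` is a hypothetical helper, not part of this package; the ``%s`` placeholder style is the one MySQL Connector/Python uses.

::

    def build_insert(table, row):
        """Hypothetical helper: turn one dict row into (sql, params)."""
        columns = list(row)
        column_list = ", ".join(columns)
        # One %s placeholder per column; the values are passed separately
        # so the driver can escape them.
        placeholders = ", ".join(["%s"] * len(columns))
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table, column_list, placeholders)
        params = tuple(row[column] for column in columns)
        return sql, params


    row = {'f_cuid': 'id2', 'f_sentence_no': 1}
    sql, params = build_insert('t_report_info', row)
    print(sql)
    print(params)

Table and column names cannot be passed as query parameters, so in a sketch like this they must come from trusted code rather than from user input.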


