Metadata-Version: 2.1
Name: pydtc
Version: 0.0.4
Summary: data engineer tools collection
Home-page: https://github.com/cctester/pydtc
Author: cctester
Author-email: cctester2001@gmail.com
License: UNKNOWN
Description: This package provides universal tools to connect to all kinds of databases
        via JDBC, using fast/batch load technology to speed up temporary table
        creation and queries.
        
        It also provides multiprocessing capability for pandas DataFrames when running CPU-intensive operations on large volumes of data.
        
        Sample usage:
        
            ## connect to MySQL
                import pydtc
        
                conn = pydtc.connect('mysql', '127.0.0.1', 'user', 'pass', database='demo')
                pydtc.read_sql(conn, 'select * from demo.sample')
                conn.close()
            
            ### or use a with statement
                with pydtc.connect('mysql', '127.0.0.1', 'user', 'pass', database='demo') as conn:
                    conn.read_sql('select * from demo.sample')
                    # pydtc.read_sql(conn, 'select * from demo.sample')
        
            ## pandas multiprocessing groupby then apply
                import pandas as pd
        
                def func(df, key, value):
                    dd = {key : value}
                    # len(df.other_key) is the number of rows in df
                    dd['some_key'] = [len(df.other_key)]
        
                    return pd.DataFrame(dd)
        
                new_df = pydtc.p_groupby_apply(func, df, 'group_key')
        
        
Keywords: pandas,multiprocessing,database
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Description-Content-Type: text/markdown
