Metadata-Version: 2.1
Name: lichens
Version: 0.1.0
Summary: ETL framework for Colosscious Pharmquer
License: MIT
Author: thisishugow
Author-email: 59921505+thisishugow@users.noreply.github.com
Requires-Python: >=3.10,<3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: alembic (>=1.12.1,<2.0.0)
Requires-Dist: click (>=8.1.7,<9.0.0)
Requires-Dist: crontab (>=1.0.1,<2.0.0)
Requires-Dist: pandas (>=2.1.3,<3.0.0)
Requires-Dist: pandera (>=0.17.2,<0.18.0)
Requires-Dist: pendulum (==2.1.2)
Requires-Dist: psycopg-binary (>=3.1.12,<4.0.0)
Requires-Dist: psycopg2-binary (>=2.9.9,<3.0.0)
Requires-Dist: sqlalchemy (>=2.0.23,<3.0.0)
Description-Content-Type: text/markdown

# lichens
ETL framework for Colosscious Pharmquer.  
It consists of a web UI and a backend. See the [Colosscious Official Website](https://www.colosscious.com) for more information.

## Installation
```bash
pip install lichens
```

## Example
### Setup 
```bash
# Command
python -m lichens --help
# Output:
# Usage: python -m lichens [OPTIONS] COMMAND [ARGS]...
# 
# Options:
#   --help  Show this message and exit.
# 
# Commands:
#   add-etl       Add an ETL program setting from a JSON
#   get-template  Generate an ETL program setting to a JSON
#   migrate       Make and migrate the system tables.

# Initialize the database if needed. 
python -m lichens migrate \
-c driver://user:pass@localhost:port/dbname 


# Create an ETL setting template
python -m lichens get-template


# Add the ETL setting to the database
python -m lichens add-etl \
--connection-string driver://user:pass@localhost:port/dbname \
--json-config etl-setting.json
```
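
The `driver://user:pass@localhost:port/dbname` placeholder above follows the usual database URL layout. Assuming lichens hands this string to SQLAlchemy (the dependency list suggests it, but this README does not say so), reserved characters such as `@` or `/` in the user name or password need to be percent-encoded. The sketch below uses only the standard library; `build_connection_string`, the `postgresql+psycopg` driver name, and the credentials are illustrative assumptions, not part of lichens.

```python
from urllib.parse import quote


def build_connection_string(driver, user, password, host, port, dbname):
    """Assemble a database URL, percent-encoding the credentials.

    Hypothetical helper for illustration; it is not provided by lichens.
    """
    # safe="" makes quote() escape every reserved character, including "/" and "@"
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"{driver}://{user_enc}:{password_enc}@{host}:{port}/{dbname}"


# Example values are made up; replace them with your own settings.
constr = build_connection_string(
    driver="postgresql+psycopg",
    user="etl_user",
    password="p@ss/word",
    host="localhost",
    port=5432,
    dbname="etl_db",
)
print(constr)
```

The resulting string can then be passed to `-c` / `--connection-string`, ideally quoted in the shell so that special characters are not interpreted.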
### Write an ETL
See the [full example](https://github.com/thisishugow/lichens/blob/main/example/example_load-csv.py).

```python
# Create an ETL Manager
from lichens import EtlManager
em = EtlManager(constr=CONNECTION_STRING, name=ETL_NAME)

# Use Pandera to validate data
from lichens import DataFrameSchema, check_io
schema = DataFrameSchema(...)
validated_df = schema(df)

# Load df into the database
em.load_df(
    df=df,
    tablename="sample_table",
    schema="public",
    if_exists="raise_error",
    chunksize=500,
    unique_key=["column1", "column2"],
)

# Update the log and archive the file
# (f, _status and _last_log are assumed to be set by your own processing code)
import os

em.update_status(
    filename=f,
    status=_status,
    last_log=_last_log,
)
em.move(
    src=os.path.join(em.src_folder, f),
    status=_status,
)

# Use the state-highlighted logger
from lichens.logging import logger as log

# Run on a schedule
# Option 1: define a function and pass it to run_as_schtask
def your_function():
    # Your function logic here
    pass

# Run the function every minute, 5 times in total
em.run_as_schtask(your_function, '*/1 * * * *', times_=5)

# Option 2: decorate the function with the scheduled decorator
@em.scheduled(crontab='*/1 * * * *', times_=5)
def your_scheduled_function():
    # Your function logic here
    pass
```
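
The schedule string `'*/1 * * * *'` used above is a crontab expression. As a reading aid, the sketch below splits the standard five-field form into named fields. `describe_cron` is a hypothetical helper written for this README; it is not part of lichens and does not reflect how lichens parses schedules, and it deliberately rejects anything other than the five-field form.

```python
# Field order of a standard five-field crontab expression
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each field of a five-field crontab expression to its name.

    Hypothetical helper for illustration; it performs no validation of the
    field values themselves.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("*/1 * * * *")
for name, value in fields.items():
    print(f"{name}: {value}")
```

Here the minute field is `*/1` (every minute) and all other fields are `*` (any value), so the function runs once per minute.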

