Metadata-Version: 2.1
Name: kg-chat
Version: 0.1.2
Summary: kg-chat
License: MIT
Author: Harshad Hegde
Author-email: hhegde@lbl.gov
Requires-Python: >=3.9,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: click
Requires-Dist: dash (>=2.17.1,<3.0.0)
Requires-Dist: duckdb (>=1.0.0,<2.0.0)
Requires-Dist: duckdb-engine (>=0.13.0,<0.14.0)
Requires-Dist: importlib-metadata (>=4.8.0)
Requires-Dist: langchain (>=0.2.7,<0.3.0)
Requires-Dist: langchain-community (>=0.2.7,<0.3.0)
Requires-Dist: langchain-openai (>=0.1.14,<0.2.0)
Requires-Dist: neo4j (>=5.22.0,<6.0.0)
Requires-Dist: pyvis (>=0.3.2,<0.4.0)
Description-Content-Type: text/markdown

# kg-chat

An LLM-based chatbot that queries and visualizes [`KGX`](https://github.com/biolink/kgx) node and edge TSV files loaded into either a [`DuckDB`](https://github.com/duckdb/duckdb) (default) or [`neo4j`](https://github.com/neo4j/neo4j) database backend.

> **_NOTE:_**  
> Ensure `OPENAI_API_KEY` is set as an environment variable.

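One way to set the key for the current shell session is sketched below. The value `sk-your-key-here` is a placeholder, not a real key; substitute your own.

```shell
# Placeholder value (assumption): replace with your real key and never commit it.
export OPENAI_API_KEY="sk-your-key-here"

# Confirm the variable is set without printing the secret itself.
if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is set"
else
    echo "OPENAI_API_KEY is missing" >&2
fi
```

The `export` only lasts for the current session; adding the same line to a shell profile such as `~/.bashrc` is one common way to make it persistent.
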
## Setup

### For Neo4j Backend (Optional)
1. Install Neo4j Desktop from [here](https://neo4j.com/download/).
2. Create a new project and database, then start it.
3. Install the APOC plugin in Neo4j Desktop.
4. Update settings to match [`neo4j_db_settings.conf`](conf_files/neo4j_db_settings.conf).

### General Setup 

#### For Developers 
1. Clone this repository.
2. Create a virtual environment and install dependencies:
    ```shell
    cd kg-chat
    pip install poetry
    poetry install
    ```
3. If needed, replace [`data/nodes.tsv`](data/nodes.tsv) and [`data/edges.tsv`](data/edges.tsv) with your own KGX files.

#### For Using kg-chat as a Dependency

```shell
pip install kg-chat
```
OR
```shell
poetry add kg-chat@latest
```

### Supported Backends
- DuckDB [default]
- Neo4j

### Commands

1. **Import KG**: Load nodes and edges into a database (default: duckdb).
    ```shell
    poetry run kg import --data-dir data
    ```

2. **Test Query**: Run a test query.
   > NOTE: `--data-dir` is required for all commands. It is the path to the directory that contains the `nodes.tsv` and `edges.tsv` files, which must have exactly those names.
    ```shell
    poetry run kg test-query --data-dir data
    ```

3. **QnA**: Ask questions about the data.
    ```shell
    poetry run kg qna "how many nodes do we have here?" --data-dir data
    ```

4. **Chat**: Start an interactive chat session.
    ```shell
    poetry run kg chat --data-dir data
    ```

5. **App**: Deploy a local web application.
    ```shell
    poetry run kg app --data-dir data
    ```
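
Because every command expects `nodes.tsv` and `edges.tsv` inside `--data-dir`, a quick pre-flight check can catch a misnamed or missing file before running an import. The following is a minimal sketch; `check_kgx_dir` is a hypothetical helper written for illustration and is not part of the `kg` CLI.

```shell
# Hypothetical helper: succeeds only if the directory holds both expected files.
check_kgx_dir() {
    dir="$1"
    status=0
    for name in nodes.tsv edges.tsv; do
        if [ ! -f "$dir/$name" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $dir/$name" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: check the "data" directory used in the commands above.
if check_kgx_dir "data"; then
    echo "data is ready for: poetry run kg import --data-dir data"
fi
```

The function only tests that the files exist; it does not validate that their contents are well-formed KGX.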

### Visualization
Include the phrase `show me` in a prompt to get a visualization of the KG.

---
### Acknowledgements

This [cookiecutter](https://cookiecutter.readthedocs.io/en/stable/README.html) project was developed from the [monarch-project-template](https://github.com/monarch-initiative/monarch-project-template) and will be kept up to date using [cruft](https://cruft.github.io/cruft/).

