Metadata-Version: 2.1
Name: platform_gen_ai
Version: 0.1.0
Summary: This is pipeline code for accelerating solution accelerators
Author: Google LLC
Author-email: chertushkin@google.com
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
License-File: LICENSE
Requires-Dist: aiofiles==23.2.1
Requires-Dist: aiohttp==3.9.5
Requires-Dist: aiosignal==1.3.1
Requires-Dist: altair==5.3.0
Requires-Dist: annotated-types==0.6.0
Requires-Dist: anyio==4.3.0
Requires-Dist: asgiref==3.8.1
Requires-Dist: attrs==23.2.0
Requires-Dist: backoff==2.2.1
Requires-Dist: bcrypt==4.1.2
Requires-Dist: beautifulsoup4==4.12.3
Requires-Dist: build==1.2.1
Requires-Dist: cachetools==5.3.3
Requires-Dist: camelot-py==0.11.0
Requires-Dist: certifi==2024.2.2
Requires-Dist: cffi==1.16.0
Requires-Dist: chardet==5.2.0
Requires-Dist: charset-normalizer==3.3.2
Requires-Dist: chroma-hnswlib==0.7.3
Requires-Dist: chromadb==0.4.24
Requires-Dist: click==8.1.7
Requires-Dist: colorama==0.4.6
Requires-Dist: coloredlogs==15.0.1
Requires-Dist: contourpy==1.2.1
Requires-Dist: cryptography==42.0.5
Requires-Dist: cycler==0.12.1
Requires-Dist: dataclasses-json==0.6.4
Requires-Dist: dataclasses-json-speakeasy==0.5.11
Requires-Dist: decorator==5.1.1
Requires-Dist: dependency-injector==4.41.0
Requires-Dist: Deprecated==1.2.14
Requires-Dist: distro==1.9.0
Requires-Dist: docstring_parser==0.16
Requires-Dist: emoji==2.11.0
Requires-Dist: et-xmlfile==1.1.0
Requires-Dist: fastapi==0.110.1
Requires-Dist: ffmpy==0.3.2
Requires-Dist: filelock==3.13.4
Requires-Dist: filetype==1.2.0
Requires-Dist: flatbuffers==24.3.25
Requires-Dist: fonttools==4.51.0
Requires-Dist: frozenlist==1.4.1
Requires-Dist: fsspec==2024.3.1
Requires-Dist: gcsfs==2024.3.1
Requires-Dist: GitPython==3.1.43
Requires-Dist: google-ai-generativelanguage==0.6.2
Requires-Dist: google-api-core==2.18.0
Requires-Dist: google-api-python-client==2.126.0
Requires-Dist: google-auth==2.29.0
Requires-Dist: google-auth-httplib2==0.2.0
Requires-Dist: google-auth-oauthlib==1.2.0
Requires-Dist: google-cloud-aiplatform==1.47.0
Requires-Dist: google-cloud-bigquery==3.20.1
Requires-Dist: google-cloud-core==2.4.1
Requires-Dist: google-cloud-documentai==2.25.0
Requires-Dist: google-cloud-resource-manager==1.12.3
Requires-Dist: google-cloud-storage==2.16.0
Requires-Dist: google-crc32c==1.5.0
Requires-Dist: google-generativeai==0.5.1
Requires-Dist: google-resumable-media==2.7.0
Requires-Dist: googleapis-common-protos==1.63.0
Requires-Dist: gradio==3.50.2
Requires-Dist: greenlet==3.0.3
Requires-Dist: grpc-google-iam-v1==0.13.0
Requires-Dist: grpcio==1.62.1
Requires-Dist: grpcio-status==1.62.1
Requires-Dist: h11==0.14.0
Requires-Dist: httpcore==1.0.5
Requires-Dist: httplib2==0.22.0
Requires-Dist: httptools==0.6.1
Requires-Dist: httpx==0.27.0
Requires-Dist: huggingface-hub==0.22.2
Requires-Dist: humanfriendly==10.0
Requires-Dist: idna==3.7
Requires-Dist: importlib-metadata==7.0.0
Requires-Dist: importlib_resources==6.4.0
Requires-Dist: iniconfig==2.0.0
Requires-Dist: Jinja2==3.1.3
Requires-Dist: joblib==1.4.0
Requires-Dist: json5==0.9.25
Requires-Dist: jsonpatch==1.33
Requires-Dist: jsonpath-python==1.0.6
Requires-Dist: jsonpointer==2.4
Requires-Dist: jsonschema==4.21.1
Requires-Dist: jsonschema-specifications==2023.12.1
Requires-Dist: kiwisolver==1.4.5
Requires-Dist: kubernetes==29.0.0
Requires-Dist: langchain==0.1.16
Requires-Dist: langchain-community==0.0.33
Requires-Dist: langchain-core==0.1.43
Requires-Dist: langchain-openai==0.1.3
Requires-Dist: langchain-text-splitters==0.0.1
Requires-Dist: langchainhub==0.1.15
Requires-Dist: langdetect==1.0.9
Requires-Dist: langsmith==0.1.48
Requires-Dist: lxml==5.2.1
Requires-Dist: markdown-it-py==3.0.0
Requires-Dist: markdownify==0.12.1
Requires-Dist: MarkupSafe==2.1.5
Requires-Dist: marshmallow==3.21.1
Requires-Dist: matplotlib==3.8.4
Requires-Dist: mdurl==0.1.2
Requires-Dist: mmh3==4.1.0
Requires-Dist: monotonic==1.6
Requires-Dist: mpmath==1.3.0
Requires-Dist: multidict==6.0.5
Requires-Dist: mypy-extensions==1.0.0
Requires-Dist: nltk==3.8.1
Requires-Dist: numpy==1.26.4
Requires-Dist: oauthlib==3.2.2
Requires-Dist: onnxruntime==1.17.3
Requires-Dist: openai==1.20.0
Requires-Dist: opencv-python==4.9.0.80
Requires-Dist: openpyxl==3.1.2
Requires-Dist: opentelemetry-api==1.24.0
Requires-Dist: opentelemetry-exporter-otlp-proto-common==1.24.0
Requires-Dist: opentelemetry-exporter-otlp-proto-grpc==1.24.0
Requires-Dist: opentelemetry-instrumentation==0.45b0
Requires-Dist: opentelemetry-instrumentation-asgi==0.45b0
Requires-Dist: opentelemetry-instrumentation-fastapi==0.45b0
Requires-Dist: opentelemetry-proto==1.24.0
Requires-Dist: opentelemetry-sdk==1.24.0
Requires-Dist: opentelemetry-semantic-conventions==0.45b0
Requires-Dist: opentelemetry-util-http==0.45b0
Requires-Dist: orjson==3.10.1
Requires-Dist: overrides==7.7.0
Requires-Dist: packaging==23.2
Requires-Dist: pandas==2.2.2
Requires-Dist: pdf2image==1.17.0
Requires-Dist: pdfminer.six==20231228
Requires-Dist: pillow==10.3.0
Requires-Dist: pluggy==1.4.0
Requires-Dist: posthog==3.5.0
Requires-Dist: proto-plus==1.23.0
Requires-Dist: protobuf==4.25.3
Requires-Dist: pulsar-client==3.5.0
Requires-Dist: pyarrow==15.0.2
Requires-Dist: pyasn1==0.6.0
Requires-Dist: pyasn1_modules==0.4.0
Requires-Dist: pycparser==2.22
Requires-Dist: pydantic==2.7.0
Requires-Dist: pydantic_core==2.18.1
Requires-Dist: pydub==0.25.1
Requires-Dist: Pygments==2.17.2
Requires-Dist: pyparsing==3.1.2
Requires-Dist: pypdf==4.2.0
Requires-Dist: PyPDF2==3.0.1
Requires-Dist: PyPika==0.48.9
Requires-Dist: pyproject_hooks==1.0.0
Requires-Dist: pysqlite3-binary==0.5.2.post3
Requires-Dist: pytest==8.1.1
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: python-docx==1.1.0
Requires-Dist: python-dotenv==1.0.1
Requires-Dist: python-iso639==2024.2.7
Requires-Dist: python-magic==0.4.27
Requires-Dist: python-multipart==0.0.9
Requires-Dist: pytz==2024.1
Requires-Dist: PyYAML==6.0.1
Requires-Dist: rapidfuzz==3.8.1
Requires-Dist: redis==5.0.3
Requires-Dist: referencing==0.34.0
Requires-Dist: regex==2024.4.16
Requires-Dist: requests==2.31.0
Requires-Dist: requests-oauthlib==2.0.0
Requires-Dist: rich==13.7.1
Requires-Dist: rpds-py==0.18.0
Requires-Dist: rsa==4.9
Requires-Dist: ruff==0.3.7
Requires-Dist: safetensors==0.4.3
Requires-Dist: semantic-version==2.10.0
Requires-Dist: shapely==2.0.3
Requires-Dist: shellingham==1.5.4
Requires-Dist: six==1.16.0
Requires-Dist: sniffio==1.3.1
Requires-Dist: soupsieve==2.5
Requires-Dist: SQLAlchemy==2.0.29
Requires-Dist: starlette==0.37.2
Requires-Dist: sympy==1.12
Requires-Dist: tabulate==0.9.0
Requires-Dist: tenacity==8.2.3
Requires-Dist: tiktoken==0.6.0
Requires-Dist: tokenizers==0.15.2
Requires-Dist: tomlkit==0.12.0
Requires-Dist: toolz==0.12.1
Requires-Dist: tqdm==4.66.2
Requires-Dist: transformers==4.39.3
Requires-Dist: typer==0.12.3
Requires-Dist: types-requests==2.31.0.20240406
Requires-Dist: typing-inspect==0.9.0
Requires-Dist: typing_extensions==4.11.0
Requires-Dist: tzdata==2024.1
Requires-Dist: unstructured==0.13.2
Requires-Dist: unstructured-client==0.18.0
Requires-Dist: uritemplate==4.1.1
Requires-Dist: urllib3==2.2.1
Requires-Dist: uvicorn==0.29.0
Requires-Dist: uvloop==0.19.0
Requires-Dist: watchfiles==0.21.0
Requires-Dist: websocket-client==1.7.0
Requires-Dist: websockets==11.0.3
Requires-Dist: wrapt==1.16.0
Requires-Dist: yarl==1.9.4
Requires-Dist: zipp==3.18.1

# Solution Accelerators for GenAI
This repository contains platform code for accelerating the development of GenAI solutions in the Applied AI Engineering team.

![alt text](resources/image.png)

# Structure

- **docs**: This directory contains documentation, user guides, and any other resources that help you understand and use the GenAI solution accelerators effectively.

- **src**: The source code for the GenAI solution accelerators is located here. This is where you'll find the core codebase for the tools and frameworks provided in this repository.

- **data**: This directory may contain sample data or data-related resources that can be used for testing and development.

- **tests**: Test cases and resources related to testing the GenAI solution accelerators are stored in this directory.

- **scripts**: Utility scripts or automation scripts that can assist in various tasks related to GenAI development and deployment.

- **examples**: This directory may contain example projects, code snippets, or reference implementations that showcase how to use the provided solution accelerators effectively.

## Getting Started

To get started with the GenAI solution accelerators, follow the instructions in the `docs` directory. They provide step-by-step guidance on setting up your development environment and using the tools and frameworks in this repository.

## Contribution Guidelines

We welcome contributions from the GenAI community! If you'd like to contribute to this repository, please follow our [Contribution Guidelines](CONTRIBUTING.md) to ensure a smooth collaboration process.

## License

This repository is licensed under the [Apache License](LICENSE). See the [LICENSE](LICENSE) file for details.

## Contact

If you have any questions or need assistance, feel free to reach out to the project maintainers or create an issue in this repository.

Happy GenAI development!


## Setting up
To begin development, you can use one of two approaches: a Python environment or Docker. Instructions for each approach follow.

### Setting up Python Environment
Make sure Miniconda is installed:
```
cd ~/
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
~/miniconda3/bin/conda init bash
```
After that, install the package in editable mode:

```
pip install -e .
```
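
To confirm that the editable install worked, one option is to ask Python for the installed version. The sketch below assumes the distribution name is `platform_gen_ai`, as in the package metadata; the helper name `installed_version` is only illustrative.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string after a successful install, otherwise None.
print(installed_version("platform_gen_ai"))
```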

### Setting up Docker
If this is your first time, Docker is probably not installed on your VM yet. Run the following commands:
```
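# Refresh package lists and upgrade installed packages
sudo apt update && sudo apt upgrade
# Install make and Docker
sudo apt install make
sudo apt install docker.io
# Let the current user run Docker (-f: succeed if the group already exists)
sudo groupadd -f docker
sudo usermod -aG docker $USER
# Make the Docker socket accessible to every user on the VM
sudo chmod 777 /var/run/docker.sock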
```

### Setting up environment

```
make build && make container
```

To remove the environment, run:

```
make clean
```


### Copying resources

Make sure the `gcs_source_bucket` field in `llm.yaml` points to the latest extraction in use. Then run the copy script:
```
python gen_ai/copy_resources.py
```
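
Before running the copy script, it can help to print the value that will be used. The following is a minimal sketch, not part of the repository: `read_yaml_field` is a hypothetical helper that scans the file text for a `key: value` line, so it works without knowing how `llm.yaml` is nested, but it is not a full YAML parser (it does not handle inline comments or multi-line values). The path to `llm.yaml` and the sample bucket value are assumptions.

```python
import re
from pathlib import Path
from typing import Optional


def read_yaml_field(text: str, key: str) -> Optional[str]:
    """Return the value of the first `key: value` line in text, or None."""
    pattern = re.compile(
        rf"^[ \t]*{re.escape(key)}[ \t]*:[ \t]*(.*?)[ \t]*$", re.MULTILINE
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Drop surrounding quotes; treat an empty value as missing.
    return match.group(1).strip("'\"") or None


def show_source_bucket(config_path: str) -> Optional[str]:
    """Print and return gcs_source_bucket from the given config file."""
    value = read_yaml_field(Path(config_path).read_text(), "gcs_source_bucket")
    print(f"gcs_source_bucket = {value}")
    return value


# Hypothetical sample content, for illustration only.
sample = "gcs_source_bucket: gs://example-bucket/extraction\n"
print(read_yaml_field(sample, "gcs_source_bucket"))
```

To check the real file, call `show_source_bucket` with the path to your copy of `llm.yaml`.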


### Updating BigQuery table

All runs are currently logged to the "uhg" dataset in the "chertushkin-genai-sa" project. To use a different project, change the `bq_project_id` field in `llm.yaml`. If logging fails with an error, check that the service account has been added to BigQuery IAM for the "chertushkin-genai-sa" project, or for whichever project you specified in the config.
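
If logging fails, a quick way to tell whether the active credentials can see the dataset is to request its metadata. This is a rough sketch, assuming the `google-cloud-bigquery` client from the dependency list and application default credentials. The helper names `dataset_id` and `can_access_dataset` are hypothetical, and a successful lookup shows only that the dataset is visible, not that inserts are permitted.

```python
def dataset_id(project_id: str, dataset: str = "uhg") -> str:
    """Build the fully qualified dataset ID, for example 'my-project.uhg'."""
    return f"{project_id}.{dataset}"


def can_access_dataset(project_id: str, dataset: str = "uhg") -> bool:
    """Return True if the active credentials can read the dataset metadata."""
    # Imported lazily so dataset_id() works without the client library.
    from google.api_core import exceptions
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    try:
        client.get_dataset(dataset_id(project_id, dataset))
    except (exceptions.Forbidden, exceptions.NotFound) as err:
        # Forbidden usually points at IAM; NotFound at a wrong project or dataset.
        print(f"Cannot access {dataset_id(project_id, dataset)}: {err}")
        return False
    return True
```

Calling `can_access_dataset` with the value of `bq_project_id` from `llm.yaml` gives a first indication of whether the problem is IAM or configuration.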
