Metadata-Version: 2.1
Name: kscope
Version: 0.11.0
Summary: A user toolkit for analyzing and interfacing with Large Language Models (LLMs)
Home-page: https://github.com/VectorInstitute/kaleidoscope-sdk
Author: ['Vector AI Engineering']
Author-email: ai_engineering@vectorinstitute.ai
License: MIT
Keywords: python nlp machine-learning deep-learning distributed-computing neural-networks tensor llm
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: certifi==2024.7.4
Requires-Dist: charset-normalizer==3.3.2
Requires-Dist: cloudpickle==3.0.0
Requires-Dist: filelock==3.15.4
Requires-Dist: fsspec==2024.6.1
Requires-Dist: idna==3.7
Requires-Dist: Jinja2==3.1.4
Requires-Dist: MarkupSafe==2.1.5
Requires-Dist: mpmath==1.3.0
Requires-Dist: networkx==3.3
Requires-Dist: numpy==2.0.0
Requires-Dist: nvidia-cublas-cu12==12.1.3.1
Requires-Dist: nvidia-cuda-cupti-cu12==12.1.105
Requires-Dist: nvidia-cuda-nvrtc-cu12==12.1.105
Requires-Dist: nvidia-cuda-runtime-cu12==12.1.105
Requires-Dist: nvidia-cudnn-cu12==8.9.2.26
Requires-Dist: nvidia-cufft-cu12==11.0.2.54
Requires-Dist: nvidia-curand-cu12==10.3.2.106
Requires-Dist: nvidia-cusolver-cu12==11.4.5.107
Requires-Dist: nvidia-cusparse-cu12==12.1.0.106
Requires-Dist: nvidia-nccl-cu12==2.20.5
Requires-Dist: nvidia-nvjitlink-cu12==12.5.82
Requires-Dist: nvidia-nvtx-cu12==12.1.105
Requires-Dist: requests==2.32.3
Requires-Dist: sympy==1.13.0
Requires-Dist: torch==2.3.1
Requires-Dist: triton==2.3.1
Requires-Dist: typing_extensions==4.12.2
Requires-Dist: urllib3==2.2.2

![Kaleidoscope](https://user-images.githubusercontent.com/72175053/229659396-2a61cd69-eafa-4a96-8e1c-d93519a8f617.png)
-----------------
# Kaleidoscope-SDK
![PyPI](https://img.shields.io/pypi/v/kscope)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/kscope)
![GitHub](https://img.shields.io/github/license/VectorInstitute/kaleidoscope-sdk)
![DOI](https://img.shields.io/badge/DOI-in--progress-blue)
[![Documentation](https://img.shields.io/badge/api-reference-lightgrey.svg)](https://kaleidoscope-sdk.readthedocs.io/en/latest/)

A user toolkit for analyzing and interfacing with Large Language Models (LLMs)


## Overview

``kaleidoscope-sdk`` is a Python module used to interact with large language models
hosted via the Kaleidoscope service available at: https://github.com/VectorInstitute/kaleidoscope.
It provides a simple interface to launch LLMs on an HPC cluster and perform basic, fast inference.
These features are exposed via a few high-level APIs, namely:

* `model_instances` - Shows a list of all active LLMs instantiated by the model service
* `load_model` - Loads an LLM via the model service
* `generate` - Returns an LLM text generation for a single prompt or a list of prompts



## Getting Started

Requires Python version >= 3.8

### Install

```bash
python3 -m pip install kscope
```
Or install from source:

```bash
pip install git+https://github.com/VectorInstitute/kaleidoscope-sdk.git
```

### Authentication

In order to submit generation jobs, a designated Vector Institute cluster account is required. Please contact the
[AI Engineering Team](mailto:ai_engineering@vectorinstitute.ai?subject=[Github]%20Kaleidoscope)
in charge of Kaleidoscope for more information.

### Sample Workflow

The following workflow shows how to load and interact with a Llama 3 8B model
on the Vector Institute Vaughan cluster.

```python
#!/usr/bin/env python3
import kscope
import time

# Establish a client connection to the Kaleidoscope service
# If you have not previously authenticated with the service, you will be prompted to do so now
client = kscope.Client(gateway_host="llm.cluster.local", gateway_port=3001)

# See which models are supported
client.models

# See which models are instantiated and available to use
client.model_instances

# Get a handle to a model. If this model is not actively running, it will get launched in the background.
# In this example we want to use the Llama3 8b model
llama3_model = client.load_model("llama3-8b")

# If the model was not already running, it could take several minutes to load. Wait for it to come online.
while llama3_model.state != "ACTIVE":
    time.sleep(1)

# Sample text generation w/ input parameters
text_gen = llama3_model.generate("What is Vector Institute?", {'max_tokens': 5, 'top_k': 4, 'temperature': 0.5})
dir(text_gen) # display methods associated with generated text object
text_gen.generation['sequences'] # display only text
text_gen.generation['logprobs'] # display logprobs
text_gen.generation['tokens'] # display tokens

```
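The polling loop in the workflow above waits indefinitely if the model never reaches the `ACTIVE` state. A minimal sketch of a bounded wait follows. It assumes only what the workflow shows, namely that the model handle exposes a `state` attribute that becomes `"ACTIVE"`. The helper name `wait_until_active` and its `timeout` and `poll_interval` parameters are hypothetical and are not part of the `kscope` API.

```python
import time


def wait_until_active(model, timeout=600.0, poll_interval=5.0):
    """Poll model.state until it equals "ACTIVE", or raise TimeoutError.

    Hypothetical helper, not provided by kscope. `model` is any object
    with a `state` attribute, such as the handle returned by load_model.
    """
    # Use a monotonic clock so system clock adjustments do not affect the deadline
    deadline = time.monotonic() + timeout
    while model.state != "ACTIVE":
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"model still in state {model.state!r} after {timeout} seconds"
            )
        time.sleep(poll_interval)
    return model
```

With a client set up as in the workflow, this could be used as `llama3_model = wait_until_active(client.load_model("llama3-8b"))`. A poll interval of a few seconds is a guess at a reasonable default; adjust it to suit the cluster.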

## Documentation
Full documentation and API reference are available at: http://kaleidoscope-sdk.readthedocs.io.


## Contributing
Contributions to Kaleidoscope are welcome. See [Contributing](CONTRIBUTING) for
guidelines.


## License
[MIT](LICENSE)


## Citation
Reference to cite when you use Kaleidoscope in a project or a research paper:
```
Willes, J., Choi, M., Coatsworth, M., Shen, G., & Sivaloganathan, J. (2022). Kaleidoscope. http://VectorInstitute.github.io/kaleidoscope. Computer software, Vector Institute for Artificial Intelligence. Retrieved from https://github.com/VectorInstitute/kaleidoscope-sdk.git.
```
