Metadata-Version: 2.1
Name: kosmosg
Version: 0.0.4
Summary: kosmosg - PyTorch
Home-page: https://github.com/kyegomez/KosmosG
License: MIT
Keywords: artificial intelligence,deep learning,optimizers,Prompt Engineering
Author: Kye Gomez
Author-email: kye@apac.ai
Requires-Python: >=3.6,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: einops
Requires-Dist: torch
Requires-Dist: torchscale
Requires-Dist: zetascale
Project-URL: Repository, https://github.com/kyegomez/KosmosG
Description-Content-Type: text/markdown

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# KosmosG
My PyTorch implementation of KOSMOS-G, the model introduced in "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models".

## Installation
`pip install kosmosg`
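
To confirm the install worked, a small check like the one below can help. It is only a sketch: `installed_version` is a hypothetical helper name, and it relies on the standard-library `importlib.metadata`, which assumes Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed kosmosg version, or None if the package is missing.
print(installed_version("kosmosg"))
```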

## Usage
```python
import torch
from kosmosg.main import KosmosG

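# Dummy inputs: one 256x256 RGB image and one sequence of 1024 token ids in [0, 20000)
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

# Build the model with its default settings and run a forward pass
model = KosmosG()
output = model(img, text)
print(output)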
```

## Architecture
`text, image => KosmosG => text tokens with multimodal understanding`
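
The flow above can be illustrated with a toy model. This is a rough sketch and not the `kosmosg` implementation: `ToyMultimodalEncoder`, its layers, and every hyperparameter are hypothetical, chosen only to show image patches and text tokens being fused into one sequence and read back out as per-token outputs. Positional embeddings are left out to keep it short.

```python
import torch
from torch import nn


class ToyMultimodalEncoder(nn.Module):
    """Hypothetical stand-in for illustration; not the kosmosg model."""

    def __init__(self, vocab_size=20000, dim=64, patch_size=32, heads=4, depth=1):
        super().__init__()
        # Cut the image into non-overlapping patches and project each one to `dim`.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.token_embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_logits = nn.Linear(dim, vocab_size)

    def forward(self, img, text):
        # (B, 3, H, W) -> (B, num_patches, dim)
        patches = self.patch_embed(img).flatten(2).transpose(1, 2)
        # (B, T) -> (B, T, dim)
        tokens = self.token_embed(text)
        # Image patches and text tokens attend to each other in one sequence.
        fused = self.encoder(torch.cat([patches, tokens], dim=1))
        # Keep only the text positions, which now carry image context.
        return self.to_logits(fused[:, patches.shape[1]:])


model = ToyMultimodalEncoder()
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 16))
logits = model(img, text)
print(logits.shape)
```

Prepending the image patches to the text sequence is one common way to fuse modalities; the actual model may combine them differently.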

## License
MIT

## Todo
- Create Aligner in PyTorch
- Create Diffusion module
- Integrate these pieces
- Create a training script
