Metadata-Version: 2.4
Name: textslinger
Version: 0.2.1
Summary: TextSlinger: Fast and accurate text predictions in Python
Author-email: Keith Vertanen <vertanen@mtu.edu>, Dylan Gaines <dgaine20@kennesaw.edu>, Soufia Bahmani <sbahmani@mtu.edu>
License-Expression: MIT
Project-URL: Homepage, https://github.com/kdv123/textslinger
Project-URL: Source, https://github.com/kdv123/textslinger
Platform: Linux
Platform: Windows
Platform: Mac OS-X
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Scientific/Engineering :: Human Machine Interfaces
Classifier: Natural Language :: English
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.10
Requires-Python: <3.11,>=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.6.0
Requires-Dist: torchvision>=0.21.0
Requires-Dist: torchaudio>=2.6.0
Requires-Dist: datasets==2.18.0
Requires-Dist: bitsandbytes==0.42.0
Requires-Dist: requests==2.32.3
Requires-Dist: kenlm==0.2.0
Requires-Dist: nlpaug==1.1.11
Requires-Dist: psutil==5.7.2
Requires-Dist: ipywidgets==8.1.3
Requires-Dist: sentencepiece==0.2.0
Requires-Dist: protobuf==4.25.3
Requires-Dist: evaluate==0.4.0
Requires-Dist: scikit-learn==1.2.2
Requires-Dist: accelerate>=1.1.0
Requires-Dist: transformers>=5.2.0
Requires-Dist: numpy~=1.26.3
Requires-Dist: tqdm==4.62.2
Requires-Dist: peft~=0.14.0
Provides-Extra: dev
Requires-Dist: coverage>=7.0; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Provides-Extra: release
Requires-Dist: twine==5.0.0; extra == "release"
Requires-Dist: build>=1.4.0; extra == "release"
Requires-Dist: wheel==0.43.0; extra == "release"
Dynamic: license-file

<table border="0" cellpadding="0" cellspacing="0">
  <tr>
    <td width="120">
      <a href="images/textslinger.png"><img src="images/textslinger_small.jpg" width="236" alt="Cowboy drawing his two cell phone six shooters"></a>
    </td>
    <td align="center">
      <h2>TextSlinger: Fast and Accurate Text Predictions in Python</h2>
    </td>
  </tr>
</table>

This is a Python library for making text predictions using different types of language models.
Current features:
* Predict the distribution over the next character given the previous text.
* Predict the most likely next words given the previous text and the prefix of the current word.
* Supports:
  - N-gram language models via [KenLM](https://github.com/kpu/kenlm).
  - Subword tokenized large language models (LLMs) via [Hugging Face](https://huggingface.co/docs/hub/en/index).
  - Byte tokenized LLMs via Hugging Face and [Byte Latent Transformer](https://arxiv.org/abs/2412.09871).
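
To make the two prediction tasks above concrete, here is a toy sketch in plain Python. It is not TextSlinger's API: the function names `next_char_distribution` and `predict_words`, the tiny corpus, and the simple counting approach are all assumptions made for illustration, standing in for the real language models listed above.

```python
from collections import Counter

# Toy illustration only: this is NOT the TextSlinger API.
# A hypothetical miniature "training corpus".
CORPUS = "the cat sat on the mat. the cat ate the hat."


def next_char_distribution(context, corpus=CORPUS, order=2):
    """Estimate P(next char | last `order` chars of context) by counting."""
    history = context[-order:] if order > 0 else ""
    counts = Counter()
    for i in range(len(corpus) - len(history)):
        if corpus[i : i + len(history)] == history:
            counts[corpus[i + len(history)]] += 1
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def predict_words(previous_text, prefix, corpus=CORPUS, k=3):
    """Rank words that follow the last word of previous_text and start with prefix."""
    words = corpus.replace(".", " ").split()
    previous_words = previous_text.split()
    prev = previous_words[-1] if previous_words else None
    counts = Counter()
    for left, word in zip(words, words[1:]):
        if word.startswith(prefix) and (prev is None or left == prev):
            counts[word] += 1
    return [word for word, _ in counts.most_common(k)]


# Distribution over the character that follows "the "
print(next_char_distribution("the "))
# Most likely words after "on the" that start with "m"
print(predict_words("on the", "m"))
```

The first call returns a dictionary mapping each candidate character to a probability, and the second returns a ranked list of candidate words. Real models replace the counting with n-gram or neural estimates, but the inputs and outputs have the same shape.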

## Developer setup
Our code style is whatever the Black formatter says it should be. 
You should [configure your IDE to format using Black when you save](https://black.readthedocs.io/en/stable/integrations/editors.html).

## Setting up a Python environment
If you don't have [Miniforge](https://conda-forge.org/download/) installed in your user account, you'll first need to install it.

To install Miniforge on macOS with Apple Silicon:
```
curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
zsh Miniforge3-MacOSX-arm64.sh
~/miniforge3/bin/conda init zsh
```
To install Miniforge on Linux:
```
curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh
```

After installing Miniforge, be sure to close your terminal and start a new one. Create an environment as follows:
```
conda config --remove-key channels
conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -n textslinger python=3.10 -y
conda activate textslinger
```
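The package metadata requires Python 3.10 (`>=3.10,<3.11`), so it can be worth confirming that the activated environment picked up the right interpreter. The following is a minimal sketch; the helper name `python_version_ok` is hypothetical and not part of textslinger:

```python
import sys


def python_version_ok(version_info=sys.version_info):
    """Return True if the interpreter satisfies >=3.10,<3.11."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 11)


print("Python:", sys.version.split()[0])
print("Supported by textslinger:", python_version_ok())
```

If the second line prints `False`, the `textslinger` environment is probably not the active one.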

### Installation of PyTorch
macOS with Apple Silicon:
```
pip install torch torchvision torchaudio
```

Linux with CUDA support (the GPU driver must support the CUDA version of the installed library or greater; run `nvidia-smi` to check what the driver supports):
```
# CUDA 11.8 
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
Linux without CUDA support:
```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```

Test if the PyTorch installation worked:
```
python - <<'EOF'
import torch
print("Torch version:", torch.__version__)
print("MPS available:", torch.backends.mps.is_available())
print("MPS built:", torch.backends.mps.is_built())
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
print("CUDA device count:", torch.cuda.device_count())

if torch.backends.mps.is_available():
    x = torch.randn(2, 2, device="mps")
    print("Tensor device:", x.device)
elif torch.cuda.is_available():
    x = torch.randn(2, 2, device="cuda")
    print("Tensor device:", x.device)
else:
    x = torch.randn(2, 2)
    print("Tensor device:", x.device)
EOF
```

### Installation of libraries
Install transformers (version 5.2.0 or greater is required):
```
pip install "transformers>=5.2.0"
```
Check transformers version and model support:
```
python - <<'EOF'
import transformers

print("Version:", transformers.__version__)
print("File:", transformers.__file__)

# Check for BLT symbols that do NOT exist in stable 5.0.0
try:
    from transformers.models.blt.modeling_blt import BltModel
    print("BLT model available ✅")
except Exception as e:
    print("BLT model missing ❌", e)
EOF
```
Install other dependencies:
```
pip install pytest scipy peft psutil datasets

# NOTE: increase MAX_ORDER if you plan to load n-gram models with longer context 
MAX_ORDER=12 pip install https://github.com/kpu/kenlm/archive/master.zip
```
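To decide what `MAX_ORDER` a given model needs, you can read the order from the header of a plain-text ARPA file, which lists one `ngram N=count` line per order. The sketch below is an assumption-laden helper, not part of textslinger or KenLM: the names `arpa_order_from_lines` and `arpa_order` are hypothetical, and it only handles uncompressed text ARPA files, not KenLM binary models.

```python
import re


def arpa_order_from_lines(lines):
    """Return the highest n-gram order declared in an ARPA header."""
    order = 0
    for line in lines:
        line = line.strip()
        match = re.match(r"ngram\s+(\d+)\s*=\s*\d+$", line)
        if match:
            order = max(order, int(match.group(1)))
        elif line.endswith("-grams:"):
            break  # the header ends where the first n-gram section starts
    return order


def arpa_order(path):
    """Read the n-gram order from a plain-text ARPA file at `path`."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return arpa_order_from_lines(f)
```

For example, `arpa_order("model.arpa")` (a hypothetical file name) returning 12 would mean the default `MAX_ORDER=12` shown above is sufficient, while a larger value would call for rebuilding KenLM with a higher setting.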
To get rid of a harmless warning message, force a reinstall of setuptools:
```
pip install --upgrade --force-reinstall setuptools
```
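Before moving on to the test suite, it can help to confirm that the libraries installed in this section are importable from the active environment. This is a minimal sketch using only the standard library; the helper name `missing_modules` and the `REQUIRED` list are assumptions based on the install commands above, not an official checklist:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level modules corresponding to the pip commands in this section.
REQUIRED = [
    "torch",
    "transformers",
    "kenlm",
    "peft",
    "psutil",
    "datasets",
    "scipy",
    "pytest",
]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that each module can be located, not that its version is compatible.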

### Testing installation
Download assets needed by the test suite and then run it:
```
cd textslinger/assets
./download.sh
cd ..
pytest -v -rs
```

---
This material is based upon work supported by the NSF under Grant Nos. IIS-1909089 and IIS-2402876.
