Metadata-Version: 2.4
Name: steadytext
Version: 0.2.2
Summary: Deterministic text generation and embedding with zero configuration
Project-URL: Homepage, https://github.com/julep-ai/steadytext
Project-URL: Bug Tracker, https://github.com/julep-ai/steadytext/issues
Author-email: SteadyText Community <singh@diwank.name>
License: MIT License
        
        Copyright (c) 2025 SteadyText Community
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: <3.14,>=3.10
Requires-Dist: click>=8.0.0
Requires-Dist: huggingface-hub>=0.32.0
Requires-Dist: llama-cpp-python>=0.3.9
Requires-Dist: numpy<2.0.0,>=1.21.0
Requires-Dist: requests
Requires-Dist: tqdm>=4
Provides-Extra: test
Requires-Dist: pytest; extra == 'test'
Requires-Dist: pytest-cov; extra == 'test'
Requires-Dist: pytest-xdist; extra == 'test'
Description-Content-Type: text/markdown

# SteadyText

*Deterministic text generation and embeddings with zero configuration*

[![](https://img.shields.io/pypi/v/steadytext.svg)](https://pypi.org/project/steadytext/)
[![](https://img.shields.io/pypi/pyversions/steadytext.svg)](https://pypi.org/project/steadytext/)
[![](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

**Same input → same output. Every time.**
No more flaky tests, unpredictable CLI tools, or inconsistent docs. SteadyText makes AI outputs as reliable as hash functions.

> [!TIP]
> ✨ _Powered by open-source AI workflows from [**Julep**](https://julep.ai)._ ✨

---

## 🚀 Quick Start

```bash
pip install steadytext
```

```python
import steadytext

# Deterministic text generation
code = steadytext.generate("implement binary search in Python")
assert "def binary_search" in code  # Same result on every run

# Streaming (also deterministic)
for token in steadytext.generate_iter("explain quantum computing"):
    print(token, end="", flush=True)

# Deterministic embeddings
vec = steadytext.embed("Hello world")  # 1024-dim numpy array
```

---

## 🔧 How It Works

SteadyText achieves determinism via:

* **Fixed seeds:** Constant randomness seed (`42`)
* **Greedy decoding:** Always chooses highest-probability token
* **Frecency cache:** LRU cache with frequency counting, so popular prompts stay cached longer
* **Quantized models:** 8-bit quantization ensures identical results across platforms

This means `generate("hello")` returns the same output (up to 512 new tokens) on any machine, every time.
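To illustrate why a fixed seed plus greedy decoding is repeatable, here is a minimal, self-contained sketch. It does not call SteadyText or a real model: `toy_logits` is a hypothetical stand-in for a model's next-token scores, the vocabulary is made up, and `greedy_decode` is only an illustration of the idea, not the library's implementation.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]  # made-up vocabulary
SEED = 42  # the fixed seed value listed above


def toy_logits(context, rng):
    """Hypothetical stand-in for a model: one score per vocabulary entry.

    For brevity the scores ignore the context and come from the seeded RNG.
    """
    return [rng.random() for _ in VOCAB]


def greedy_decode(prompt_tokens, max_new_tokens=8):
    """Pick the highest-scoring token at every step (no sampling)."""
    rng = random.Random(SEED)  # fresh fixed-seed RNG on every call
    out = list(prompt_tokens)
    for _ in range(max_new_tokens):
        scores = toy_logits(out, rng)
        # max() keeps the first maximal index, so ties break the same way each run
        best = max(range(len(VOCAB)), key=scores.__getitem__)
        if VOCAB[best] == "<eos>":
            break
        out.append(VOCAB[best])
    return out[len(prompt_tokens):]


first = greedy_decode(["the"])
second = greedy_decode(["the"])
print(first == second)  # True: nothing in the loop is left to chance
```

Because the only source of randomness is re-seeded on every call and the token choice is an argmax, two calls with the same prompt cannot diverge.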

---

## 📦 Installation & Models

Install stable release:

```bash
pip install steadytext
```

#### Models

**Corresponding to PyPI versions `0.x.y`**:

* Generation: `BitCPM4-1B-Q8_0` (1.3GB)
* Embeddings: `Qwen3-0.6B-Q8_0` (610MB)

> Each major version uses a fixed set of models, so the models (and therefore the deterministic output) change only when you explicitly upgrade to a new major version with pip.

---

## 🎯 Examples

Use SteadyText in tests or CLI tools for consistent, reproducible results:

```python
import click
import steadytext

# Testing with reliable assertions
def test_ai_function():
    # my_ai_function is a placeholder for the code under test
    result = my_ai_function("test input")
    expected = steadytext.generate("expected output for 'test input'")
    assert result == expected  # No flakes!

# CLI tools with consistent outputs
@click.command()
@click.argument("prompt")  # without this, click calls ai_tool() with no prompt
def ai_tool(prompt):
    print(steadytext.generate(prompt))
```
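Because output is deterministic, one way to notice an unintended change (for example, after an upgrade that swaps models) is to compare a hash of the output against a recorded value. The sketch below is an assumption about how you might do this and is not part of SteadyText: `fingerprint`, `check_snapshot`, and `fake_generate` are hypothetical names, and `fake_generate` stands in for `steadytext.generate` so the snippet runs without downloading models.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a short, stable SHA-256 fingerprint of generated text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def fake_generate(prompt: str) -> str:
    """Hypothetical deterministic stand-in for steadytext.generate."""
    return f"echo: {prompt}"


def check_snapshot(generate, prompt: str, expected: str) -> bool:
    """True if the generator still produces the recorded fingerprint."""
    return fingerprint(generate(prompt)) == expected


# Record the fingerprint once, then compare against it on later runs.
recorded = fingerprint(fake_generate("hello"))
print(check_snapshot(fake_generate, "hello", recorded))
print(check_snapshot(fake_generate, "goodbye", recorded))
```

In a real test suite you would pass `steadytext.generate` in place of `fake_generate` and store the recorded fingerprint alongside the test.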

📂 **[More examples →](examples/)**

---

## 🖥️ CLI Usage

```bash
# Generate text
st "write a hello world function"

# Stream output
st "explain recursion" --stream

# Get embeddings
st embed "machine learning"

# Preload models
st models --preload
```

---

## 📋 When to Use SteadyText

✅ **Perfect for:**

* Testing AI features (reliable asserts)
* Deterministic CLI tooling
* Reproducible documentation & demos
* Offline/dev/staging environments
* Semantic caching and embedding search

❌ **Not ideal for:**

* Creative or conversational tasks
* Queries that need up-to-date knowledge
* Large-scale chatbot deployments

---

## 🔍 API Overview

```python
# Text generation
steadytext.generate(prompt: str) -> str
steadytext.generate(prompt, return_logprobs=True)

# Streaming generation
steadytext.generate_iter(prompt: str)

# Embeddings
steadytext.embed(text_input: str | List[str]) -> np.ndarray

# Model preloading
steadytext.preload_models(verbose=True)
```

📚 [Full API Documentation](docs/api.md)

---

## 🔧 Configuration

Control caching behavior via environment variables:

```bash
# Generation cache (default: 256 entries, 50MB)
export STEADYTEXT_GENERATION_CACHE_CAPACITY=256
export STEADYTEXT_GENERATION_CACHE_MAX_SIZE_MB=50

# Embedding cache (default: 512 entries, 100MB)
export STEADYTEXT_EMBEDDING_CACHE_CAPACITY=512
export STEADYTEXT_EMBEDDING_CACHE_MAX_SIZE_MB=100
```
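This README does not show how these variables are read or how the frecency cache chooses what to evict. As a rough sketch under stated assumptions: `read_capacity` and `FrecencyCache` are hypothetical names, and the eviction rule used here (fewest hits first, oldest access breaking ties) is one plausible reading of "LRU cache with frequency counting", not SteadyText's actual implementation.

```python
import os
from itertools import count


def read_capacity(var: str, default: int) -> int:
    """Read a positive integer from the environment, else use the default."""
    raw = os.environ.get(var, "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


class FrecencyCache:
    """Toy cache: evicts the entry with the fewest hits, oldest first on ties."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = {}  # key -> value
        self._meta = {}  # key -> (hit_count, last_access_tick)
        self._clock = count()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        hits, _ = self._meta[key]
        self._meta[key] = (hits + 1, next(self._clock))
        return self._data[key]

    def set(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            # Tuples compare element-wise: lowest hit count, then oldest tick
            victim = min(self._meta, key=self._meta.get)
            del self._data[victim], self._meta[victim]
        hits = self._meta.get(key, (0, 0))[0]
        self._data[key] = value
        self._meta[key] = (hits, next(self._clock))

    def __len__(self):
        return len(self._data)


# Falls back to the documented default (256) when the variable is unset.
default_capacity = read_capacity("STEADYTEXT_GENERATION_CACHE_CAPACITY", 256)

cache = FrecencyCache(capacity=2)  # tiny capacity to make eviction visible
cache.set("a", "alpha")
cache.set("b", "beta")
cache.get("a")           # "a" now has one hit
cache.set("c", "gamma")  # evicts "b", the entry with the fewest hits
print(sorted(cache._data))  # ['a', 'c']
```

The size limits in megabytes are left out of the sketch; enforcing them would need a per-entry size estimate, which the README does not describe.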

---

## 📖 API Reference

### Text Generation

#### `generate(prompt: str, return_logprobs: bool = False) -> Union[str, Tuple[str, Optional[Dict]]]`

Generate deterministic text from a prompt.

```python
text = steadytext.generate("Write a haiku about Python")

# With log probabilities
text, logprobs = steadytext.generate("Explain AI", return_logprobs=True)
```

- **Parameters:**
  - `prompt`: Input text to generate from
  - `return_logprobs`: If True, returns tuple of (text, logprobs)
- **Returns:** Generated text string, or tuple if `return_logprobs=True`

#### `generate_iter(prompt: str) -> Iterator[str]`

Generate text iteratively, yielding tokens as they are produced.

```python
for token in steadytext.generate_iter("Tell me a story"):
    print(token, end="", flush=True)
```

- **Parameters:**
  - `prompt`: Input text to generate from
- **Yields:** Text tokens/words as they are generated

### Embeddings

#### `embed(text_input: Union[str, List[str]]) -> np.ndarray`

Create deterministic embeddings for text input.

```python
# Single string
vec = steadytext.embed("Hello world")

# List of strings (averaged into a single vector)
avg_vec = steadytext.embed(["Hello", "world"])
```

- **Parameters:**
  - `text_input`: String or list of strings to embed
- **Returns:** 1024-dimensional L2-normalized numpy array (float32)
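
Since the returned vectors are L2-normalized, cosine similarity between two of them reduces to a plain dot product. The sketch below uses short made-up vectors in plain Python instead of real 1024-dimensional embeddings. `l2_normalize`, `cosine`, and `average_embedding` are hypothetical helpers, and averaging followed by re-normalizing is only an assumption about how a list input might be combined, not a confirmed description of `embed`.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def average_embedding(vectors):
    """Element-wise mean of several vectors, re-normalized to unit length."""
    n = len(vectors)
    mean = [sum(col) / n for col in zip(*vectors)]
    return l2_normalize(mean)


u = l2_normalize([3.0, 4.0, 0.0])  # toy stand-in for embed("...")
v = l2_normalize([4.0, 3.0, 0.0])
print(round(cosine(u, u), 6))  # 1.0
print(round(cosine(u, v), 6))  # 0.96

combined = average_embedding([u, v])
print(round(cosine(combined, combined), 6))  # 1.0 (still unit length)
```

With real embeddings held as numpy arrays, the same similarity is `float(a @ b)`.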

### Utilities

#### `preload_models(verbose: bool = False) -> None`

Preload models before first use.

```python
steadytext.preload_models()  # Silent
steadytext.preload_models(verbose=True)  # With progress
```

#### `get_model_cache_dir() -> str`

Get the path to the model cache directory.

```python
cache_dir = steadytext.get_model_cache_dir()
print(f"Models are stored in: {cache_dir}")
```

### Constants

```python
steadytext.DEFAULT_SEED  # 42
steadytext.GENERATION_MAX_NEW_TOKENS  # 512
steadytext.EMBEDDING_DIMENSION  # 1024
```

---

## 🤝 Contributing

Contributions are welcome!
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

---

## 📄 License

* **Code:** MIT
* **Models:** MIT (BitCPM4, Qwen3)

---

Built with ❤️ for developers tired of flaky AI tests.