Metadata-Version: 2.4
Name: iterative-bert
Version: 0.1.0
Summary: Iterative refinement BERT encoder based on Tiny Recursive Models
Project-URL: Repository, https://github.com/paul-english/tiny_ner
Author: Paul English
License: Apache-2.0
License-File: LICENSE
Keywords: bert,encoder,iterative-refinement,nlp,transformer
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.10
Requires-Dist: einops>=0.8.1
Requires-Dist: safetensors
Requires-Dist: sentencepiece>=0.2.1
Requires-Dist: tokenizers>=0.22.1
Requires-Dist: torch
Requires-Dist: transformers>=4.57.3
Description-Content-Type: text/markdown

# iterative-bert

An iterative refinement BERT encoder based on [Tiny Recursive Models](https://arxiv.org/html/2510.04871v1).

## Installation

```bash
pip install iterative-bert
```

## Usage

### Load a pre-trained encoder from the Hugging Face Hub

```python
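import torch

from iterative_bert.model import IterativeBert

# Load the encoder weights and config from the Hugging Face Hub
encoder = IterativeBert.from_pretrained("your-username/your-model")

# Example token IDs; in practice, use the tokenizer that matches the checkpoint
input_ids = torch.tensor([[101, 2054, 2003, 2023, 102]])
attention_mask = torch.ones_like(input_ids)  # no padding, so every position is attended to

# Run inference without tracking gradients
with torch.no_grad():
    outputs = encoder(input_ids, attention_mask=attention_mask)

hidden_states = outputs.last_hidden_state  # (batch, seq_len, hidden_size)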
```
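
The example above uses an all-ones attention mask, which is only correct when no sequence in the batch is padded. For batches of unequal length, the sketch below pads the token IDs and builds a matching mask in plain Python. `pad_batch` is a hypothetical helper, not part of this package, and `pad_id=0` is an assumption (check the pad token ID of your tokenizer). It also assumes the encoder follows the usual Hugging Face convention of 1 for real tokens and 0 for padding.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token ID lists to a common length and build the attention mask.

    Hypothetical helper for illustration; pad_id=0 is an assumption.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


batch = [[101, 2054, 2003, 2023, 102], [101, 2023, 102]]
input_ids, attention_mask = pad_batch(batch)
```

Both lists can then be wrapped with `torch.tensor(...)` and passed to the encoder in place of the single-sequence tensors.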

### Create a new encoder from a config

```python
from iterative_bert.model import IterativeBert, IterativeBertConfig

config = IterativeBertConfig(
    vocab_size=30522,
    hidden_size=768,
    num_hidden_layers=1,
    num_attention_heads=12,
    intermediate_size=3072,
    h_cycles=1,
    l_cycles=8,
    use_rope=True,
)

# Build the encoder from the config; weights are randomly initialized
encoder = IterativeBert(config)
```
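
The config above pairs a single hidden layer with several cycles, which suggests that depth comes from reusing the same weights rather than from stacking layers. The toy sketch below illustrates that general idea in plain Python: one fixed block is applied repeatedly, and each pass adds its output back onto the running state. The names `shared_block` and `refine` are hypothetical, the tanh block is only a stand-in for a transformer layer, and how this package actually combines `h_cycles` and `l_cycles` is not documented here.

```python
import math


def shared_block(x, weight=0.5, bias=0.1):
    """Toy stand-in for a transformer layer with one fixed set of parameters."""
    return [math.tanh(weight * v + bias) for v in x]


def refine(x, num_cycles):
    """Apply the same block num_cycles times with a residual connection.

    Illustration only; this is not the package's implementation.
    """
    for _ in range(num_cycles):
        update = shared_block(x)
        x = [v + u for v, u in zip(x, update)]  # residual: state + update
    return x


state = [0.2, -0.4, 1.0]
one_pass = refine(state, num_cycles=1)
eight_passes = refine(state, num_cycles=8)
```

Because every pass uses the same parameters, running more cycles increases compute but not the parameter count, and refining for 3 cycles and then 5 more gives the same result as 8 cycles in one call.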

## Features

- **Iterative Refinement**: Applies the same transformer layers multiple times with residual connections (see `h_cycles` and `l_cycles` in the config)
- **RoPE Support**: Rotary Position Embeddings for better length generalization
- **Flash Attention**: Optional Flash Attention 2/3 support for efficiency
- **Hugging Face Compatible**: Loads and saves checkpoints with `from_pretrained` and `save_pretrained`
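
To make the RoPE item concrete: rotary embeddings rotate pairs of query and key components by an angle proportional to the token position, so the attention score between two tokens depends on their offset rather than on their absolute positions. The plain-Python sketch below checks that property numerically. `rope_rotate` and `dot` are illustrative names, the base of 10000 and the interleaved pairing follow the original RoFormer formulation, and none of this is taken from the package's own implementation, which may lay out the pairs differently.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate consecutive pairs of vec by position-dependent angles.

    Sketch of the RoFormer formulation; not this package's code.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, dim, 2):
        # Pair i // 2 rotates with frequency base ** (-i / dim)
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# The same offset (3) at different absolute positions yields the same score
score_near = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_far = dot(rope_rotate(q, 105), rope_rotate(k, 102))
```

The two scores agree up to floating-point error, and the rotation leaves each vector's norm unchanged.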

## License

Apache 2.0
