Metadata-Version: 2.4
Name: beeroot
Version: 1.0.1
Summary: Stop babysitting your LLM API calls. Adaptive rate gate + multi-agent pipeline.
Home-page: https://github.com/carlex22/beeROOT
Author: carlex22
Keywords: llm,rate-limiting,backpressure,multi-agent,openai,groq,openrouter,workflow,yaml,batch-processing,distributed,pipeline
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.31.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: pandas>=2.0
Requires-Dist: GitPython>=3.1.41
Provides-Extra: groq
Requires-Dist: groq>=0.4.2; extra == "groq"
Provides-Extra: server
Requires-Dist: fastapi>=0.109.0; extra == "server"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "server"
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# 🐝 beeROOT

**Stop babysitting your LLM API calls.**

You're processing 10,000 documents. You fire off parallel requests.  
Then the 429s start. You add `sleep()`. It gets worse.  
You add a queue. Now you're maintaining infrastructure.

beeROOT handles all of that in 3 lines.

```python
from beeroot import Balancer, Flow

balancer = Balancer.from_yaml("config.yaml")   # adaptive rate gate
flow     = Flow.from_yaml("workflow.yaml",     # multi-agent pipeline
                           balancer=balancer)

result, stats = flow.run(my_document)          # just works
```

```bash
pip install beeroot
```

---

## The Problem

Every LLM batch job hits the same wall:

```
429: Rate limit exceeded
429: Rate limit exceeded
429: Rate limit exceeded
```

The usual fixes — sleep, retry, backoff — don't scale.  
They're static. The API is dynamic.

beeROOT listens to the API and adapts in real time.

---

## How It Works

```
    You send 1000 documents
           ↓
    ┌──────────────────────────────┐
    │          GATE                │  ← 1 call starts at a time
    │  sequential · adaptive delay │    delay = f(recent 429s)
    └──────────────┬───────────────┘
                   ↓ green light
    ┌──────────────────────────────┐
    │         PASTURE              │  ← unlimited concurrent calls
    │  all active calls go here    │    self-drains naturally
    └──────────────────────────────┘
                   ↓
         API gets steady flow
         zero 429s, full speed
```

**The gate** serializes *starts* — only one call begins at a time.  
**The pressure** tracks 429/498/timeout errors in a sliding window.  
**The delay** grows when errors accumulate, shrinks when calls succeed.

No rate limits to configure. No manual tuning. The API teaches beeROOT its own limits.

**Multiple instances self-balance** — run 10 workers on 10 machines,  
they all read the same API signals and converge automatically.  
No Redis. No shared state. No coordination layer.
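
The gate, pressure, and delay described above can be sketched in a few lines. This is a minimal illustration and not beeROOT's actual implementation: the class name `AdaptiveGate`, the linear delay formula, the `base_delay` and `window_s` parameters, and the rule that each success forgives one error are all assumptions made for the example.

```python
import threading
import time
from collections import deque


class AdaptiveGate:
    """Serializes call starts and spaces them by a pressure-based delay (sketch)."""

    def __init__(self, base_delay=0.05, window_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.base_delay = base_delay          # assumed minimum spacing between starts
        self.window_s = window_s              # assumed sliding-window length in seconds
        self._errors = deque()                # timestamps of recent 429/498/timeout errors
        self._start_lock = threading.Lock()   # held only while a call is starting
        self._state_lock = threading.Lock()   # protects the error window
        self._clock = clock
        self._sleep = sleep

    def pressure(self):
        """Number of errors still inside the sliding window."""
        with self._state_lock:
            now = self._clock()
            while self._errors and now - self._errors[0] > self.window_s:
                self._errors.popleft()
            return len(self._errors)

    def delay(self):
        # Assumed formula: the delay grows linearly with recent errors.
        return self.base_delay * (1 + self.pressure())

    def report_error(self):
        with self._state_lock:
            self._errors.append(self._clock())

    def report_success(self):
        # Assumed rule: each success forgives the oldest recorded error.
        with self._state_lock:
            if self._errors:
                self._errors.popleft()

    def acquire(self):
        """Block until this call may start; the call itself runs outside the gate."""
        with self._start_lock:
            self._sleep(self.delay())
```

A worker would call `gate.acquire()` before each request, then `report_error()` or `report_success()` depending on the response. Because the lock covers only the start, any number of calls can be in flight at once, which is the "pasture" in the diagram.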

---

## The Four Modules

Each module is **standalone**. Use one, use all, mix freely.

### 🎛️ Balancer — adaptive rate gate

```python
from beeroot import Balancer

balancer = Balancer.from_yaml("config.yaml")

document = "Text to summarize goes here."  # placeholder input

# Handles rate limiting automatically
result = balancer.call([
    {"role": "user", "content": "Summarize: " + document}
])
print(result.content)
```

The gate serializes call starts. The pasture runs them concurrently.  
No 429s. No manual tuning. Plug in your API key and go.

---

### 🔀 Flow — YAML-driven multi-agent pipeline

```python
from beeroot import Flow, Balancer

balancer = Balancer.from_yaml("config.yaml")
flow     = Flow.from_yaml("workflow.yaml", balancer=balancer)

result, stats = flow.run({"id": "doc-001", "text": my_document})
# stats.api_calls, stats.total_tokens, stats.loop_count
```

Your agent logic lives in YAML — not in Python.  
Change the workflow without touching code.

```yaml
workflow:
  tasks:
    - id: ANALYZE
      task_type: reasoning
      prompt: "Extract key entities and relationships."
      transitions:
        - target: SERIALIZE
          condition: "not phase_failed"
        - target: END_FAILURE

    - id: SERIALIZE
      task_type: final
      prompt: "Output as JSON."
      transitions:
        - target: END_SUCCESS
          condition: "json_valid"
        - target: END_FAILURE
```
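
To make the transition rules concrete, here is a rough sketch of how a list of `transitions` like the one above could be walked. It is not Flow's real engine: the functions `condition_holds`, `next_target`, and `walk` are hypothetical, the workflow is written as a plain dict instead of being loaded from YAML, and the condition grammar is assumed to be just `flag` or `not flag`.

```python
# The workflow above, reduced to ids and transitions (prompts omitted).
WORKFLOW = {
    "tasks": [
        {"id": "ANALYZE", "transitions": [
            {"target": "SERIALIZE", "condition": "not phase_failed"},
            {"target": "END_FAILURE"},
        ]},
        {"id": "SERIALIZE", "transitions": [
            {"target": "END_SUCCESS", "condition": "json_valid"},
            {"target": "END_FAILURE"},
        ]},
    ]
}


def condition_holds(condition, flags):
    """Evaluate 'flag' or 'not flag' against a dict of booleans (assumed grammar)."""
    if condition is None:
        return True  # a transition without a condition acts as the fallback
    words = condition.split()
    negate = len(words) == 2 and words[0] == "not"
    if len(words) != 1 and not negate:
        raise ValueError(f"unsupported condition: {condition!r}")
    value = bool(flags.get(words[-1], False))
    return not value if negate else value


def next_target(task, flags):
    """Return the target of the first transition whose condition holds."""
    for transition in task["transitions"]:
        if condition_holds(transition.get("condition"), flags):
            return transition["target"]
    raise RuntimeError(f"no transition matched for task {task['id']}")


def walk(workflow, flags_per_task):
    """Follow transitions from the first task until a target is not a task id."""
    tasks = {task["id"]: task for task in workflow["tasks"]}
    current = workflow["tasks"][0]["id"]
    path = [current]
    while current in tasks:
        current = next_target(tasks[current], flags_per_task.get(current, {}))
        path.append(current)
    return path
```

Transitions are tried in order, so the unconditional `END_FAILURE` entry only fires when the condition before it is false. Parsing the two-word grammar by hand avoids calling `eval` on strings that come from a config file.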

Flow also handles **loop detection**: if the model starts repeating itself,  
Flow downgrades reasoning effort and passes the prior output as context,  
so a repeating model does not stall the pipeline indefinitely.
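
One way to picture the loop detection described above is a guard that fingerprints each model output and steps down an effort ladder when the same output comes back. This is only a sketch under assumptions: `LoopGuard`, the `high`/`medium`/`low` ladder, and the `max_repeats` threshold are hypothetical names, and it covers detection and downgrade only, not how the prior output is passed back as context.

```python
import hashlib

EFFORT_LEVELS = ["high", "medium", "low"]  # hypothetical effort ladder


def fingerprint(text):
    """Hash case- and whitespace-normalized text so trivial differences still match."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class LoopGuard:
    """Counts repeated outputs and lowers reasoning effort when a loop is seen."""

    def __init__(self, max_repeats=2):
        self.max_repeats = max_repeats  # assumed threshold: 2nd identical output is a loop
        self.seen = {}                  # fingerprint -> number of occurrences
        self.effort_index = 0

    @property
    def effort(self):
        return EFFORT_LEVELS[self.effort_index]

    def observe(self, output):
        """Record an output; return True if it counts as a loop."""
        key = fingerprint(output)
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] < self.max_repeats:
            return False
        # Loop detected: step down one effort level, staying at the lowest.
        self.effort_index = min(self.effort_index + 1, len(EFFORT_LEVELS) - 1)
        return True
```

Exact-match hashing is the simplest possible detector; a real implementation might compare outputs more loosely, which this sketch does not attempt.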

---

### 📦 Chunks — Git-backed batch storage

```python
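from beeroot import Balancer, Chunks, Flow

balancer = Balancer.from_yaml("config.yaml")
flow     = Flow.from_yaml("workflow.yaml", balancer=balancer)
chunks   = Chunks.from_yaml("config.yaml")

for chunk_id, records in chunks.iter_pending():
    chunks.mark_started(chunk_id)
    # flow.run returns (result, stats); keep only the result
    results = [flow.run(r)[0] for r in records]
    chunks.write(results, chunk_id)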

chunks.stop()  # flushes Git pushes
```

Store your data in any Git repo (GitHub, HuggingFace Datasets).  
Input: `chunk_input_*.tar.gz` → Output: `chunk_output_*.tar.gz`.  
Writes are batched to respect push rate limits (HF: 30/min).
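
The batching idea can be illustrated with a small buffer that releases results no more often than a push budget allows. This is a sketch, not the Chunks internals: `PushBatcher` and its `push` callback are hypothetical, and the spacing rule (at least `60 / max_push_rpm` seconds between pushes) is an assumption about how such a limit could be honored.

```python
import time


class PushBatcher:
    """Buffers results and pushes them at most max_push_rpm times per minute (sketch)."""

    def __init__(self, push, max_push_rpm=25, clock=time.monotonic):
        self.push = push                          # callable that receives a list of results
        self.min_interval = 60.0 / max_push_rpm   # assumed minimum seconds between pushes
        self.clock = clock
        self.buffer = []
        self.last_push = None

    def write(self, results):
        """Buffer results; push only if enough time has passed since the last push."""
        self.buffer.extend(results)
        now = self.clock()
        if self.last_push is None or now - self.last_push >= self.min_interval:
            self.flush()

    def flush(self):
        """Push whatever is buffered right away, ignoring the interval."""
        if self.buffer:
            self.push(self.buffer)
            self.buffer = []
            self.last_push = self.clock()
```

Results written between pushes ride along with the next one, so the number of pushes stays bounded no matter how many small writes arrive. A final `flush()` plays the role that `chunks.stop()` plays in the example above.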

---

### 📁 Endpoint — local/git data I/O

```python
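from beeroot import Balancer, Endpoint, Flow

balancer = Balancer.from_yaml("config.yaml")
flow     = Flow.from_yaml("workflow.yaml", balancer=balancer)
ep       = Endpoint.from_yaml("config.yaml")

for batch in ep.iter_batches(size=100):
    # flow.run returns (result, stats); keep only the result
    results = [flow.run(r)[0] for r in batch]
    ep.write(results)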
```

Reads and writes `.jsonl`, `.json`, `.tar.gz`.  
Local filesystem or Git-backed. Zero processing logic.  
The clean boundary between your data and your pipeline.
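
For the `.jsonl` case, batched reading and appending can be approximated with the standard library alone. The functions below are a hedged stand-in for illustration, not Endpoint's code: `iter_batches` and `append_jsonl` are hypothetical free functions, and only local files are covered.

```python
import json
from itertools import islice


def iter_batches(path, size=100):
    """Yield lists of up to `size` records from a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        # Lazily parse one record per non-blank line.
        records = (json.loads(line) for line in handle if line.strip())
        while True:
            batch = list(islice(records, size))
            if not batch:
                return
            yield batch


def append_jsonl(path, records):
    """Append records to a JSON Lines file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Reading through a generator keeps memory use proportional to the batch size, not to the file size.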

---

## Real Numbers

Tested processing 500 legal documents through a 3-step LLM pipeline:

| Metric | Before beeROOT | With beeROOT |
|---|---|---|
| Workers | 598 | ~50 |
| 429/min | 50+ | **0** |
| Spawn pressure | 120s | ~0s |
| Throughput | chaotic | **528 docs/min** |

Same API quota. Roughly 12× fewer workers (598 → ~50). Zero rate limit errors.

---

## Install

```bash
pip install beeroot
```

Dependencies: `requests`, `pyyaml`, `pandas`, `GitPython`  
Optional extras: `groq`, `server` (for example `pip install "beeroot[groq]"`)  
Python 3.10+

---

## Config Reference

```yaml
# config.yaml

balancer:
  provider:     openrouter       # openrouter | groq
  api_key:      ${API_KEY}       # reads from env var
  model:        openai/gpt-4o
  delay_factor: 1.0              # stagger multiplier
  timeout:      300
  params:
    max_completion_tokens: 4000

chunks:
  git_token:    ${GIT_TOKEN}
  repo_slug:    your-org/your-data-repo
  input_dir:    chunks_input
  output_dir:   chunks_output
  max_push_rpm: 25               # HuggingFace: 30/min

endpoint:
  input:
    path: ./data/input.jsonl
  output:
    path: ./data/output.jsonl
    mode: append                 # append | overwrite
```

---

## Roadmap

Things we're building next — contributions welcome.

**v1.1 — Resilience**
- [ ] Retry policies configurable per task in workflow YAML
- [ ] Dead letter queue for permanently failed records
- [ ] Checkpoint/resume — pick up where you left off after a crash

**v1.2 — Providers**
- [ ] Anthropic Claude adapter
- [ ] OpenAI native adapter  
- [ ] Async support (`await balancer.acall(...)`)
- [ ] Streaming responses

**v1.3 — Observability**
- [ ] Prometheus metrics export
- [ ] OpenTelemetry tracing per document
- [ ] Real-time dashboard (steal from the pasture 🐄)

**v1.4 — Scale**
- [ ] Distributed gate via Redis (optional, for 100+ worker deployments)
- [ ] GitBatcher with smart deduplication
- [ ] HuggingFace Datasets native adapter (beyond tar.gz)

**v2.0 — YAML-first everything**
- [ ] Input schema validation in YAML
- [ ] Output schema validation in YAML  
- [ ] Multi-provider routing per task (route STEP1 to Groq, STEP2 to Claude)
- [ ] Cost estimation before running

---

## License

MIT — use it, fork it, build on it.

---

*Built to process 500,000+ documents without a single 429.*  
*The cows are in the pasture. 🐄*
