Metadata-Version: 2.4
Name: universeos
Version: 0.1.3
Summary: High-Performance Traffic Shadowing & Experimentation Platform
Home-page: https://github.com/universeos/universeos
Author: UniverseOS Team
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.25.0
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# UniverseOS
**High-Performance Traffic Shadowing & Experimentation Platform**

> **Status: Alpha** – Suitable for staging / non-critical use.
> **Ideal Users:** Infra / Platform teams who want to shadow new models or prompts on real traffic.

UniverseOS is a specialized infrastructure layer for safe, real-time experimentation with production traffic. It is currently optimized for Large Language Models (LLMs), but its core is a generic, high-concurrency shadowing gateway. This enables "Parallel Universe" testing: engineers can validate new models, prompts, or configurations against live production traffic, while shadow calls stay off the primary response path so that they are designed not to affect end-user latency or reliability.

## Core Architecture

UniverseOS is built on a split-plane architecture designed for scale and resilience:

*   **Data Plane (C++ Gateway)**: A high-performance, non-blocking reverse proxy written in C++. It utilizes asynchronous I/O to handle high throughput with minimal overhead. The Gateway intercepts incoming requests and mirrors them to shadow backends asynchronously, ensuring the primary response path remains unaffected.
*   **Control Plane (Registry & Policy)**: A dynamic service discovery and routing engine. It allows for hot-swapping of shadow models and granular traffic routing policies (e.g., "shadow 10% of traffic to Model B") without restarting the gateway.
*   **Observability Plane**: A dedicated metrics ingestion pipeline that captures latency, throughput, and model-specific telemetry (e.g., token usage) for side-by-side performance comparison.
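
A routing policy such as "shadow 10% of traffic to Model B" comes down to a per-request sampling decision. The sketch below shows one way such a decision could be made; it is an illustration in Python, not the Gateway's C++ implementation. The function names `should_shadow` and `shadow_fraction`, and the choice of hashing a request ID, are assumptions made for this example.

```python
import hashlib


def should_shadow(request_id: str, sample_rate: float) -> bool:
    """Decide deterministically whether a request is mirrored (hypothetical helper).

    Hashing the request ID instead of drawing a random number means the
    same request always gets the same decision, which helps when
    replaying logs or debugging a single request.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Keep the top 53 bits of the hash so the result is an exact float in [0, 1).
    bucket = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return bucket < sample_rate


def shadow_fraction(sample_rate: float, n: int = 10_000) -> float:
    """Measure the fraction of n synthetic request IDs that would be shadowed."""
    hits = sum(should_shadow(f"req-{i}", sample_rate) for i in range(n))
    return hits / n


if __name__ == "__main__":
    # Should land close to 0.1 for a 10% policy.
    print(shadow_fraction(0.1))
```

With this approach a rate of `1.0` mirrors every request and `0.0` mirrors none, and changing the rate does not require any per-request state.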

## Key Features

*   **Zero-Latency Shadowing**: The shadowing mechanism is decoupled from the primary request path. Shadow responses are processed out-of-band, so shadow calls are designed to add no latency for the end user.
*   **Streaming Support**: For streaming responses (e.g., LLM tokens), the Gateway forwards the primary stream to the client in real time while consuming the shadow stream in the background and logging it, without impacting user latency.
*   **Language Agnostic Integration**: Designed as a network-level infrastructure component. While a Python SDK is provided for convenience, the system works with any client capable of making HTTP requests.
*   **Production Safety**: Failures in shadow models are isolated. If a shadow model crashes or times out, the primary request completes successfully, and the error is logged for analysis.
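
The streaming and isolation behaviour described above can be illustrated with a small in-process sketch: the primary stream is forwarded chunk by chunk while a background thread drains the shadow stream and records either its chunks or its failure. This is a conceptual Python illustration only; the Gateway itself is written in C++, and the names `drain_in_background` and `stream_with_shadow` are hypothetical. Timeouts are not handled here.

```python
import threading
from typing import Callable, Iterable, Iterator, List


def drain_in_background(
    shadow_factory: Callable[[], Iterable[str]],
    shadow_log: List[str],
) -> threading.Thread:
    """Consume a shadow stream on a daemon thread and log what it produced."""

    def drain() -> None:
        try:
            for chunk in shadow_factory():
                shadow_log.append(chunk)
        except Exception as exc:
            # A shadow failure is recorded, never propagated to the client.
            shadow_log.append(f"shadow error: {exc!r}")

    worker = threading.Thread(target=drain, daemon=True)
    worker.start()
    return worker


def stream_with_shadow(
    primary: Iterable[str],
    shadow_factory: Callable[[], Iterable[str]],
    shadow_log: List[str],
) -> Iterator[str]:
    """Yield primary chunks as they arrive; the shadow runs out-of-band."""
    drain_in_background(shadow_factory, shadow_log)
    # The primary stream is forwarded without waiting for the shadow worker.
    yield from primary


if __name__ == "__main__":
    def broken_shadow() -> Iterator[str]:
        yield "partial"
        raise RuntimeError("shadow crashed")

    log: List[str] = []
    # The client still receives the full primary stream.
    print(list(stream_with_shadow(["Hel", "lo"], broken_shadow, log)))
```

The design point is that the only shared state between the two paths is the log, so an exception in the shadow path has no way to reach the code that serves the client.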

## Quickstart

### 1. Deploy the Control Plane
The core infrastructure runs as a set of containerized services.

```bash
docker-compose up -d
```

### 2. Configure Your Universes
Create a `universe.yaml` file to define your primary and shadow models:

```yaml
primary:
  id: "openai-gpt4"
  provider: "openai"
  model: "gpt-4"

shadow_universes:
  - id: "internal-v2"
    provider: "rest"
    endpoint: "http://internal-llm-v2:8000/chat"

policy:
  sample_rate: 1.0
```
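
Before deploying a configuration it can help to sanity-check it. The sketch below validates a config that has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). The rules it enforces are assumptions inferred from the example above, not a documented UniverseOS schema: a primary with an `id`, unique shadow IDs, an http(s) `endpoint` for `rest` providers, and a `sample_rate` between 0.0 and 1.0 that defaults to 1.0 when omitted. The function name `validate_universe_config` is hypothetical.

```python
from typing import Any, Dict, List

# The YAML example above, written as the dictionary a YAML parser would produce.
EXAMPLE_CONFIG: Dict[str, Any] = {
    "primary": {"id": "openai-gpt4", "provider": "openai", "model": "gpt-4"},
    "shadow_universes": [
        {"id": "internal-v2", "provider": "rest",
         "endpoint": "http://internal-llm-v2:8000/chat"},
    ],
    "policy": {"sample_rate": 1.0},
}


def validate_universe_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problems were found."""
    problems: List[str] = []

    primary = config.get("primary")
    if not isinstance(primary, dict) or not primary.get("id"):
        problems.append("primary: expected a mapping with a non-empty 'id'")

    shadows = config.get("shadow_universes") or []
    if not isinstance(shadows, list):
        problems.append("shadow_universes: expected a list")
        shadows = []
    seen_ids = set()
    for index, shadow in enumerate(shadows):
        where = f"shadow_universes[{index}]"
        if not isinstance(shadow, dict) or not shadow.get("id"):
            problems.append(f"{where}: expected a mapping with a non-empty 'id'")
            continue
        if shadow["id"] in seen_ids:
            problems.append(f"{where}: duplicate id {shadow['id']!r}")
        seen_ids.add(shadow["id"])
        endpoint = str(shadow.get("endpoint", ""))
        if shadow.get("provider") == "rest" and not endpoint.startswith(("http://", "https://")):
            problems.append(f"{where}: 'rest' provider needs an http(s) endpoint")

    policy = config.get("policy") or {}
    rate = policy.get("sample_rate", 1.0) if isinstance(policy, dict) else None
    is_number = isinstance(rate, (int, float)) and not isinstance(rate, bool)
    if not is_number or not 0.0 <= rate <= 1.0:
        problems.append("policy.sample_rate: expected a number between 0.0 and 1.0")

    return problems


if __name__ == "__main__":
    print(validate_universe_config(EXAMPLE_CONFIG))
```

Returning a list of messages rather than raising on the first error lets a deployment script report every problem in one pass.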

### 3. Install the Python SDK
For Python applications, the SDK provides a seamless integration point.

```bash
pip install universeos
```

### 4. Integration
Wrap your existing API calls to enable automatic shadowing. The SDK handles the communication with the Gateway.

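```python
from openai import OpenAI  # assumption: production_client is an OpenAI v1 client
from universeos import universe_shadow, init_universe

# Assumed setup: production_client stands for your existing production client.
# OpenAI() reads OPENAI_API_KEY from the environment.
production_client = OpenAI()

# Initialize connection to the sidecar/gateway
init_universe()

@universe_shadow
def generate_response(prompt):
    # With this decorator in place, every call to generate_response()
    # is mirrored to your configured shadow universes via UniverseOS.
    return production_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
```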

> See `examples/fastapi_demo/` on the [homepage repo](https://github.com/universeos/universeos) for a full end-to-end example (docker-compose + OpenAI GPT-4 + shadow model).

## Technical Specifications

*   **Gateway**: C++17, Asynchronous Socket I/O (poll/epoll)
*   **Protocol**: HTTP/1.1 (Streaming Support)
*   **Configuration**: Dynamic YAML-based policy loading
*   **SDK**: Python 3.7+ (Thread-safe, minimal dependencies)

## Building from Source

To build the high-performance Gateway and Control Plane services from source, you need CMake and a C++17-capable compiler:

```bash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j4
```
