Metadata-Version: 2.4
Name: rpy-bridge
Version: 0.5.2
Summary: Python-to-R interoperability engine with environment management, type-safe conversions, data normalization, and safe R function execution.
Author-email: Victoria Cheung <victoriakcheung@gmail.com>
License: MIT License
        
        Copyright (c) 2025 Victoria Cheung
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
        Acknowledgement: This project builds on work originally developed at
        Revolution Medicines and interfaces with the rpy2 project, which is licensed
        under the GNU General Public License version 2 or later.
        
Project-URL: Homepage, https://github.com/vic-cheung/rpy-bridge
Project-URL: Issue Tracker, https://github.com/vic-cheung/rpy-bridge/issues
Keywords: python,r,rpy2,python-r,interoperability,data-science,statistics,bioinformatics
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: loguru>=0.7
Provides-Extra: r
Requires-Dist: rpy2>=3.5; extra == "r"
Provides-Extra: dev
Requires-Dist: ipykernel>=7.1.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx; extra == "docs"
Requires-Dist: myst-parser; extra == "docs"
Dynamic: license-file

# rpy-bridge

**rpy-bridge** is a Python-controlled **R execution orchestrator** (not a thin
rpy2 wrapper). It delivers deterministic, headless-safe R startup; project-root
inference; out-of-tree `renv` activation; isolated script namespaces; and robust
Python↔R conversions with dtype/NA normalization. Use it when you need
reproducible R execution from Python in production pipelines and CI.

**Latest release:** [`rpy-bridge` on PyPI](https://pypi.org/project/rpy-bridge/)

---

## What this is (and is not)

rpy-bridge **is not a thin rpy2 wrapper**. Key differences:

- Infers R project roots via markers (`.git`, `.Rproj`, `renv.lock`, `DESCRIPTION`, `.here`)
- Activates `renv` even when it lives outside the calling directory
- Executes from the inferred project root so relative paths behave as R expects
- Runs headless by default (no GUI probing), isolates scripts from `globalenv()`
- Normalizes return values for Python (NAs, dtypes, data.frames) and offers comparison helpers

---

## Quickstart

Call a package function (no scripts):

```python
from rpy_bridge import RFunctionCaller

rfc = RFunctionCaller()
samples = rfc.call("stats::rnorm", 5, mean=0, sd=1)
median_val = rfc.call("stats::median", samples)
```

Call a function from a local script with `renv` (out-of-tree allowed):

```python
from pathlib import Path
from rpy_bridge import RFunctionCaller

project_dir = Path("/path/to/your-r-project")
script = project_dir / "scripts" / "example.R"

rfc = RFunctionCaller(path_to_renv=project_dir, scripts=script)
result = rfc.call("some_function", 42, named_arg="value")
```

## Core capabilities

### 1. R execution orchestration

- Embeds R via `rpy2` with deterministic startup behavior
- Disables interactive and GUI-dependent hooks for headless execution
- Loads R scripts into isolated namespaces (not `globalenv()`)

### 2. Project root inference and path stability

- Infers R project roots using markers such as:
  `.git`, `.Rproj`, `renv.lock`, `DESCRIPTION`, `.here`
- Executes R code from the inferred project root regardless of Python CWD
- Preserves relative-path behavior expected by R scripts
- Supports R code using `here::here()` or project-local data

### 3. Out-of-tree `renv` activation

- Activates `renv` projects located **outside** the calling Python directory
- Sources `.Rprofile` and `.Renviron` to reproduce R startup semantics
- Does not require R scripts and `renv` to live in the same directory

### 4. Python ↔ R data conversion

- Converts Python scalars, lists, dicts, and pandas objects into R equivalents
- Converts R vectors, lists, and data.frames back into Python-native types
- Handles nested structures, missing values, and mixed types robustly

### 5. Data normalization and diagnostics

- Post-processes R data.frames to fix dtype, timezone, and NA semantics
- Normalizes column types for reliable Python-side comparison
- Supports structured mismatch diagnostics between Python and R data

### 6. Function invocation across scripts and packages

- Calls functions defined in sourced R scripts, base R, or installed packages
- Supports qualified function names (e.g. `stats::median`)
- Executes functions within the active project and library context

---

## Calling base R functions and managing packages

In addition to sourcing local R scripts, rpy-bridge supports calling functions
from base R and installed packages directly from Python.

Current support includes:

- Calling base R functions without a local R script
- Executing functions from installed R packages within the active environment

Planned extensions (roadmap):

- Programmatic installation of R packages into the active `renv` or system
  environment when explicitly enabled
- Declarative package requirements at the Python call site
- Safe, opt-in package installation for CI and ephemeral environments

Package installation is intentionally **not automatic by default** to preserve
reproducibility and avoid side effects during execution.

---

## Installation

### Prerequisites

- System R installed and available on `PATH`
- Python 3.11+ (tested on 3.11–3.12)

### From PyPI

Install rpy-bridge with rpy2 for full R support:

```bash
python3 -m pip install rpy-bridge rpy2
```

Using `uv`:

```bash
uv add rpy-bridge rpy2
```

### Development install

```bash
python3 -m pip install -e .
```

or:

```bash
uv sync
```

### Required Python dependencies

- `rpy2`
- `pandas`
- `numpy`

---

## Usage

See Quickstart above and examples in `examples/basic_usage.py`.

---

## Round-trip Python ↔ R behavior

rpy-bridge attempts to convert Python objects to R and back. Most objects used in
scientific and ML pipelines round-trip cleanly, but some heterogeneous Python
structures may be wrapped or slightly altered due to differences in R’s type
system.

| Python type                                    | Round-trip fidelity | Notes                                                                 |
| ---------------------------------------------- | ------------------- | --------------------------------------------------------------------- |
| `int`, `float`, `bool`, `str`                  | High                | Scalars convert directly                                              |
| Homogeneous `list` of numbers/strings          | High                | Converted to atomic R vectors                                         |
| Nested homogeneous lists                       | High                | Converted to nested R lists                                           |
| `pandas.DataFrame` / `pd.Series`               | High                | Converted to `data.frame` and normalized on return                    |
| Mixed-type `list` or `dict`                    | Partial             | May be wrapped in single-element vectors                              |
| `None` / `pd.NA`                               | High                | Converted to R `NULL`                                                 |

---

## R setup helpers

Helper scripts are provided in `examples/r-deps/` to prepare R environments.

- Install system R dependencies (macOS / Homebrew):

```bash
bash examples/r-deps/install_r_dev_deps_homebrew.sh
```

- Initialize an `renv` project:

```r
source("examples/r-deps/setup_env.R")
```

- Restore the environment on a new machine:

```r
renv::restore()
```

---

## Who this is for

rpy-bridge is designed for:

- Python-first pipelines that rely on mature R code
- Teams where R logic must remain authoritative
- CI or production systems that cannot rely on interactive R sessions
- Multi-repo or multi-directory projects with non-trivial filesystem layouts

It is **not** intended as a convenience wrapper for exploratory R usage.

---

## Licensing

- rpy-bridge is released under the MIT License © 2025 Victoria Cheung
- Depends on [`rpy2`](https://rpy2.github.io), licensed under the GNU GPL (v2 or later)

---

## Acknowledgements

This package was spun out of internal tooling I wrote at Revolution Medicines.
Thanks to the team there for supporting its open-source release.
