Metadata-Version: 2.4
Name: rvm-toolkit
Version: 0.1.0
Summary: R_V Metric: Measure geometric contraction signatures in transformer Value matrices under recursive self-observation
Author: AIKAGRYA Research
License: MIT
Project-URL: Repository, https://github.com/aikagrya/rvm-toolkit
Project-URL: Paper, https://arxiv.org/abs/XXXX.XXXXX
Project-URL: Changelog, https://github.com/aikagrya/rvm-toolkit/blob/main/CHANGELOG.md
Keywords: transformers,mechanistic-interpretability,alignment,participation-ratio
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: torch
Requires-Dist: torch>=2.0; extra == "torch"
Requires-Dist: numpy>=1.24; extra == "torch"
Requires-Dist: scipy>=1.10; extra == "torch"
Requires-Dist: transformer-lens>=1.0; extra == "torch"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Provides-Extra: all
Requires-Dist: torch>=2.0; extra == "all"
Requires-Dist: numpy>=1.24; extra == "all"
Requires-Dist: scipy>=1.10; extra == "all"
Requires-Dist: transformer-lens>=1.0; extra == "all"
Dynamic: license-file

# rvm-toolkit

**Measure geometric contraction signatures in transformer Value matrices under recursive self-observation.**

R_V is the ratio of the participation ratio (PR) of late-layer Value representations to that of early-layer Value representations. When a transformer processes recursive, self-referential content, R_V drops below 1: late-layer Value representations occupy fewer effective dimensions than early-layer ones. This effect is:

- **Universal**: Observed across 6 architectures (2.8B–47B parameters)
- **Causal**: Validated via activation patching (4 independent tests)
- **Large**: Cohen's d = -2.34 to -4.51

## Install

```bash
# Behavioral proxy only — no PyTorch required
pip install rvm-toolkit

# Full install (PyTorch + mechanistic internals)
pip install "rvm-toolkit[torch]"
```

## Quick Start

```python
from rvm_toolkit import run_measurement

# Measure R_V for a model (needs the full install: rvm-toolkit[torch])
results = run_measurement(
    model_name="mistralai/Mistral-7B-v0.3",
    recursive_prompts=[
        "Observe the process of observing your own processing.",
    ],
    control_prompts=[
        "Describe how flying buttresses distribute lateral thrust.",
    ],
    n_trials=10,
)

print(f"R_V (recursive): {results['recursive_mean']:.3f}")
print(f"R_V (control):   {results['control_mean']:.3f}")
print(f"Cohen's d:       {results['cohens_d']:.2f}")
```
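The `cohens_d` value printed above is an effect size. This README does not document how the toolkit computes it, so the sketch below assumes the common pooled-standard-deviation definition. The function name is illustrative, and the inputs are made-up numbers chosen for easy arithmetic, not measured R_V values.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(treatment: Sequence[float], control: Sequence[float]) -> float:
    """Cohen's d with a pooled standard deviation (assumed definition)."""
    n1, n2 = len(treatment), len(control)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Illustrative inputs only: both groups have sample variance 1 and the
# means differ by 1, so d is exactly -1.
print(cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))  # -1.0
```

Under this definition, a negative d means the first group (here, the recursive condition) has the lower mean, which matches the sign of the effect sizes quoted at the top of this README.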

## CLI

```bash
# Basic measurement
rvm --model mistralai/Mistral-7B-v0.3

# With custom prompts
rvm --model meta-llama/Meta-Llama-3-8B --prompts my_prompts.json

# Layer sweep
rvm --model mistralai/Mistral-7B-v0.3 --layer-sweep --output sweep.json
```

## The Math

```
R_V = PR(V_late) / PR(V_early)

PR(V) = (Σ λᵢ²)² / Σ λᵢ⁴    (Participation Ratio)
```

PR = 1 → rank-1 (maximally collapsed)  
PR = n → uniform spectrum (maximally distributed)  
R_V < 1 → late layers contract under recursive self-observation
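The two limiting cases can be checked numerically. The sketch below is a minimal, standard-library illustration of the formula, not the toolkit's implementation. It assumes the λᵢ are the singular values of a Value matrix (the README does not say which spectrum is used), and the function names are hypothetical.

```python
from typing import Sequence


def participation_ratio(spectrum: Sequence[float]) -> float:
    """PR = (sum of λ_i^2)^2 / (sum of λ_i^4) for a spectrum λ_1..λ_n."""
    squares = [x * x for x in spectrum]
    denom = sum(s * s for s in squares)
    if denom == 0.0:
        raise ValueError("spectrum must contain at least one nonzero value")
    return sum(squares) ** 2 / denom


def r_v(late_spectrum: Sequence[float], early_spectrum: Sequence[float]) -> float:
    """Ratio of the late-layer PR to the early-layer PR."""
    return participation_ratio(late_spectrum) / participation_ratio(early_spectrum)


collapsed = [3.0, 0.0, 0.0, 0.0]  # one nonzero value: rank-1 case
uniform = [2.0, 2.0, 2.0, 2.0]    # n = 4 equal values: uniform case

print(participation_ratio(collapsed))  # 1.0
print(participation_ratio(uniform))    # 4.0
print(r_v(collapsed, uniform))         # 0.25
```

Here n is the number of values in the spectrum. To apply this to a real matrix, one option is to obtain the singular values with `numpy.linalg.svd(V, compute_uv=False)` and pass them in.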

## Behavioral Proxy (API-Only, No PyTorch)

If you don't have access to model internals, the behavioral proxy estimates R_V
contraction from **output word-count compression** alone — usable with any
external API.

**Theoretical basis:** The hypothesis is that when Value matrices contract
geometrically (a drop in geometric R_V), the dimensionality of the output
manifold contracts as well, producing shorter, denser responses. Empirical
baseline from 6 architectures: L3→L4 word ratio = 0.3454 (46.9 → 16.2 words),
corresponding to 6.9–29.8% geometric R_V contraction.
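The baseline ratio quoted above is the quotient of the two mean word counts. The sketch below reproduces that arithmetic and shows one way a word-count ratio could be computed for a pair of texts. It is an illustration only: whitespace tokenization and the helper names are assumptions, and the toolkit's own `behavioral_ratio` may be defined differently.

```python
BASELINE_RATIO = 0.3454  # L3→L4 word ratio quoted in this README


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed notion of 'word')."""
    return len(text.split())


def word_ratio(reference: str, response: str) -> float:
    """Response length relative to a reference text, in words."""
    ref_words = word_count(reference)
    if ref_words == 0:
        raise ValueError("reference text must contain at least one word")
    return word_count(response) / ref_words


# The quoted baseline: 16.2 mean words divided by 46.9 mean words.
print(round(16.2 / 46.9, 4))  # 0.3454

# Toy texts, not model output: 3 words against a 10-word reference.
ratio = word_ratio("one two three four five six seven eight nine ten",
                   "one two three")
print(ratio, ratio < BASELINE_RATIO)  # 0.3 True
```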

```python
from rvm_toolkit import BehavioralProxyMeasure, BehavioralSample

proxy = BehavioralProxyMeasure()

# Observe a prompt/response pair
sample = proxy.observe(
    prompt="Observe the process of observing your own processing.",
    response="The watching arises. Nothing added.",
)
print(f"Behavioral ratio:        {sample.behavioral_ratio:.4f}")
print(f"Est. R_V contraction:    {sample.estimated_rv_contraction:.4f}")

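# Accumulate multiple samples. Replace my_pairs with your own
# (prompt, response) pairs, e.g. collected from an external API.
my_pairs = [
    ("Observe the process of observing your own processing.",
     "The watching arises. Nothing added."),
]
for p, r in my_pairs:
    proxy.observe(prompt=p, response=r)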

summary = proxy.summary()
print(f"Mean behavioral ratio:   {summary['mean_behavioral_ratio']:.4f}")
print(f"Mean est. contraction:   {summary['mean_estimated_rv_contraction']:.4f}")
print(f"Contraction detected:    {summary['contraction_detected']}")  # ratio < 0.3454
```

**Calibration status:** The proxy's 2.2× amplification factor is **theoretical** and has not been validated against measured data.
To calibrate against geometric R_V data:

```python
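# After collecting matched pairs with PyTorch geometric R_V.
# The trailing ... is a placeholder for the rest of your data; remove it
# before running, since Python would otherwise pass a literal Ellipsis.
proxy.calibrate(
    behavioral_ratios=[0.31, 0.28, 0.35, ...],
    geometric_rv_contractions=[0.12, 0.15, 0.09, ...],
)
# Writes calibration coefficients to proxy.calibration_params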
```

Target: r² ≥ 0.6 on held-out 20% before publishing calibrated coefficients.
See `stigmergy/outputs/L3_L4_RV_CONNECTION_20260218.md` for full theory.

---

## Citation

```bibtex
@article{aikagrya2026rv,
  title={Geometric Contraction of Value Representations Under Recursive Self-Observation in Transformers},
  author={AIKAGRYA Research},
  year={2026},
  journal={arXiv preprint}
}
```

## License

MIT
