Metadata-Version: 2.4
Name: krnel-graph
Version: 0.1.7
Summary: Lightweight dataflow library for mechanistic interpretability.
Author-email: Kimberly Wilber <kimmy+pypispam@krnel.ai>, Peyman Faratin <peyman+pypispam@krnel.ai>
License-Expression: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fsspec>=2024.0.0
Requires-Dist: httpx>=0.20
Requires-Dist: numpy>=1.26
Requires-Dist: platformdirs>=4.0
Requires-Dist: pyarrow>=15.0
Requires-Dist: pydantic>=2.11.0
Requires-Dist: pydantic-settings>=2.5.0
Requires-Dist: structlog>=23.0.0
Requires-Dist: tqdm>=4
Requires-Dist: scikit-learn>=1.7.1
Requires-Dist: rich>=13.7
Requires-Dist: cyclopts>=3.22.5
Requires-Dist: humanize>=4.0.0
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-cov>=4.0.0; extra == "test"
Requires-Dist: ruff>=0.13.1; extra == "test"
Provides-Extra: docs
Requires-Dist: sphinx>=7.1.0; extra == "docs"
Requires-Dist: sphinx-autoapi>=3.0.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=2.0.0; extra == "docs"
Requires-Dist: myst-parser>=2.0.0; extra == "docs"
Requires-Dist: sphinx-autobuild>=2024.0.0; extra == "docs"
Requires-Dist: furo>=2025.7.19; extra == "docs"
Requires-Dist: autodoc-pydantic>=2.2.0; extra == "docs"
Requires-Dist: sphinxcontrib-mermaid>=0.9.2; extra == "docs"
Requires-Dist: autoclasstoc>=1.7.0; extra == "docs"
Requires-Dist: sphinx-autodoc-annotation>=1.0.post1; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints>=2.3.0; extra == "docs"
Provides-Extra: ml
Requires-Dist: torch==2.7.0; platform_system == "Darwin" and extra == "ml"
Requires-Dist: torch>=2.5.0; platform_system == "Linux" and extra == "ml"
Requires-Dist: torch>=2.5.0; platform_system == "Windows" and extra == "ml"
Requires-Dist: datasets>=3.0.0; extra == "ml"
Requires-Dist: transformer-lens>=2.15.0; extra == "ml"
Requires-Dist: transformers>=4.55.0; extra == "ml"
Requires-Dist: fsspec[gcs]>=2023.1.0; extra == "ml"
Requires-Dist: sentence-transformers>=5.1.0; extra == "ml"
Provides-Extra: viz
Requires-Dist: numba>=0.61.2; extra == "viz"
Requires-Dist: jupyter-scatter>=0.22.0; extra == "viz"
Requires-Dist: umap-learn>=0.5.8; extra == "viz"
Requires-Dist: seaborn>=0.13.2; extra == "viz"
Requires-Dist: hiplot>=0.1.33; extra == "viz"
Dynamic: license-file

# Krnel-graph
### [Docs](https://krnel-graph.readthedocs.io/en/latest/) • [Examples](https://github.com/krnel-ai/krnel-graph/tree/main/examples) • [GitHub](https://github.com/krnel-ai/krnel-graph) • [PyPI](https://pypi.org/project/krnel-graph/)

A **Python toolbox for mechanistic interpretability research** built on a **lightweight strongly-typed computation graph spec.**

- **Run language models** using HuggingFace Transformers, TransformerLens, Ollama, *etc.,* and save activations from the residual stream
- **Train linear probes** from cached activations and evaluate their results
- **Fetch logit scores** for guardrail models
- Load and prepare datasets

### Applications

- **Build better guardrails** using linear probes that understand model internals
- **Explore large datasets** grouped by semantic similarity
- **Visualize high-dimensional embeddings** with built-in UMAP scatterplots
- Evaluate derivative experiments quickly with **full caching and provenance tracking** of results
- **Infrastructure-agnostic**: Run in a notebook, on your GPU machine's CLI, or via the task orchestration framework of your choice!

![Krnel-graph figure](https://raw.githubusercontent.com/krnel-ai/krnel-graph/main/docs/_static/krnel-graph-hero.webp)

## Quick start

Krnel-graph works on the following platforms:

- macOS (arm64, MPS, Apple M1 or better)
- Linux (amd64, CUDA)
- Windows native (amd64, CUDA)
- Windows WSL2 (amd64, CUDA)

Install from PyPI with uv:

```bash
$ uv add "krnel-graph[ml]"

# (Optional) Configure where Runner() saves results
# Defaults to /tmp
$ uv run krnel-graph config --store-uri /tmp/krnel/
# s3://, gs://, or any other fsspec URL is supported
```

Create `main.py` with the following definitions:

```python
from krnel.graph import Runner
runner = Runner()

# Load data
ds_train   = runner.from_parquet('data_train.parquet')
col_prompt = ds_train.col_text("prompt")
col_label  = ds_train.col_categorical("label")

# Get activations from a small model
X_train = col_prompt.llm_layer_activations(
    model="hf:gpt2",
    layer=-1,
)

# Train a probe on contrastive examples
train_positives = col_label.is_in({"positive_label_1", "positive_label_2"})
train_negatives = ~train_positives
probe = X_train.train_classifier(
    positives=train_positives,
    negatives=train_negatives,
)

# Get test activations by substituting training set with testing set
# (no need to repeat the entire graph)
ds_test = runner.from_parquet('data_test.parquet')
X_test = X_train.subs((ds_train, ds_test))

test_scores = probe.predict(X_test)
eval_result = test_scores.evaluate(
    gt_positives=train_positives.subs((ds_train, ds_test)),
    gt_negatives=train_negatives.subs((ds_train, ds_test)),
)

if __name__ == "__main__":
    # All operations are lazily evaluated until materialized:
    print(runner.to_json(eval_result))
```
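
The example above relies on two ideas: operations are only descriptions until they are materialized, and `subs` rebuilds a graph with one input swapped for another. The toy sketch below illustrates both using only the standard library. It is not krnel-graph's implementation: `Op`, `Load`, `Column`, `Upper`, `FAKE_FILES`, and `materialize` are hypothetical names, and this toy `subs` takes two arguments rather than the tuple used above.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Op:
    """A node in a toy lazy graph: it only describes work, it does not run it."""

    def subs(self, old: "Op", new: "Op") -> "Op":
        # Swap this node if it matches; otherwise rebuild it with substituted parents.
        if self == old:
            return new
        changes = {
            f.name: getattr(self, f.name).subs(old, new)
            for f in fields(self)
            if isinstance(getattr(self, f.name), Op)
        }
        return replace(self, **changes) if changes else self


@dataclass(frozen=True)
class Load(Op):
    path: str


@dataclass(frozen=True)
class Column(Op):
    source: Op
    name: str


@dataclass(frozen=True)
class Upper(Op):
    column: Op


# Hypothetical in-memory stand-in for files on disk.
FAKE_FILES = {
    "train": [{"prompt": "hello"}, {"prompt": "world"}],
    "test": [{"prompt": "bye"}],
}


def materialize(op: Op) -> list:
    """Walk the graph and actually compute the result."""
    if isinstance(op, Load):
        return FAKE_FILES[op.path]
    if isinstance(op, Column):
        return [row[op.name] for row in materialize(op.source)]
    if isinstance(op, Upper):
        return [text.upper() for text in materialize(op.column)]
    raise TypeError(f"unknown operation: {op!r}")


train, test = Load("train"), Load("test")
upper_train = Upper(Column(train, "prompt"))   # nothing is computed yet
upper_test = upper_train.subs(train, test)     # same graph, different input
print(materialize(upper_train))
print(materialize(upper_test))
```

Because the nodes are immutable values, substitution returns a new graph and leaves the training graph untouched, which is why the test pipeline does not need to be written out a second time.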

Then, inspect the results in a notebook:

```python
from main import runner, eval_result, X_train

# Materialize everything and print result:
print(runner.to_json(eval_result))

# Display activations of training set (GPU-intensive operation)
print(runner.to_numpy(X_train))
```

Or use the (completely optional) `krnel-graph` CLI to materialize a selection of operations and/or monitor progress:

```shell
# Run parts of the graph
$ uv run krnel-graph run -f main.py -t LLMLayerActivations   # By operation type
$ uv run krnel-graph run -f main.py -s X_train               # By Python variable name

# Show status
$ uv run krnel-graph summary -f main.py

# Diff the pseudocode of two graph operations
$ uv run krnel-graph print -f main.py -s X_train > /tmp/train.txt
$ uv run krnel-graph print -f main.py -s X_test > /tmp/test.txt
$ git diff --no-index /tmp/train.txt /tmp/test.txt
```
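
One common way to get the caching and provenance tracking listed under Applications is to key each stored result by a hash of the operation's definition and the keys of its inputs. The sketch below shows that general technique with the standard library only. It is an assumption about the approach, not a description of krnel-graph's internals: `op_key` and `ResultStore` are hypothetical names.

```python
import hashlib
import json
from typing import Any, Callable


def op_key(op_type: str, params: dict, parent_keys: list[str]) -> str:
    """Derive a stable key from an operation's type, parameters, and parents."""
    payload = json.dumps(
        {"type": op_type, "params": params, "parents": parent_keys},
        sort_keys=True,  # parameter order must not change the key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


class ResultStore:
    """In-memory stand-in for a result store; computes each key at most once."""

    def __init__(self) -> None:
        self.results: dict[str, Any] = {}
        self.computed = 0

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        if key not in self.results:
            self.results[key] = compute()
            self.computed += 1
        return self.results[key]


data_key = op_key("Load", {"path": "data_train.parquet"}, [])
acts_key = op_key("Activations", {"model": "hf:gpt2", "layer": -1}, [data_key])
# Changing any parameter (or any upstream key) yields a different key.
other_key = op_key("Activations", {"model": "hf:gpt2", "layer": -2}, [data_key])

store = ResultStore()
store.get_or_compute(acts_key, lambda: "expensive result")
store.get_or_compute(acts_key, lambda: "never recomputed")
print(acts_key != other_key, store.computed)
```

Since each key includes its parents' keys, a key also records which upstream definitions produced a result, so a change to any input invalidates everything downstream of it.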

