Metadata-Version: 2.4
Name: dockyter
Version: 0.2.2
Summary: Run heavy tools in Docker or via an API from Jupyter notebooks
Author-email: Simon Masserey <simasserey@gmail.com>
License-Expression: MIT
Classifier: Framework :: IPython
Classifier: Framework :: Jupyter
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: ipython>=9.8.0
Requires-Dist: requests>=2.32.5
Dynamic: license-file

# Dockyter

Dockyter is an IPython extension that adds:

- a `%%docker` **cell magic** to run whole cells inside Docker containers,
- an optional `!` **shell redirection** so that `!cmd` runs inside Docker,
- and a **pluggable backend** system:
  - **local Docker daemon** (default),
  - or a **remote HTTP API** backend.

The goal is to run heavy CLI tools packaged as Docker images from notebooks,
while keeping the base Python environment light and reproducible.

Typical use cases:

- running ML frameworks, data validation tools, or internal CLIs from Docker images,
- keeping notebook kernels small and simple,
- using the same Dockerised tools across local Jupyter, JupyterHub, and Binder-like deployments,
- delegating container execution to a remote HTTP API instead of the local Docker daemon.

---

## Installation

```bash
pip install dockyter
```

Then in a notebook:

```python
%load_ext dockyter
```

By default, Dockyter expects:

* a working `docker` CLI on `PATH`, and
* access to a Docker-compatible daemon (or rootless runtime)

in order to run containers with the **Docker backend**.

If you use the **API backend**, the Docker daemon can live on a separate machine:
Dockyter just talks HTTP to your API.

---

## Backends: Docker daemon vs HTTP API

Dockyter has two backends:

* **Docker backend** (default)
  Runs containers by calling the local `docker` CLI:

  ```bash
  docker run --rm [ARGS] IMAGE bash -lc "cmd"
  ```

* **API backend**
  Sends commands to an HTTP API that you implement and manage.
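As a rough illustration of the Docker backend's command shape, the sketch below assembles the `docker run --rm [ARGS] IMAGE bash -lc "cmd"` invocation as an argument list. The function names `build_docker_argv` and `run_in_container` are hypothetical, and splitting the arguments with `shlex` is an assumption; Dockyter's own implementation may differ.

```python
import shlex
import subprocess


def build_docker_argv(args: str, cmd: str) -> list[str]:
    """Assemble `docker run --rm [ARGS] IMAGE bash -lc "cmd"` as an argv list.

    `args` holds the Docker arguments followed by the image, exactly as they
    would be typed after `%docker` / `%%docker`.
    """
    # Assumption: arguments are split with shell-like quoting rules.
    return ["docker", "run", "--rm", *shlex.split(args), "bash", "-lc", cmd]


def run_in_container(args: str, cmd: str) -> subprocess.CompletedProcess:
    """Run the command through the local docker CLI and capture its output."""
    return subprocess.run(build_docker_argv(args, cmd), capture_output=True, text=True)
```

Passing the command as a single list element avoids a second round of shell quoting on the host: only the `bash` inside the container interprets `cmd`.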

You can switch backends at runtime:

```python
# Use local Docker daemon (default)
%docker_backend docker

# Use HTTP API backend
%docker_backend api http://127.0.0.1:8000
```

The current backend and status can be inspected with:

```python
%docker_status
```

This prints:

* which backend is active (`Docker` or `API`),
* whether it appears available,
* the current Docker arguments (image, volumes, etc.),
* whether `!` redirection is enabled.

---

## API backend contract

The API backend is intentionally small and simple.
Dockyter only assumes **two endpoints**:

1. **Health check**

   ```http
   GET /health
   ```

   * Must return a **2xx** status code if the backend is available.
   * Response body is ignored by Dockyter.

2. **Command execution**

   ```http
   POST /execute
   Content-Type: application/json
   ```

   Request body:

   ```json
   {
     "cmd": "echo hello",
     "args": "-v /host:/data ubuntu:22.04"
   }
   ```

   Response body (JSON):

   ```json
   {
     "stdout": "hello",
     "stderr": ""
   }
   ```

Dockyter:

* passes the **entire cell** or `!` command as `cmd`,
* passes the raw `%docker` / `%%docker` arguments as `args`,
* prints `stdout` to the notebook,
* prints `stderr` in **red** if not empty.
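The client side of this contract can be sketched with the standard library alone. This is not Dockyter's actual code (the package depends on `requests`); the helper names `build_execute_request`, `render_result`, and `execute` are hypothetical, and using ANSI escape codes for the red output is an assumption.

```python
import json
import urllib.request

RED = "\033[31m"   # ANSI escape for red text (assumed rendering)
RESET = "\033[0m"


def build_execute_request(base_url: str, cmd: str, args: str) -> urllib.request.Request:
    """Build the `POST /execute` request described by the contract."""
    body = json.dumps({"cmd": cmd, "args": args}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def render_result(result: dict) -> str:
    """Return stdout, followed by stderr wrapped in red when it is not empty."""
    text = result.get("stdout", "")
    stderr = result.get("stderr", "")
    if stderr:
        text += f"{RED}{stderr}{RESET}"
    return text


def execute(base_url: str, cmd: str, args: str, timeout: float = 60.0) -> str:
    """Send one command to the API and return the text to display."""
    request = build_execute_request(base_url, cmd, args)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return render_result(json.load(response))
```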

What happens inside the API is entirely up to you. A typical implementation:

* receives `cmd` and `args`,
* constructs a `docker run ...` command on the server,
* captures `stdout` / `stderr`,
* returns them in JSON.

But the API **does not have to use Docker** internally. It could use Kubernetes, a job queue, or anything else: Dockyter only cares about the HTTP contract above.
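To make the contract concrete, here is a minimal server sketch built on Python's standard `http.server` rather than the FastAPI example shipped in the repository. The names `route`, `make_handler`, and `serve` are hypothetical, and `executor` stands for any callable you supply that takes `(cmd, args)` and returns a dict with `stdout` and `stderr`. It has no authentication and is only meant for local experiments.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, body, executor):
    """Map one request to a (status_code, json_payload) pair."""
    if method == "GET" and path == "/health":
        return 200, {"status": "ok"}  # any 2xx means "available"
    if method == "POST" and path == "/execute":
        try:
            request = json.loads(body or b"{}")
        except json.JSONDecodeError:
            return 400, {"error": "invalid JSON"}
        return 200, executor(request.get("cmd", ""), request.get("args", ""))
    return 404, {"error": "not found"}


def make_handler(executor):
    """Create a request handler class bound to the given executor."""

    class Handler(BaseHTTPRequestHandler):
        def _reply(self, status, payload):
            data = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def do_GET(self):
            self._reply(*route("GET", self.path, b"", executor))

        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            self._reply(*route("POST", self.path, self.rfile.read(length), executor))

    return Handler


def serve(executor, host="127.0.0.1", port=8000):
    """Serve the two endpoints until interrupted (local use only)."""
    HTTPServer((host, port), make_handler(executor)).serve_forever()
```

Keeping the routing in a plain function separates the contract from the transport, so the same logic can be exercised without opening a socket. An executor might build a `docker run` command and call `subprocess.run`, or hand the job to some other system.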

### Security responsibility

Dockyter **does not** implement any security for the API backend.

* Authentication, authorisation, rate limiting, logging, etc. are entirely the responsibility of the API owner.
* The example API server in this repository is **not** intended for exposure on the public internet. It is a minimal reference implementation for local / trusted environments.
* A real deployment must:

  * protect the API (auth, HTTPS),
  * control which images and arguments are allowed,
  * run on a hardened host.

Dockyter only provides a convenient **client** for this API from inside notebooks; it is **not** a security boundary.
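One way an API owner might control which images and arguments are allowed is an allowlist check on the `args` string before anything is executed. The sketch below is illustrative only and is not a complete security solution: the image and flag sets are hypothetical, it assumes the image is the last token of `args`, and it assumes every allowed flag takes exactly one value.

```python
import shlex
from typing import Optional

# Hypothetical policy: adapt both sets to your own deployment.
ALLOWED_IMAGES = {"ubuntu:22.04", "myorg/tool:latest"}
ALLOWED_FLAGS = {"-e", "--env", "-w", "--workdir"}  # each takes one value


def check_args(args: str) -> Optional[str]:
    """Return None if `args` passes the policy, else a short rejection reason."""
    try:
        tokens = shlex.split(args)
    except ValueError:
        return "arguments could not be parsed"
    if not tokens:
        return "missing image"
    *options, image = tokens  # assumption: "[ARGS] IMAGE" ordering
    if image not in ALLOWED_IMAGES:
        return f"image not allowed: {image}"
    i = 0
    while i < len(options):
        flag, sep, _value = options[i].partition("=")
        if flag not in ALLOWED_FLAGS:
            return f"flag not allowed: {flag}"
        # "--flag=value" is one token; "--flag value" also consumes the next one.
        i += 1 if sep else 2
    if i != len(options):
        return "flag is missing its value"
    return None
```

An allowlist is used here rather than a denylist because unknown flags are rejected by default, which is the safer failure mode when new Docker options appear.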

---

## Basic usage

### Cell magic: `%%docker` (recommended)

```python
%%docker myorg/tool:latest
echo "Hello from inside the container"
pwd
```

The **entire cell** is sent to `bash -lc` inside a **single container**.
All lines share the same shell state:

* `cd` persists for the rest of the cell,
* environment variables set in one line are visible to the others,
* multi-line scripts, `if`/`for`, heredocs, etc. work as expected.

This is the recommended way to run anything non-trivial in Docker from a notebook.

The same syntax works with both backends:

* Docker backend → runs `docker run ...` locally.
* API backend → sends `cmd` and `args` to `POST /execute`.

---

### Line magic: `%docker` + `!` redirection

```python
%docker -v /host/path:/data myorg/tool:latest
```

Then:

```python
!tool --input /data/file.txt
```

Here `%docker` **configures** Dockyter:

* Docker arguments and image are stored,
* subsequent `!cmd` calls in that notebook are rerouted to the active backend:

  * Docker backend → `docker run --rm [ARGS] IMAGE bash -lc "cmd"`
  * API backend → `POST /execute` with `cmd="cmd"` and `args="[ARGS] IMAGE"`

Important behaviour:

* each `!cmd` runs in a **fresh container** (or fresh backend execution),
* shell state is **not** shared between `!` calls:

  ```python
  %docker myimage:latest
  !cd /data
  !pwd   # runs in a new container, so the working directory is not /data
  ```

For anything that relies on `cd`, multi-line shell logic, or persistent state, prefer `%%docker`.
`%docker` + `!` is best for simple one-shot commands.
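The difference can be reproduced without Docker. The sketch below assumes `bash` is available on the host and uses a hypothetical helper, `run_fresh_shell`, that starts a new shell per call, much as each `!cmd` starts a new container.

```python
import shlex
import subprocess
import tempfile


def run_fresh_shell(cmd: str) -> str:
    """Run `cmd` in a brand-new bash process and return its stdout."""
    proc = subprocess.run(["bash", "-c", cmd], capture_output=True, text=True)
    return proc.stdout.strip()


workdir = tempfile.mkdtemp()
quoted = shlex.quote(workdir)

# Two separate calls: the `cd` is lost when the first shell exits.
run_fresh_shell(f"cd {quoted}")
separate = run_fresh_shell("pwd")

# One call: `cd` and `pwd` share a shell, like the lines of a `%%docker` cell.
combined = run_fresh_shell(f"cd {quoted} && pwd")

print("separate calls:", separate)
print("single call:   ", combined)
```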

---

## Commands

* `%%docker [DOCKER ARGS...] IMAGE[:TAG]`
  Run the cell content in a single Docker container with the given image/arguments,
  using the currently selected backend.

* `%docker [DOCKER ARGS...] IMAGE[:TAG]`
  Configure “Docker mode” for `!` so that each `!cmd` is executed inside a container
  via the currently selected backend. This also enables `!` redirection, as if `%docker_on` had been called.

* `%docker_off`
  Restore the original `!` behaviour (no Docker redirection).

* `%docker_on`
  Activate Docker mode for `!` again, using the last configured image/arguments.

* `%docker_status`
  Show the current backend type, its availability, whether `!` redirection is enabled,
  and which image/arguments are currently configured.

* `%docker_backend docker`
  Use the local Docker daemon backend. `%docker_status` is called automatically afterwards.

* `%docker_backend api <URL>`
  Use an HTTP API backend at the given base URL (for example `http://127.0.0.1:8000`). `%docker_status` is called automatically afterwards.

---

## Binder / JupyterHub integration (high-level)

In BinderHub / JupyterHub, Dockyter can be used in a few ways:

### 1. Extension-only (safest default)

- Install `dockyter` in the image (e.g. via `requirements.txt`).
- Users do `%load_ext dockyter` in notebooks.
- If `docker` is not available in the container, Dockyter just reports it and does not crash.

This is the right choice for **public / untrusted** notebook environments.

### 2. Direct Docker daemon access (trusted only)

You can expose a Docker runtime inside user containers and use the **Docker backend**.

This is **very dangerous** for public notebooks:

- Users can bypass Dockyter and run `!docker ...` directly,
- including flags like `--privileged` or `--network=host`.

Only consider this if users are trusted and the platform is carefully locked down.

### 3. API backend (recommended for untrusted users)

A safer option for public or multi-tenant setups:

- Do **not** expose `docker` in user containers.
- Run a separate, hardened Dockyter-compatible **API backend**.
- In notebooks, use:

  ```python
  %docker_backend api https://your-secure-api.example.com
  ```

The API is then responsible for all security (auth, allowed images/flags, rate limiting, etc.).

Dockyter is **not** a security boundary; it only provides convenience and light guardrails.
Real isolation must come from the surrounding platform or the API implementation.

---

## Examples

This repository includes several example notebooks and an example API server:

* `docs/exemples/01_local_cli.ipynb`
  Run simple commands in a **local Docker image** (`%%docker` basics).

* `docs/exemples/02_ml_tool_in_docker.ipynb`
  Use a real ML framework (e.g. PyTorch) inside Docker, keeping the notebook kernel light.

* `docs/exemples/03_api_backend.ipynb`
  Use Dockyter with the **API backend**, switching with `%docker_backend api` and running
  commands via the HTTP API instead of the local Docker daemon.

* `docs/api_example/server.py`
  Minimal example of a Dockyter-compatible API implemented with FastAPI + Uvicorn.
  This is a reference implementation for local / trusted environments only.

---

## Tests

Dockyter has:

* **Unit tests** for:

  * `DockerBackend` and `APIBackend` (using monkeypatch for `subprocess` / `requests`),
  * the IPython magics layer.

* **Integration tests** that:

  * execute the example notebooks via `nbconvert`,
  * start the example API server for API backend tests,
  * fail if Dockyter prints error messages in red.

You can run them locally with:

```bash
# Unit tests only
uv run pytest -m "not integration"

# Full test suite (including notebook + API integration)
uv run pytest
```
