Metadata-Version: 2.3
Name: maur
Version: 0.1.4
Summary: Add your description here
Author: Mats E. Mollestad
Author-email: Mats E. Mollestad <mats@mollestad.no>
Requires-Dist: fastapi
Requires-Dist: sqlalchemy[asyncio]
Requires-Dist: sqlmodel
Requires-Dist: takk>=0.1.25
Requires-Dist: asyncpg
Requires-Python: >=3.10
Description-Content-Type: text/markdown

# Maur 🐜

Inspired by [Stripe's Minions](https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-coding-agents). Maur (Norwegian for "ants") is an autonomous coding agent that integrates into your Python projects. It receives tasks from various sources (production error alerts, Slack, Linear issues, or direct API calls), clones your repo, runs an AI coding agent ([OpenCode](https://opencode.ai)), and opens a pull/merge request with the fix.

## How it works

1. A task arrives via webhook or direct API call
2. The API stores the task and publishes it to a message queue
3. A worker picks up the task, clones your repo into a temporary workspace
4. OpenCode runs against the cloned repo using your configured LLM
5. If changes are made, they are committed and pushed to a new branch (`maur/<task-id>`)
6. A pull request (GitHub) or merge request (GitLab) is opened automatically
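
The steps above can be sketched as a small in-memory model. This is not Maur's implementation: `Task`, `submit`, `work_once`, and the status strings are hypothetical names chosen for illustration, an `asyncio.Queue` stands in for NATS, a dict stands in for the database, and `run_agent` stands in for the clone-and-OpenCode step. Only the `maur/<task-id>` branch naming comes from the description above.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    # Hypothetical task record; the real schema is not shown in this README.
    prompt: str
    task_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    status: str = "pending"
    branch: Optional[str] = None


async def submit(queue, store, prompt):
    """API side (steps 1-2): persist the task, then publish its id."""
    task = Task(prompt=prompt)
    store[task.task_id] = task
    await queue.put(task.task_id)
    return task


async def work_once(queue, store, run_agent):
    """Worker side (steps 3-6): take one task and run the agent on it."""
    task = store[await queue.get()]
    task.status = "in_progress"
    # run_agent is a stand-in that reports whether the agent changed any files.
    changed = await run_agent(task.prompt)
    if changed:
        # Changes go to a branch named after the task; the PR/MR is opened from it.
        task.branch = f"maur/{task.task_id}"
    task.status = "done"
    return task
```

The API side only stores and enqueues, so it stays fast, while the slow agent run happens in the worker.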

## Architecture

```
[Trigger source]           [maur_api]          [maur_code_subscriber]
  Linear webhook    --->   FastAPI app   --->   Worker (OpenCode)
  Exception alert          stores task          clones repo
  Manual POST /tasks       publishes msg        runs agent
                                                opens PR/MR
```

The two components are deployed separately via `takk`:
- **`maur_api`** — lightweight FastAPI service that authenticates requests, persists tasks, and enqueues work
- **`maur_code_subscriber`** — NATS subscriber that processes tasks one at a time using OpenCode

## Prerequisites

- Python ≥ 3.10
- [`takk`](https://pypi.org/project/takk/) for infrastructure management
- A NATS server (provisioned by `takk`)
- A PostgreSQL or MySQL database (provisioned by `takk`)
- An OpenAI-compatible LLM API, e.g. [OpenRouter](https://openrouter.ai), a local Ollama instance, or any other provider with an OpenAI-compatible endpoint. `takk` defaults to Ollama unless you override the environment variables.
- A GitHub or GitLab repository with a token that has push and PR/MR creation permissions

## Installation

```bash
uv add maur
```

## Basic usage

### 1. Add the infrastructure

Add both components to your `project.py` file:

```python
from takk import Project
from maur.components import maur_api, maur_code_subscriber

project = Project(
    name="your-project",

    # The API that authenticates and enqueues tasks
    maur_api=maur_api(),

    # The worker that clones the repo, runs OpenCode, and opens a PR/MR
    maur_coder=maur_code_subscriber(git_provider="github"),
)
```

Both functions accept a `database` argument (`"psql"` or `"mysql"`, default: `"psql"`).

### 2. Configure secrets

Run `takk dotenv` to regenerate your `.env` file, then fill in the required values.

`takk` automatically provisions the database and NATS instance, so `DB_URI` and `NATS_URI` are not required when running through `takk`. For the LLM, `takk` defaults to a local Ollama instance when running locally and Scaleway when deployed — but you can override this with any OpenAI-compatible API.

**Required:**

| Variable | Description |
|---|---|
| `MAUR_BEARER_TOKEN` | Secret token used to authenticate API requests |

**If `git_provider="github"` (default):**

| Variable | Required | Description |
|---|---|---|
| `GITHUB_REPO_URL` | Yes | HTTPS URL of the GitHub repo to clone and open PRs on |
| `GITHUB_TOKEN` | Yes | GitHub personal access token with `repo` scope |
| `GITHUB_API_URL` | No | GitHub API base URL (default: `https://api.github.com`) |

**If `git_provider="gitlab"`:**

| Variable | Required | Description |
|---|---|---|
| `GITLAB_REPO_URL` | Yes | HTTPS URL of the GitLab repo |
| `GITLAB_TOKEN` | Yes | GitLab personal access token with `api` scope |
| `GITLAB_URL` | No | GitLab instance URL (default: `https://gitlab.com`) |


### 3. Start the system

```bash
takk up
```

Both the API and worker containers will be built and started.

## API reference

All endpoints (except `/health`) require a `Bearer` token in the `Authorization` header matching `MAUR_BEARER_TOKEN`.
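
As an illustration of what a bearer-token check involves, here is a minimal sketch. It is not taken from Maur's source; `is_authorized` is a hypothetical helper, and the constant-time comparison is a general recommendation for comparing secrets rather than a documented Maur behaviour.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True when the header is 'Bearer <token>' and the token matches."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```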

### POST `/tasks` — Manual task

Send any arbitrary prompt to the agent.

```bash
curl -X POST http://localhost:8000/tasks \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Refactor the payment module to use the new Stripe SDK",
    "source_id": "unique-identifier-for-dedup",
    "repo_branch": "main"
  }'
```
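
The same request can be built from Python using only the standard library. This is a sketch: `build_task_request` is a hypothetical helper, and the field names are copied from the curl example above.

```python
import json
import urllib.request


def build_task_request(base_url, token, prompt, source_id, repo_branch="main"):
    """Build (but do not send) the POST /tasks request."""
    body = json.dumps(
        {"prompt": prompt, "source_id": source_id, "repo_branch": repo_branch}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tasks",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Pass the result to `urllib.request.urlopen` to send it. Building the request separately from sending it keeps the payload easy to inspect and test.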

### POST `/webhooks/exception` — Exception alert

Send a production error for the agent to fix. `fingerprint` is used for deduplication — tasks with the same fingerprint that are already pending or in progress are rejected.

```bash
curl -X POST http://localhost:8000/webhooks/exception \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "fingerprint": "KeyError-user-profile-views-42",
    "title": "KeyError: '\''email'\'' in user_profile view",
    "description": "Traceback (most recent call last):\n  ...",
    "repo_branch": "main",
    "extra": {"environment": "production", "user_id": 123}
  }'
```
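
Because deduplication depends on the fingerprint, the caller needs a value that stays stable across repeated occurrences of the same error. One possible way to build the payload from a caught exception is sketched below. The fingerprint scheme (exception type plus innermost file and function, hashed) is an assumption made for this example, not something Maur prescribes, and `exception_payload` is a hypothetical helper.

```python
import hashlib
import traceback


def exception_payload(exc, repo_branch="main", extra=None):
    """Build a /webhooks/exception body from a caught exception."""
    name = type(exc).__name__
    frames = traceback.extract_tb(exc.__traceback__)
    # Use the innermost frame's file and function, not the line number,
    # so small edits to the file do not change the fingerprint.
    location = f"{frames[-1].filename}:{frames[-1].name}" if frames else "unknown"
    digest = hashlib.sha256(f"{name}|{location}".encode()).hexdigest()[:16]
    return {
        "fingerprint": f"{name}-{digest}",
        "title": f"{name}: {exc}",
        "description": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "repo_branch": repo_branch,
        "extra": extra or {},
    }
```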

### POST `/webhooks/linear` — Linear webhook

Configure a [Linear webhook](https://linear.app/docs/webhooks) to send issue events here. Maur extracts the issue title, description, and repository URL (must be attached to the issue) and opens a PR with the fix.

Set the webhook URL to: `https://<your-api-host>/webhooks/linear`

Optionally set `LINEAR_WEBHOOK_SECRET` in your environment to verify webhook signatures.
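
For reference, webhook signature verification of this kind is usually an HMAC over the raw request body. The sketch below assumes a hex-encoded HMAC-SHA256 digest delivered in a request header, which matches how Linear describes its webhook signatures at the time of writing; confirm the details against Linear's documentation. How Maur performs the check internally is not shown here, and `verify_signature` is a hypothetical helper.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Check a hex HMAC-SHA256 signature against the raw request body."""
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; a missing header never matches.
    return hmac.compare_digest(expected.encode(), (signature_header or "").encode())
```

The signature must be computed over the bytes exactly as received, before any JSON parsing, because re-serialising the body can change whitespace or key order.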

### GET `/tasks` — List tasks

Returns the 50 most recent tasks.

### GET `/tasks/{task_id}` — Get task

Returns the status and result of a specific task.

### GET `/health`

Returns `"ok"`. Used for health checks.

## Customisation

### Override the infrastructure

| Variable | Default | Description |
|---|---|---|
| `DB_URI` | Provisioned by `takk` | PostgreSQL (`postgresql://...`) or MySQL (`mysql://...`) connection URI |
| `NATS_URI` | Provisioned by `takk` | NATS connection URI (`nats://...`) |

### Changing the LLM model

Set `MAUR_LLM_MODEL` to any model available through your `MAUR_LLM_API` provider. The worker uses OpenCode with an OpenAI-compatible provider, so any model exposed via that protocol works.

```
MAUR_LLM_MODEL=devstral-2-123b-instruct-2512
```

### Adjusting worker compute resources

The default worker is allocated 3 GB of memory. Override this via the `compute` argument:

```python
from takk.models import Compute
from maur.components import maur_code_subscriber

maur_coder=maur_code_subscriber(
    compute=Compute(mb_memory_limit=1024 * 8)  # 8 GB
)
```

### Passing additional secrets to the worker

If your target repository requires environment variables at build or runtime (e.g. private package indexes), use `maur_code_subscriber_with_secrets` and pass the full list of secrets explicitly:

```python
from maur.components import maur_code_subscriber_with_secrets
from maur.settings import GithubSettings, MaurSettings, MaurLLMSettings, PostgresSettings
from takk.secrets import NatsConfig
from my_project.settings import MyPrivateRegistrySettings

maur_coder=maur_code_subscriber_with_secrets(
    secrets=[PostgresSettings, MaurSettings, MaurLLMSettings, GithubSettings, NatsConfig, MyPrivateRegistrySettings]
)
```

## Running without takk

`takk` is the easiest way to run and deploy Maur, but you can run both components directly if you prefer to manage infrastructure yourself.

### Docker Compose

The quickest way to run without `takk` is with Docker Compose. Create a `docker-compose.yml`:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: maur
      POSTGRES_PASSWORD: maur
      POSTGRES_DB: maur
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U maur"]
      interval: 5s
      retries: 5

  nats:
    image: nats:latest
    command: ["-js"]
    ports:
      - "4222:4222"

  api:
    build:
      context: .
      dockerfile: Dockerfile.api
    ports:
      - "8000:8000"
    environment:
      DB_URI: postgresql+asyncpg://maur:maur@db/maur
      NATS_URI: nats://nats:4222
      MAUR_BEARER_TOKEN: your-secret-token
    depends_on:
      db:
        condition: service_healthy
      nats:
        condition: service_started

  worker:
    build:
      context: .
      dockerfile: Dockerfile.worker
    environment:
      DB_URI: postgresql+asyncpg://maur:maur@db/maur
      NATS_URI: nats://nats:4222
      MAUR_BEARER_TOKEN: your-secret-token
      MAUR_LLM_API: https://your-llm-provider/v1
      MAUR_LLM_TOKEN: your-llm-token
      GITHUB_REPO_URL: https://github.com/your-org/your-repo
      GITHUB_TOKEN: ghp_...
    depends_on:
      db:
        condition: service_healthy
      nats:
        condition: service_started

volumes:
  postgres_data:
```

Then run:

```bash
docker compose up
```

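The compose file expects `Dockerfile.api` and `Dockerfile.worker` in the build context. The [API](#api) and [Worker](#worker) sections below describe what each image needs to install and run; the Worker section includes a minimal Dockerfile.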

### API

Install the package and start the FastAPI app with uvicorn:

```bash
pip install maur
uvicorn maur.app:app --host 0.0.0.0 --port 8000
```

You must supply all required environment variables manually — `takk` won't provision anything:

```
DB_URI=postgresql+asyncpg://user:pass@localhost/maur
NATS_URI=nats://localhost:4222
MAUR_BEARER_TOKEN=your-secret-token
```

### Worker

The worker calls `process_coding_task` for each NATS message. It requires a number of system-level dependencies (git, grep, curl, unzip, jq, bash) and [OpenCode](https://opencode.ai) to be installed in the environment.

A minimal Dockerfile for the worker:

```dockerfile
FROM python:3.12-slim

RUN apt-get update && apt-get install -y \
    curl ca-certificates bash git libstdc++6 libgcc-s1 unzip jq grep \
    && rm -rf /var/lib/apt/lists/*

# Install OpenCode
RUN curl -fsSL https://opencode.ai/install | bash

RUN pip install maur
```

You then need to wire up a NATS consumer that deserialises the message as `CodingTaskMessage` and calls `process_coding_task`:

```python
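import asyncio
import json

import nats

from maur.worker import CodingTaskMessage, process_coding_task


async def main():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()

    async def handler(msg):
        # Ack before processing so a long-running task is not redelivered.
        # The trade-off: a task that fails here is not retried by NATS.
        await msg.ack()
        task_msg = CodingTaskMessage.model_validate(json.loads(msg.data))
        await process_coding_task(task_msg)

    # A durable consumer keeps its position across worker restarts.
    await js.subscribe("maur.tasks", cb=handler, durable="maur-worker")
    await asyncio.Event().wait()  # keep the process alive


if __name__ == "__main__":
    asyncio.run(main())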
```

The worker also requires the database, NATS, LLM, and Git provider environment variables to be set:

```
DB_URI=postgresql+asyncpg://user:pass@localhost/maur
NATS_URI=nats://localhost:4222
MAUR_BEARER_TOKEN=your-secret-token
MAUR_LLM_API=https://...
MAUR_LLM_TOKEN=your-llm-token
GITHUB_REPO_URL=https://github.com/your-org/your-repo
GITHUB_TOKEN=ghp_...
```
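
Since nothing is provisioned for you in this mode, a small preflight check can catch missing variables before the services start. This is a sketch: `missing_vars` is a hypothetical helper, and the variable lists are copied from the API and worker examples above for the GitHub provider.

```python
import os

# Variables listed for the API and the worker in this section.
API_VARS = ("DB_URI", "NATS_URI", "MAUR_BEARER_TOKEN")
WORKER_VARS = API_VARS + (
    "MAUR_LLM_API",
    "MAUR_LLM_TOKEN",
    "GITHUB_REPO_URL",
    "GITHUB_TOKEN",
)


def missing_vars(required, environ=None):
    """Return the names in `required` that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_vars(WORKER_VARS)` at the top of the worker entry point and exiting when the list is non-empty gives a clearer failure than an error deep inside a task run.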

## Development

```bash
# Install dependencies
uv sync --all-groups

# Lint
uv run ruff check .

# Type check
uv run ty check
```
