Metadata-Version: 2.4
Name: hermes-memory
Version: 0.2.0
Summary: Persistent, structured memory layer for LLM agents — MCP-native, zero infra, SQLite-backed.
License: MIT
Keywords: agent,context-compression,llm,mcp,memory
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: mcp>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

# hermes-memory

Persistent, structured memory for LLM agents.
MCP-native. Zero infrastructure. SQLite-backed. Language-agnostic.

---

## The problem

During long sessions, context compression removes older messages.
Once compressed, those messages are no longer in active context, are not written to memory files,
and cannot be searched, even mid-session.

The agent forgets what was decided three hours ago.

## The approach

hermes-memory does not fight compression. It works before it.

Every constraint, decision, and value is extracted into a structured fact
using a terse notation (MEMORY_SPEC). Facts are stored in SQLite with
full-text search, a two-tier hot/cold architecture, and automatic
scope lifecycle management.

The context injection stays under 180 tokens regardless of session length.
Cold facts are retrieved on demand and cost no context tokens until a search returns them.

## Notation (MEMORY_SPEC v1.0)

```
C[target]: constraint   — immutable project law
D[target]: decision     — technical choice, validated
V[target]: value        — IP, port, key, URL, variable
?[target]: unknown      — open question
✓[target]: done         — scope archivable
~[target]: obsolete     — replaces a previous fact
```

Examples:

```
C[db.id]: UUID mndtry, nvr autoincrement        (-65% vs raw message)
D[auth]: JWT 7d refresh 6d                      (-70%)
V[srv.prod]: api.example.com:3005               (-74%)
✓[auth]: deployed prod                          (-78%)
```
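
To make the notation concrete, here is a minimal parsing sketch. It assumes one fact per line in the form `marker[target]: body`; the names `Fact`, `KINDS`, and `parse_fact` are hypothetical and are not taken from the hermes-memory source.

```python
import re
from dataclasses import dataclass

# Marker characters copied from the notation table above.
KINDS = {
    "C": "constraint",
    "D": "decision",
    "V": "value",
    "?": "unknown",
    "✓": "done",
    "~": "obsolete",
}

# marker, [target], colon, then a free-text body (assumed grammar)
FACT_RE = re.compile(r"^(?P<marker>[CDV?✓~])\[(?P<target>[^\]]+)\]:\s*(?P<body>.+)$")


@dataclass(frozen=True)
class Fact:
    kind: str
    target: str
    body: str


def parse_fact(line: str) -> Fact:
    """Parse one notation line; raise ValueError if it does not match."""
    match = FACT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a MEMORY_SPEC fact: {line!r}")
    return Fact(KINDS[match["marker"]], match["target"], match["body"].strip())


fact = parse_fact("C[db.id]: UUID mndtry, nvr autoincrement")
print(fact.kind, fact.target, fact.body)
```

The real server may accept a looser or stricter grammar; `spec/MEMORY_SPEC.md` is the reference.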

## How memory stays bounded

Facts live in one of two tiers:

```
HOT    active facts, injected at session start, ~150 tokens
COLD   SQLite, unlimited, retrieved on demand by FTS5 search
```

When the hot tier fills, pressure levels act automatically:

```
70%    merge duplicate facts sharing same target+scope
85%    push closed-scope facts from cold to archived
95%    consolidate via LLM call (last resort) or push oldest to cold
```
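
The thresholds above can be read as a simple lookup from fill ratio to action. The sketch below assumes inclusive thresholds and that only the strongest triggered action is reported; `pressure_action` and the action names are hypothetical, and the real logic in `gauge.py` may cascade through the levels instead.

```python
def pressure_action(used_tokens: int, budget_tokens: int) -> str:
    """Return the strongest action triggered by the hot-tier fill ratio."""
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    ratio = used_tokens / budget_tokens
    if ratio >= 0.95:
        return "consolidate"        # LLM call, or push oldest facts to cold
    if ratio >= 0.85:
        return "archive_closed"     # archive facts from closed scopes
    if ratio >= 0.70:
        return "merge_duplicates"   # merge facts sharing target+scope
    return "none"


# Assumed budget of 150 tokens, matching the approximate hot-tier size above.
for used in (90, 110, 130, 145):
    print(used, pressure_action(used, 150))
```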

The cold tier has no size limit. A search on the cold tier returns at most 20 facts.
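
As an illustration of a capped FTS5 lookup, the sketch below uses Python's built-in `sqlite3` module. The table name `facts`, its columns, and the function `search_cold` are assumptions made for this example and do not reflect the actual schema in `db.py`. It requires an SQLite build with FTS5 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: target and body are indexed, scope is stored only.
conn.execute("CREATE VIRTUAL TABLE facts USING fts5(target, body, scope UNINDEXED)")
conn.executemany(
    "INSERT INTO facts (target, body, scope) VALUES (?, ?, ?)",
    [
        ("db.id", "UUID mndtry, nvr autoincrement", "schema"),
        ("auth", "deployed prod", "auth"),
        ("srv.prod", "api.example.com:3005", "deploy"),
    ],
)

MAX_RESULTS = 20  # cap stated above


def search_cold(conn: sqlite3.Connection, query: str, limit: int = MAX_RESULTS):
    """Full-text search, best matches first, never more than MAX_RESULTS rows."""
    limit = max(1, min(limit, MAX_RESULTS))
    rows = conn.execute(
        "SELECT target, body FROM facts WHERE facts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


print(search_cold(conn, "UUID"))
```

FTS5 has its own query syntax, so user input containing punctuation generally needs quoting or sanitizing before it is passed to `MATCH`.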

## Scope lifecycle

A scope is a unit of work (feature, phase, bug fix).
It opens implicitly on first fact write and closes when:

1. A closing signal appears in the message ("merged", "deployed", "it works")
2. Six turns pass with no reference to the scope
3. Three consecutive turns write facts for a different scope

Closed scopes move to cold automatically. Their facts never pollute future sessions.
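
The three closing rules can be sketched as a single predicate. This is an illustration under stated assumptions: signals are matched as case-insensitive substrings, the message is assumed to refer to the scope being checked, and `ScopeState` and `should_close` are hypothetical names rather than the API of `scopes.py`.

```python
from dataclasses import dataclass

CLOSING_SIGNALS = ("merged", "deployed", "it works")  # examples from the list above
IDLE_TURNS = 6    # rule 2: turns without a reference to the scope
SHIFT_TURNS = 3   # rule 3: consecutive turns writing to another scope


@dataclass
class ScopeState:
    name: str
    last_referenced_turn: int
    foreign_write_streak: int = 0  # consecutive turns that wrote to other scopes


def should_close(scope: ScopeState, turn: int, message: str) -> bool:
    """Return True if any of the three closing rules applies."""
    text = message.lower()
    if any(signal in text for signal in CLOSING_SIGNALS):
        return True
    if turn - scope.last_referenced_turn >= IDLE_TURNS:
        return True
    return scope.foreign_write_streak >= SHIFT_TURNS


auth = ScopeState("auth", last_referenced_turn=4)
print(should_close(auth, 5, "Auth is deployed to prod"))
```

Substring matching is deliberately naive here; a production implementation would likely need to avoid false positives such as "not deployed yet".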

## Installation

```bash
pip install hermes-memory
```

## MCP server

```bash
hermes-memory
```

Or in your MCP config:

```json
{
  "mcpServers": {
    "hermes-memory": {
      "command": "hermes-memory"
    }
  }
}
```

Set `HERMES_MEMORY_DB` to override the default storage path (`~/.hermes/memory.db`).
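
A minimal sketch of how such an override is typically resolved. The helper `resolve_db_path` is hypothetical, and treating an empty variable as unset is an assumption; the server's own lookup may differ.

```python
import os
from pathlib import Path

DEFAULT_DB = "~/.hermes/memory.db"  # default path stated above


def resolve_db_path(environ=None) -> Path:
    """Return the database path, preferring HERMES_MEMORY_DB when it is set."""
    env = os.environ if environ is None else environ
    raw = env.get("HERMES_MEMORY_DB") or DEFAULT_DB  # empty string falls back
    return Path(raw).expanduser()


print(resolve_db_path())
```

Pointing `HERMES_MEMORY_DB` at a per-project file is one way to keep facts from different projects separate.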

## Tools

| Tool | When to call |
|---|---|
| `memory_write(content, scope?)` | Any constraint, decision, or value established |
| `memory_search(query, scope?, limit?)` | Before answering on a topic with history |
| `memory_tick(turn, message?)` | Every user message |
| `memory_status()` | Session start |
| `memory_reflect(topic, limit?)` | User asks about history on a topic |
| `memory_export(scope?, status?)` | Snapshot all facts as plain notation |
| `memory_purge(scope?, older_than_days?)` | After closing a scope or periodic GC |
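
On the wire, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for one user turn. The parameter names come from the table above, while the argument values, their types, and the helper `tool_call` are assumptions for illustration. In practice an MCP client library handles this framing.

```python
import json
from itertools import count

_ids = count(1)  # simple request id generator


def tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# One possible sequence for a single user message.
turn_requests = [
    tool_call("memory_tick", turn=12, message="auth is deployed"),
    tool_call("memory_search", query="auth", limit=5),
    tool_call("memory_write", content="✓[auth]: deployed prod", scope="auth"),
]
print(json.dumps(turn_requests[2], ensure_ascii=False, indent=2))
```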

## System prompt block

Add to your agent's system prompt (output of `memory_status`):

```
[MEMORY_SPEC v1.0]

NOTATION
C[t]: constraint  D[t]: decision  V[t]: value
?[t]: unknown     ✓[t]: done      ~[t]: obsolete
-> flows  ! critical  group by key

ABBREVS
cfg impl msg req usr resp prod feat dev deps auth err db btn
env doc perf init mgmt refct mvmt notif perms val async sync
mndtry nvr alw tmp idx tbl svc pkg repo api clt srv

RULES
- call memory_write() for any C/D/V/? detected
- call memory_search() before answering on known topics
- call memory_tick(turn, message) on every user message
```

## Compatibility

Works with any MCP-compatible agent: Hermes, Claude Desktop, Cursor, Continue, and others.
No cloud. No API key. No embedding model required.

## Architecture

```
hermes_memory/
    core/
        db.py        SQLite connection, schema, constants
        facts.py     CRUD, contradiction detection, FTS5 search
        scopes.py    scope lifecycle, auto-cooling, topic shift
        gauge.py     pressure levels, merge, archive, synthesis
    mcp/
        server.py    MCP server, 7 tools
    spec/
        MEMORY_SPEC.md   notation reference
```

## License

MIT
