Metadata-Version: 2.4
Name: tdos-memory
Version: 1.0.0
Summary: Production-grade typed memory system for user-facing AI agents with confidence scoring and anti-hallucination
Author-email: TownesDev <donovan@townes.dev>
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/townesdev/tdos-memory
Project-URL: Documentation, https://github.com/townesdev/tdos-memory/blob/main/docs/README.md
Project-URL: Repository, https://github.com/townesdev/tdos-memory
Project-URL: Issues, https://github.com/townesdev/tdos-memory/issues
Project-URL: Changelog, https://github.com/townesdev/tdos-memory/blob/main/docs/CHANGELOG.md
Keywords: memory,llm,ai,agents,confidence,decay,extraction,facts,observable,typed,discord,chatbot
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: mongodb
Requires-Dist: pymongo>=3.12; extra == "mongodb"
Provides-Extra: postgres
Requires-Dist: psycopg2-binary>=2.9; extra == "postgres"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: flake8>=6.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=5.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0; extra == "docs"
Dynamic: license-file

# TDOS Memory Envelope v1.0

**Typed, Decaying, Observable Storage Memory System**

A production-grade memory system for user-facing AI agents.
Emphasizes safety, auditability, and anti-hallucination.

> **Status**: Production Ready ✅ | **License**: Apache 2.0 | **Author**: TownesDev

---

## Why TDOS Memory Exists

Most AI memory systems blur the line between facts, inferences, and warmth.
This creates three problems:

1. **Hallucination**: LLMs invent memories that were never mentioned
2. **Creepiness**: Invisible inference creates uncanny behavior
3. **Auditability**: You can't see why the AI made a decision

TDOS solves this by being **explicit about three things**:

- **What you know** (facts)
- **How you know it** (origin tracking)
- **Why you trust it** (confidence scoring)

---

## Core Concepts

### The Memory Envelope

All user memory is packaged in a single, cached object:

```python
envelope = {
    "identity": {
        "name": "Alice",
        "user_id": "123"
    },
    "relational": {
        "domains": ["music_production"],
        "preferences": {"communication_style": "casual"},
        "memorable_facts": [
            {
                "text": "loves fettuccini",
                "type": "USER_FACT",
                "confidence": 0.85,
                "origin": "explicit",
                "confirmation_state": "confirmed"
            }
        ]
    },
    "recent_context": {
        "last_session_summary": "..."
    }
}
```

**Why envelope?** Because memory isn't a database query—it's a small, carefully curated snapshot.
Fast, cacheable, auditable.

### Three Memory Classes

| Type                 | Purpose           | Min. confidence | Decay period | Used For                            |
| -------------------- | ----------------- | --------------- | ------------ | ----------------------------------- |
| **USER_FACT**        | Durable truths    | ≥ 0.80          | 30 days      | Personalization, consistency        |
| **USER_PATTERN**     | Behavioral traits | ≥ 0.75          | 14 days      | Style adaptation, learning          |
| **SHARED_NARRATIVE** | Warm memories     | ≥ 0.60          | 7 days       | Relationship flavor (not inference) |

**Key rule**: `SHARED_NARRATIVE` memories are NEVER used for inference or decision-making.
They're purely relational—"we have inside jokes together."
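
As an illustration of the table and the key rule, here is a minimal sketch of type-based gating. It assumes the confidence column is a minimum for admitting a memory of that type; `MIN_CONFIDENCE`, `passes_gate`, and `usable_for_inference` are hypothetical names for this example, not part of the `tdos_memory` API.

```python
# Minimum confidence per memory type, copied from the table above.
MIN_CONFIDENCE = {
    "USER_FACT": 0.80,
    "USER_PATTERN": 0.75,
    "SHARED_NARRATIVE": 0.60,
}


def passes_gate(memory: dict) -> bool:
    """Return True if the memory meets the minimum confidence for its type."""
    threshold = MIN_CONFIDENCE.get(memory["type"])
    if threshold is None:
        return False  # unknown types are rejected in this sketch
    return memory["confidence"] >= threshold


def usable_for_inference(memory: dict) -> bool:
    """SHARED_NARRATIVE never feeds inference, whatever its confidence."""
    return memory["type"] != "SHARED_NARRATIVE" and passes_gate(memory)


memories = [
    {"text": "loves fettuccini", "type": "USER_FACT", "confidence": 0.85},
    {"text": "prefers short replies", "type": "USER_PATTERN", "confidence": 0.70},
    {"text": "Abby calls Alice 'Ace the Mixer'", "type": "SHARED_NARRATIVE", "confidence": 0.90},
]
inference_pool = [m["text"] for m in memories if usable_for_inference(m)]
print(inference_pool)
```

Only the first memory reaches the inference pool: the pattern falls below its threshold, and the narrative is excluded by type even though its confidence is high.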

### Confidence Scoring

Every fact carries a trust level (0-1):

- **0.90-1.00**: Explicitly stated with specifics
- **0.70-0.89**: Clearly stated with detail
- **0.50-0.69**: Implied or mentioned briefly
- **< 0.50**: Rejected outright

Confidence is visible to the LLM:

```
Known facts:
  • loves fettuccini (confidence: 85%)
  • works on FL Studio (confidence: 90%)
```
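
A rough sketch of how such lines could be produced from fact dicts follows. `format_fact_line` is a hypothetical helper written for this example; the library's own formatter is `format_envelope_for_llm()`, whose internals are not shown here.

```python
def format_fact_line(fact: dict) -> str:
    """Render one fact as a bullet with its confidence as a whole percent."""
    percent = round(fact["confidence"] * 100)
    return f"  • {fact['text']} (confidence: {percent}%)"


facts = [
    {"text": "loves fettuccini", "confidence": 0.85},
    {"text": "works on FL Studio", "confidence": 0.90},
]
block = "Known facts:\n" + "\n".join(format_fact_line(f) for f in facts)
print(block)
```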

### Origin Tracking

Track how each fact was obtained:

```python
fact = {"text": "loves fettuccini"}

# Allowed values for the "origin" field:
fact["origin"] = "explicit"    # User stated directly
fact["origin"] = "inferred"    # Reasonably inferred
fact["origin"] = "reinforced"  # User mentioned again
```

### Confirmation States

Track whether facts have been validated:

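```python
fact = {"text": "loves fettuccini"}

# Allowed values for the "confirmation_state" field:
fact["confirmation_state"] = "unconfirmed"   # New fact
fact["confirmation_state"] = "confirmed"     # User reaffirmed
fact["confirmation_state"] = "contradicted"  # User corrected it
```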
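
If you want these fields checked at construction time in your own code, one option is to model them as enums. This is a sketch only: `Origin`, `ConfirmationState`, and `Fact` are hypothetical types for illustration, and the library itself is documented as working with plain dicts.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(str, Enum):
    EXPLICIT = "explicit"
    INFERRED = "inferred"
    REINFORCED = "reinforced"


class ConfirmationState(str, Enum):
    UNCONFIRMED = "unconfirmed"
    CONFIRMED = "confirmed"
    CONTRADICTED = "contradicted"


@dataclass
class Fact:
    text: str
    confidence: float
    origin: Origin = Origin.EXPLICIT
    confirmation_state: ConfirmationState = ConfirmationState.UNCONFIRMED

    def __post_init__(self):
        # Coerce plain strings to enum members; unknown values raise ValueError.
        self.origin = Origin(self.origin)
        self.confirmation_state = ConfirmationState(self.confirmation_state)
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


fact = Fact("loves fettuccini", 0.85, origin="explicit", confirmation_state="confirmed")
print(fact.origin.value, fact.confirmation_state.value)
```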

### Time-Based Decay

Facts get stale. Confidence decays if not reconfirmed:

- **USER_FACT** loses 10% confidence every 30 days
- **USER_PATTERN** loses 10% every 14 days
- **SHARED_NARRATIVE** loses 10% every 7 days

Below 0.3 confidence → fact is pruned.
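
The arithmetic can be sketched as follows. This assumes "loses 10%" means a multiplicative 10% reduction per completed decay period, counted from the last confirmation; the library's `apply_decay()` may compute it differently. `decayed_confidence` and `should_prune` are hypothetical names.

```python
from datetime import datetime, timedelta

DECAY_PERIOD_DAYS = {"USER_FACT": 30, "USER_PATTERN": 14, "SHARED_NARRATIVE": 7}
DECAY_RATE = 0.10       # assumed: fraction of confidence lost per period
PRUNE_THRESHOLD = 0.3   # facts below this are pruned


def decayed_confidence(confidence, fact_type, last_confirmed, now):
    """Reduce confidence by DECAY_RATE for each full period since last_confirmed."""
    elapsed_days = max(0, (now - last_confirmed).days)
    periods = elapsed_days // DECAY_PERIOD_DAYS[fact_type]
    return confidence * (1 - DECAY_RATE) ** periods


def should_prune(confidence):
    return confidence < PRUNE_THRESHOLD


confirmed_at = datetime(2026, 1, 1)
now = confirmed_at + timedelta(days=65)  # two full 30-day periods
current = decayed_confidence(0.85, "USER_FACT", confirmed_at, now)
print(round(current, 4), should_prune(current))
```

Under these assumptions, a `USER_FACT` at 0.85 drops to roughly 0.69 after 65 days and would need about ten periods without reconfirmation before it is pruned.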

---

## Installation

### From Source

```bash
cd tdos_memory
pip install -e .
```

### With MongoDB Support

```bash
pip install -e ".[mongodb]"
```

### Development

```bash
pip install -e ".[dev]"
pytest
```

---

## Quick Start

### Get Memory Envelope

```python
from tdos_memory import get_memory_envelope, format_envelope_for_llm

# Retrieve cached memory (15-minute TTL)
envelope = get_memory_envelope(user_id="alice", guild_id="server_1")

# Format for LLM context
context = format_envelope_for_llm(envelope)
print(context)
# Output:
# User: Alice
# Domains: music_production, FL_Studio
# Known facts:
#   • loves fettuccini (confidence: 85%)
#   • works on FL Studio (confidence: 90%)
```

### Extract Facts from Summary

```python
from tdos_memory import extract_facts_from_summary

summary = """
Alice mentioned she uses FL Studio for beat making.
She loves fettuccini pasta and wants to improve her mixing skills.
"""

facts = extract_facts_from_summary(summary, user_id="alice")
# [
#   {"text": "uses FL Studio", "confidence": 0.92, "origin": "explicit", ...},
#   {"text": "loves fettuccini", "confidence": 0.85, "origin": "explicit", ...}
# ]
```

### Add Memorable Fact

```python
from tdos_memory import add_memorable_fact

success = add_memorable_fact(
    user_id="alice",
    guild_id="server_1",
    fact="loves fettuccini",
    confidence=0.85,
    origin="explicit"
)

# Cache is automatically invalidated
```

### Apply Confidence Decay

```python
from tdos_memory import apply_decay

updated_facts = apply_decay(envelope["relational"]["memorable_facts"])
# Facts older than their decay period have reduced confidence
# Facts below 0.3 are removed
```

### Reinforce a Fact

```python
from tdos_memory import reinforce_fact

# User mentions fact again → boost confidence
reinforce_fact(
    user_id="alice",
    guild_id="server_1",
    fact_text="loves fettuccini",
    boost=0.15  # Add 0.15 to confidence (result capped at 0.95)
)
```

### Shared Narratives (Warm Memories)

```python
from tdos_memory import add_shared_narrative, get_shared_narratives

# Add a warm memory (not used for inference)
add_shared_narrative(
    user_id="alice",
    guild_id="server_1",
    memory="Abby calls Alice 'Ace the Mixer'",
    tone="playful"
)

# Retrieve warm memories
narratives = get_shared_narratives(user_id="alice", guild_id="server_1")
```

---

## Architecture

```
┌─────────────────────────────────────────────────┐
│ [L3] Consumer Agents (Abby, Clerk, Scribe)     │
└──────────────────┬──────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────┐
│ [L2] Envelope (Formatting, Caching)            │
│      get_memory_envelope()                      │
│      format_envelope_for_llm()                  │
└──────────────────┬──────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────┐
│ [L1] Extraction & Validation                   │
│      extract_facts_from_summary()               │
│      validate_fact_against_summary()            │
│      reinforce_fact()                           │
└──────────────────┬──────────────────────────────┘
                   │
┌──────────────────▼──────────────────────────────┐
│ [L0] Storage (MongoDB/Custom Backend)          │
│      discord_profiles.creative_profile          │
│      shared_narratives collection               │
└─────────────────────────────────────────────────┘
```

---

## API Reference

### `envelope` Module

#### `get_memory_envelope(user_id, guild_id=None, force_refresh=False)`

Retrieve or build a memory envelope for a user.

- **Returns**: Dict with identity, relational, recent_context
- **Cache TTL**: 900 seconds (15 minutes)
- **force_refresh**: If True, bypasses cache

#### `format_envelope_for_llm(envelope, max_facts=5)`

Format envelope into LLM-friendly text.

- **Returns**: Formatted string for system prompt
- **Example**: "User: Alice\nDomains: music_production\nKnown facts:\n • loves fettuccini (confidence: 85%)"

#### `add_memorable_fact(user_id, guild_id, fact, origin="explicit", confirmation_state="unconfirmed")`

Add a fact to user's memory.

- **origin**: "explicit", "inferred", or "reinforced"
- **confirmation_state**: "unconfirmed", "confirmed", or "contradicted"
- **Invalidates cache** automatically

#### `invalidate_cache(user_id, guild_id=None)`

Clear cached memory for a user.

---

### `extraction` Module

#### `extract_facts_from_summary(summary, user_id)`

Use LLM to extract facts from conversation summary.

- **Returns**: List of fact dicts with text, type, confidence, origin
- **Source of truth**: Summary only (never raw exchanges)
- **Anti-hallucination**: Facts validated against summary

#### `validate_fact_against_summary(fact_text, summary, min_match_ratio=0.5)`

Check if fact is grounded in summary.

- **Returns**: True if at least `min_match_ratio` (default 50%) of the fact's key words appear in the summary
- **Prevents hallucination**: LLM can't invent facts
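
To make the key-word check concrete, here is a minimal sketch of one way to implement it. How the library tokenizes text and which words it treats as "key words" are not documented here, so `STOP_WORDS`, `key_words`, and `is_grounded` are assumptions for illustration.

```python
import re

# Assumed stop-word list; the real one may differ.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "to", "of", "in", "on", "and", "for",
    "she", "he", "they", "her", "his",
}


def key_words(text):
    """Lowercase the text and keep the words that are not stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def is_grounded(fact_text, summary, min_match_ratio=0.5):
    """True if enough of the fact's key words also appear in the summary."""
    fact_words = key_words(fact_text)
    if not fact_words:
        return False  # nothing to check, so treat as ungrounded
    matched = len(fact_words & key_words(summary))
    return matched / len(fact_words) >= min_match_ratio


summary = (
    "Alice mentioned she uses FL Studio for beat making. "
    "She loves fettuccini pasta and wants to improve her mixing skills."
)
print(is_grounded("loves fettuccini", summary))
print(is_grounded("codes in Python", summary))
```

Word overlap is a coarse test: it catches facts with no support in the summary but cannot tell "loves fettuccini" from "hates fettuccini" when both words appear.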

#### `analyze_conversation_patterns(summary, user_id, existing_profile=None)`

Infer behavioral patterns from summary.

- **Returns**: Proposed updates with confidence
- **NOT auto-applied**: Requires confirmation if confidence < 0.8

#### `reinforce_fact(user_id, guild_id, fact_text, boost=0.15)`

Boost confidence when user mentions fact again.

- **boost**: Amount added to confidence (the result is capped at 0.95)
- **Fuzzy match**: Approximate string matching
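
A sketch of fuzzy matching followed by a capped boost, using the standard library's `difflib`. The similarity measure, the 0.8 cutoff, and the choice to mark the origin as `"reinforced"` are assumptions for this example; `find_best_match` and `reinforce` are hypothetical helpers that operate on an in-memory list rather than on storage.

```python
from difflib import SequenceMatcher


def find_best_match(fact_text, facts, min_similarity=0.8):
    """Return the stored fact most similar to fact_text, or None."""
    best, best_score = None, 0.0
    for fact in facts:
        score = SequenceMatcher(None, fact_text.lower(), fact["text"].lower()).ratio()
        if score > best_score:
            best, best_score = fact, score
    return best if best_score >= min_similarity else None


def reinforce(fact_text, facts, boost=0.15, max_confidence=0.95):
    """Boost the best-matching fact in place; return False if nothing matches."""
    match = find_best_match(fact_text, facts)
    if match is None:
        return False
    match["confidence"] = min(match["confidence"] + boost, max_confidence)
    match["origin"] = "reinforced"  # assumption: a repeated mention updates origin
    return True


facts = [{"text": "loves fettuccini", "confidence": 0.85, "origin": "explicit"}]
print(reinforce("Loves fettuccine", facts), facts[0]["confidence"])
```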

#### `add_shared_narrative(user_id, guild_id, memory, tone="playful")`

Add warm memory (not used for inference).

- **tone**: "playful", "warm", "funny", "inside_joke"
- **User-deletable**: Can be removed anytime
- **Never inferred from**: Only explicitly added

---

### `decay` Module

#### `apply_decay(facts, reference_date=None)`

Apply time-based decay to facts.

- **Removes**: Facts below 0.3 confidence
- **USER_FACT**: Decays every 30 days
- **USER_PATTERN**: Decays every 14 days
- **SHARED_NARRATIVE**: Not decayed (has own expiry)

#### `boost_confidence(fact, boost=0.15, max_confidence=0.95)`

Manually boost fact confidence.

- **Caps** at 0.95 to prevent over-assertion

#### `apply_contradiction(fact, penalty=0.5)`

Penalize fact when user contradicts it.

- **Sets**: confirmation_state to "contradicted"
- **Reduces** confidence
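
One possible reading of this behavior, sketched below, subtracts the penalty from the confidence and floors the result at zero. Whether the library subtracts or scales by the penalty is not stated here, so treat `contradict` as a hypothetical illustration only.

```python
def contradict(fact, penalty=0.5):
    """Return a copy of the fact marked as contradicted, with lower confidence."""
    updated = dict(fact)
    # Assumption: the penalty is subtracted, never dropping below zero.
    updated["confidence"] = max(0.0, fact["confidence"] - penalty)
    updated["confirmation_state"] = "contradicted"
    return updated


original = {"text": "loves fettuccini", "confidence": 0.85, "confirmation_state": "confirmed"}
corrected = contradict(original)
print(corrected["confirmation_state"], round(corrected["confidence"], 2))
```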

#### `prune_by_confidence_threshold(facts, threshold=0.3)`

Remove low-confidence facts.

- **Default**: Prunes below 0.3
- **Reversible**: Pruned facts still in storage
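
The filtering step amounts to a threshold split. In this sketch, `split_by_threshold` is a hypothetical helper that returns both halves so the caller can see what was dropped from the working set; it does not touch storage.

```python
def split_by_threshold(facts, threshold=0.3):
    """Separate facts into those kept and those pruned from the working set."""
    kept = [f for f in facts if f["confidence"] >= threshold]
    pruned = [f for f in facts if f["confidence"] < threshold]
    return kept, pruned


facts = [
    {"text": "loves fettuccini", "confidence": 0.85},
    {"text": "might play guitar", "confidence": 0.28},
    {"text": "uses FL Studio", "confidence": 0.30},
]
kept, pruned = split_by_threshold(facts)
print([f["text"] for f in kept], [f["text"] for f in pruned])
```

A fact sitting exactly at the threshold is kept, matching "prunes below 0.3".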

---

## Best Practices

### ✅ Do

- **Use summaries**: Extract from conversation summaries, not raw text
- **Validate facts**: Call `validate_fact_against_summary()` to prevent hallucination
- **Track origin**: Always set `origin` field (explicit, inferred, reinforced)
- **Set confidence**: Use realistic confidence levels (0.5-1.0, never higher)
- **Hedge language**: Say "appears to" not "definitely" for confidence < 0.85
- **Separate concerns**: Keep SHARED_NARRATIVE separate from factual inference
- **Cache wisely**: 15-minute TTL balances freshness and performance
- **Decay constantly**: Run `apply_decay()` when retrieving old facts
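
The hedging rule can be applied mechanically when turning facts into prompt text. A minimal sketch, where `phrase_fact` is a hypothetical helper and the 0.85 cutoff comes from the list above:

```python
HEDGE_BELOW = 0.85  # cutoff from the "Hedge language" guideline


def phrase_fact(fact):
    """State high-confidence facts plainly and hedge everything else."""
    if fact["confidence"] >= HEDGE_BELOW:
        return f"The user {fact['text']}."
    return f"It appears that the user {fact['text']}."


print(phrase_fact({"text": "loves fettuccini", "confidence": 0.90}))
print(phrase_fact({"text": "prefers short replies", "confidence": 0.70}))
```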

### ❌ Don't

- **Invent facts**: Never add memories not explicitly stated
- **Over-assert**: Don't treat 0.7 confidence like 0.99
- **Forget decay**: Don't keep old facts at full confidence forever
- **Confuse types**: Don't use SHARED_NARRATIVE for inference
- **Trust LLM extraction alone**: Always validate with `validate_fact_against_summary()`
- **Mutate envelopes**: Copy before modifying
- **Bypass storage**: Don't assume cache = source of truth
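
On the "copy before modifying" point: a cached envelope may be shared between callers, so edits made in place could leak into later reads. A sketch using `copy.deepcopy`, with `with_extra_fact` as a hypothetical helper:

```python
import copy


def with_extra_fact(envelope, fact):
    """Return a modified deep copy, leaving the (possibly cached) original intact."""
    updated = copy.deepcopy(envelope)
    updated["relational"]["memorable_facts"].append(fact)
    return updated


cached = {
    "identity": {"name": "Alice", "user_id": "123"},
    "relational": {"memorable_facts": [{"text": "loves fettuccini", "confidence": 0.85}]},
}
draft = with_extra_fact(cached, {"text": "uses FL Studio", "confidence": 0.92})
print(len(cached["relational"]["memorable_facts"]), len(draft["relational"]["memorable_facts"]))
```

A shallow copy would not be enough here, because the nested `memorable_facts` list would still be shared with the original.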

---

## Design Philosophy

### 1. **Summary-as-Source-of-Truth**

Extract from conversation summaries, never raw exchanges.

**Why?**

- Reduces hallucination (less source text for the model to misread or embellish)
- Makes memory auditable (summaries are readable)
- Enables offline extraction (summaries are stable)

### 2. **Explicit Confidence Scoring**

All facts carry visible trust levels.

**Why?**

- LLM adjusts behavior based on confidence
- Users understand why AI made decisions
- Prevents over-assertion

### 3. **Anti-Hallucination Validation**

Facts must be grounded in source text.

**Why?**

- Prevents LLM from inventing memories
- Makes facts auditable
- Creates accountability

### 4. **Typed Memory Classes**

Facts ≠ patterns ≠ warmth.

**Why?**

- Prevents creepiness (warm memories isolated from inference)
- Enables different decay rates
- Makes storage efficient

### 5. **Time-Based Forgetting**

Confidence decays over time.

**Why?**

- Memories naturally fade
- Prevents stale facts from dominating
- Matches human psychology

### 6. **Origin & Confirmation Tracking**

Know how and whether facts were validated.

**Why?**

- Enables contradiction handling
- Supports user agency (can say "you're wrong")
- Enables future ML (learn which sources are reliable)

---

## FAQ

### Q: Can TDOS Memory replace vector databases (RAG)?

**A**: No, they're complementary.

- **TDOS**: User-specific memories (facts about the user)
- **RAG**: General knowledge retrieval (facts about topics)

Use both: TDOS for "what do I know about this user?" + RAG for "what do I know about this topic?"

### Q: Why not store all conversation history?

**A**: Three reasons:

1. **Cost**: Storage scales with message volume
2. **Context window**: Can't fit all history in LLM
3. **Privacy**: Users may want old conversations forgotten

Summaries are the sweet spot: compressed, auditable, privacy-respecting.

### Q: What if the LLM extracts wrong facts?

**A**: That's why we validate.

```python
facts = extract_facts_from_summary(summary, user_id)
# LLM might invent: "User codes in Python"
# Validation fails if "Python" not in summary
# Fact is rejected ✅
```

### Q: Can users delete memories?

**A**: Yes. Facts and shared narratives can both be removed:

- **USER_FACT**: Can be contradicted or manually deleted
- **SHARED_NARRATIVE**: Explicitly user-deletable

```python
delete_shared_narrative(user_id, "Abby calls them Ace")
```

### Q: How do I integrate with my own database?

**A**: Pass a custom `storage_client`:

```python
get_memory_envelope(user_id, storage_client=my_db_client)
add_memorable_fact(..., storage_client=my_db_client)
```

TDOS is storage-agnostic. MongoDB is optional.

### Q: Does TDOS work offline?

**A**: Partially.

- **Envelope retrieval**: Yes (cached in-memory)
- **Fact extraction**: No (requires LLM)
- **Decay application**: Yes (purely computational)

---

## Contributing

TDOS is Apache 2.0 licensed. Contributions welcome!

Areas for improvement:

- [ ] Embedding-based validation (better hallucination detection)
- [ ] Web UI for memory management
- [ ] PostgreSQL/MySQL storage backends
- [ ] Async/await support
- [ ] Memory compression (summarize old facts)
- [ ] Contradiction resolution (pick correct version)
- [ ] Memory export/import

---

## Changelog

### v1.0.0 (2026-01-01)

- ✅ Typed memory classes (USER_FACT, USER_PATTERN, SHARED_NARRATIVE)
- ✅ Confidence scoring and gating
- ✅ Anti-hallucination validation
- ✅ Origin and confirmation_state tracking
- ✅ Time-based decay
- ✅ LLM-based extraction
- ✅ Shared narrative (warm memories) support
- ✅ MongoDB integration
- ✅ Envelope caching (15-minute TTL)
- ✅ Production-ready

---

## License

Apache 2.0 — Attribution required.

See LICENSE file for details.

---

## Authors

- **TownesDev** — Original design and implementation

---

## Acknowledgments

This system was developed for **Abby**, a Discord AI companion.
It's now a standalone library for any agent that needs safe, auditable memory.

**Philosophy**: Make memory systems as transparent as the agents that use them.
