Metadata-Version: 2.4
Name: mcal-ai
Version: 0.2.8
Summary: Memory-Context Alignment Layer for Goal-Coherent AI Agents
Author: MCAL Team
License: MIT
Project-URL: Homepage, https://github.com/Shivakoreddi/mcal-ai
Project-URL: Documentation, https://github.com/Shivakoreddi/mcal-ai#readme
Project-URL: Repository, https://github.com/Shivakoreddi/mcal-ai.git
Project-URL: Issues, https://github.com/Shivakoreddi/mcal-ai/issues
Keywords: llm,memory,agents,context,ai,nlp
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anthropic>=0.18.0
Requires-Dist: openai>=1.0.0
Requires-Dist: boto3>=1.28.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: faiss-cpu>=1.7.4
Requires-Dist: sentence-transformers>=2.2.0
Requires-Dist: sqlalchemy>=2.0.0
Requires-Dist: aiosqlite>=0.19.0
Requires-Dist: tiktoken>=0.5.0
Requires-Dist: tenacity>=8.2.0
Requires-Dist: rich>=13.0.0
Requires-Dist: python-dotenv>=1.0.0
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.0.40; extra == "langgraph"
Requires-Dist: langchain-core>=0.1.0; extra == "langgraph"
Provides-Extra: crewai
Requires-Dist: crewai>=0.28.0; extra == "crewai"
Provides-Extra: autogen
Requires-Dist: pyautogen>=0.2.0; extra == "autogen"
Provides-Extra: langchain
Requires-Dist: langchain>=0.1.0; extra == "langchain"
Requires-Dist: langchain-core>=0.1.0; extra == "langchain"
Provides-Extra: integrations
Requires-Dist: langgraph>=0.0.40; extra == "integrations"
Requires-Dist: langchain-core>=0.1.0; extra == "integrations"
Requires-Dist: crewai>=0.28.0; extra == "integrations"
Requires-Dist: pyautogen>=0.2.0; extra == "integrations"
Provides-Extra: mem0
Requires-Dist: mem0ai>=0.1.0; extra == "mem0"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: pre-commit>=3.4.0; extra == "dev"
Requires-Dist: ipykernel>=6.25.0; extra == "dev"
Requires-Dist: jupyter>=1.0.0; extra == "dev"
Provides-Extra: eval
Requires-Dist: pandas>=2.0.0; extra == "eval"
Requires-Dist: matplotlib>=3.7.0; extra == "eval"
Requires-Dist: seaborn>=0.12.0; extra == "eval"
Requires-Dist: wandb>=0.15.0; extra == "eval"
Requires-Dist: scipy>=1.11.0; extra == "eval"
Provides-Extra: all
Requires-Dist: mcal[dev,eval,integrations]; extra == "all"
Dynamic: license-file

# MCAL: Memory-Context Alignment Layer

> **Intent-Preserving Memory for Goal-Coherent AI Agents**

[![PyPI](https://img.shields.io/pypi/v/mcal-ai.svg)](https://pypi.org/project/mcal-ai/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Why MCAL?

Current AI memory systems store **facts** but lose **meaning**:

| What's Stored | What's Lost |
|---------------|-------------|
| "User chose PostgreSQL" | **WHY** they chose it over MongoDB |
| "User wants to visit Japan" | **HOW** this fits their overall travel goals |

MCAL preserves the **reasoning behind decisions**, not just the conclusions.

## Installation

```bash
pip install mcal-ai
```

**Framework integrations:**
```bash
pip install mcal-ai-langgraph  # LangGraph integration
pip install mcal-ai-crewai     # CrewAI integration  
pip install mcal-ai-autogen    # AutoGen integration
```

## Quick Start

```python
import asyncio
from mcal import MCAL

async def main():
    mcal = MCAL(llm_provider="anthropic")  # or "openai", "bedrock"
    
    messages = [
        {"role": "user", "content": "I'm building a fraud detection pipeline"},
        {"role": "assistant", "content": "Let's start with data ingestion..."},
        {"role": "user", "content": "I chose PostgreSQL over MongoDB for storage"},
    ]
    
    # Extract goals, decisions, and reasoning
    result = await mcal.add(messages, user_id="user_123")
    print(f"Extracted {result.unified_graph.node_count} nodes")
    
    # Search with goal-aware retrieval
    results = await mcal.search("What database?", user_id="user_123")
    
    # Get context for LLM prompts
    context = await mcal.get_context("What's next?", user_id="user_123")

asyncio.run(main())
```

## Key Features

- **Intent Graph** — Hierarchical goal structures (Mission → Goal → Task)
- **Reasoning Chains** — Store WHY decisions were made, not just conclusions
- **Goal-Aware Retrieval** — Retrieve based on objective alignment, not just similarity
- **Extraction Profiles** — `decision` (rationale/alternatives/trade-offs), `conversational` (preferences/relationships), `comprehensive` (both)
- **Hybrid Retrieval** — Graph traversal + ChunkStore embedding search for higher recall
- **Multi-Provider** — Works with Anthropic, OpenAI, and AWS Bedrock
- **Standalone Storage** — Built-in JSON and SQLite persistence, no external services needed
- **Thread-Safe** — Safe for concurrent multi-user access
- **Tiered Models** — Fast/smart model routing for cost-efficient extraction
- **Extraction Cache** — Skip redundant LLM calls with per-turn cache hits
- **Graph Compaction** — Automatic deduplication with FACT/PERSON node protection
- **GDPR-Ready** — Full user data erasure with `clear_user_data()`

## Configuration

```python
mcal = MCAL(
    llm_provider="bedrock",              # "openai", "anthropic", or "bedrock"
    anthropic_api_key="sk-ant-...",       # Or set ANTHROPIC_API_KEY env var
    openai_api_key="sk-...",              # Or set OPENAI_API_KEY env var
    storage_path="~/.mcal",              # Persistent storage location
    enable_persistence=True,              # Cross-session persistence (default)
    max_graph_nodes=500,                  # Max nodes per user graph
    # Extraction profiles — choose the right depth for your domain
    extraction_profile="decision",        # "decision", "conversational", "comprehensive"
    # Hybrid retrieval — embedding search over raw text chunks
    enable_chunk_store=True,              # Combine graph + chunk retrieval
    chunk_size=512,                       # Tokens per chunk
    chunk_overlap=64,                     # Overlap between chunks
    max_chunks_per_user=1000,             # Max chunks stored per user
    # Bedrock options
    bedrock_model="llama-3.3-70b",        # Default extraction model
    bedrock_region="us-east-1",           # AWS region
    # Tiered model routing (fast for simple, smart for complex)
    enable_tiered_extraction=True,
    bedrock_fast_model="llama-3.1-8b",
    bedrock_smart_model="llama-3.3-70b",
    # Extraction cache
    enable_extraction_cache=True,         # Skip repeated LLM calls (default)
    cache_ttl_seconds=86400,              # Cache lifetime (default: 24h)
    # Graph compaction
    compaction_policy="moderate",         # "none", "moderate", or "aggressive"
    compaction_interval=10,               # Compact every N turns
)
```
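
For intuition about how `chunk_size` and `chunk_overlap` interact, the sketch below splits a token list into overlapping windows. It assumes a plain sliding window over an already tokenized sequence; `chunk_tokens` is a hypothetical helper, and MCAL's actual tokenizer and chunking strategy may differ.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=64):
    """Split a token sequence into windows that share `chunk_overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


tokens = list(range(1000))  # stand-in for real token ids
chunks = chunk_tokens(tokens, chunk_size=512, chunk_overlap=64)
print([len(c) for c in chunks])
```

With these values each window advances by 448 tokens, so the last 64 tokens of one chunk reappear at the start of the next. A larger overlap reduces the chance that a statement is cut in half at a chunk boundary, at the cost of storing more chunks per user.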

## LangGraph Integration

```python
import asyncio

from mcal import MCAL
from mcal_langgraph import MCALMemory, MCALMemoryConfig, MCALStore

# Declarative config: all Bedrock/tiered params supported
config = MCALMemoryConfig(
    llm_provider="bedrock",
    bedrock_model="llama-3.3-70b",
    bedrock_region="us-east-1",
    enable_tiered_extraction=True,
    enable_persistence=True,
)


async def main():
    # Or create directly
    mcal = MCAL(llm_provider="bedrock")
    memory = MCALMemory(mcal=mcal, user_id="user_123")

    # Use as LangGraph BaseStore
    store = MCALStore(mcal)

    # Memory accepts both LangChain Message objects and plain dicts
    await memory.add([
        {"role": "user", "content": "We chose Kafka for event streaming"},
        {"role": "assistant", "content": "Good choice for high throughput."},
    ])

    # Search returns goal-aware results
    results = await memory.search("streaming architecture")


# The awaited calls must run inside a coroutine
asyncio.run(main())
```

## User Data Management

```python
# Full user data erasure (GDPR Article 17 compliant)
await mcal.clear_user_data("user_123")
# Removes all files, graphs, caches, and in-memory state for the user

# In-memory-only mode — zero disk writes
mcal = MCAL(
    llm_provider="openai",
    storage_path="/tmp/session",
    enable_persistence=False,   # No files written to disk
)
```
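
The 0.2.7 notes say that `clear_user_data()` removes the entire user directory from disk. A rough sketch of that kind of erasure is shown below, assuming one directory per user under the storage root. The layout, the `graph.json` file name, and the `erase_user_dir` helper are assumptions for illustration, not MCAL's implementation.

```python
import shutil
import tempfile
from pathlib import Path


def erase_user_dir(storage_root: Path, user_id: str) -> bool:
    """Delete the user's directory; return True if something was removed."""
    root = storage_root.resolve()
    user_dir = (storage_root / user_id).resolve()
    # Refuse ids such as "../other" that would resolve outside the root.
    if root not in user_dir.parents:
        raise ValueError(f"invalid user_id: {user_id!r}")
    if not user_dir.exists():
        return False
    shutil.rmtree(user_dir)
    return True


# Demonstrate against a throwaway directory rather than real data.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "user_123").mkdir()
    (root / "user_123" / "graph.json").write_text("{}")
    removed = erase_user_dir(root, "user_123")
    still_there = (root / "user_123").exists()

print(removed, still_there)
```

Removing the whole directory, rather than individual files, means files added by later versions are covered without updating the erasure code.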

## Environment Variables

```bash
# Choose your LLM provider
ANTHROPIC_API_KEY=sk-ant-...    # For Claude
OPENAI_API_KEY=sk-...           # For GPT-4 / embeddings

# Optional: AWS Bedrock
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1
```

MCAL auto-detects API keys from environment variables when not passed explicitly.
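
The lookup order described above (explicit argument first, environment variable second) can be sketched as follows. `resolve_api_key` and `ENV_VARS` are hypothetical names used only for this example; they are not part of MCAL.

```python
import os

# Environment variable names taken from the list above.
ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def resolve_api_key(provider, explicit_key=None, environ=None):
    """Prefer an explicitly passed key, then fall back to the environment."""
    environ = os.environ if environ is None else environ
    if explicit_key:
        return explicit_key
    env_name = ENV_VARS.get(provider)
    # Bedrock has no single API key here; boto3 reads the AWS_* variables itself.
    return environ.get(env_name) if env_name else None


fake_env = {"OPENAI_API_KEY": "sk-from-env"}
from_env = resolve_api_key("openai", environ=fake_env)
from_arg = resolve_api_key("openai", explicit_key="sk-explicit", environ=fake_env)
missing = resolve_api_key("anthropic", environ=fake_env)
print(from_env, from_arg, missing)
```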

## What's New in 0.2.8

- **Configurable Extraction Profiles** (#139) — Choose `decision`, `conversational`, or `comprehensive` profiles to optimize extraction for your domain. Decision profile achieves **93.3% DRR** on CTO advisory benchmarks, beating Mem0's 91.1%
- **Hybrid Retrieval with ChunkStore** (#138) — Optional `enable_chunk_store=True` adds embedding-based retrieval over raw text chunks alongside graph search, boosting recall by 28% on LoCoMo benchmarks
- **FACT/PERSON Node Protection** (#140) — Graph compaction now preserves FACT and PERSON nodes that anchor information (numeric values, names, roles), preventing data loss during merges
- **Benchmark Results** — MCAL Decision profile: 93.3% DRR, 62.2% token reduction, 12-14x faster than Mem0 across all profiles
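
As a rough illustration of FACT/PERSON node protection, the sketch below deduplicates nodes by normalized label while never merging protected types. It makes several assumptions: nodes are plain dicts, duplicates are detected by exact match on a normalized label (MCAL's notes mention semantic merges, which this does not attempt), and `DECISION` is an invented node type.

```python
PROTECTED_TYPES = {"FACT", "PERSON"}


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so near-identical labels compare equal."""
    return " ".join(label.lower().split())


def compact(nodes):
    """Drop duplicate nodes, but always keep FACT and PERSON nodes."""
    seen = set()
    kept = []
    for node in nodes:
        if node["type"] in PROTECTED_TYPES:
            kept.append(node)  # protected: never merged away
            continue
        key = (node["type"], normalize(node["label"]))
        if key in seen:
            continue  # duplicate of an earlier node
        seen.add(key)
        kept.append(node)
    return kept


nodes = [
    {"type": "DECISION", "label": "Use PostgreSQL"},
    {"type": "DECISION", "label": "use  postgresql"},
    {"type": "FACT", "label": "Budget: 50k"},
    {"type": "FACT", "label": "budget: 50k"},
    {"type": "PERSON", "label": "Alice"},
]
compacted = compact(nodes)
print(len(nodes), "->", len(compacted))
```

The two `DECISION` nodes collapse into one, while both `FACT` nodes survive even though their labels normalize to the same string.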

## What's New in 0.2.7

- **GDPR-compliant data erasure** — `clear_user_data()` now removes the entire user directory from disk, not just the graph file
- **Dict-format message support** — `MCALMemory` accepts plain dicts (`{"role": "user", "content": "..."}`) alongside LangChain Message objects
- **In-memory-only mode** — `enable_persistence=False` guarantees zero disk writes, including extraction cache
- **Full `MCALMemoryConfig`** — Declarative config now supports all Bedrock, tiered model, persistence, and cache parameters
- **Tiered routing accuracy** — Complexity classifier now routes on raw user messages for reliable fast/smart splits
- **Extraction cache per-turn support** — Cache hits work with agents that pass one turn at a time (not full history)
- **Deduplication improvements** — Enhanced label normalization and forced embedding loads for reliable semantic merges
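
One way to picture per-turn cache hits is a cache keyed on the content of a single turn, with entries that expire after a TTL. The sketch below assumes that design; `TurnCache` is a hypothetical class, and MCAL's actual cache keys and storage are not documented here.

```python
import hashlib
import json
import time


class TurnCache:
    """Cache extraction results per message turn, with a time-to-live."""

    def __init__(self, ttl_seconds=86400, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}

    @staticmethod
    def key_for(turn: dict) -> str:
        # Hash one turn, not the whole history, so single-turn calls can hit.
        payload = json.dumps(turn, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, turn: dict):
        key = self.key_for(turn)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._entries[key]  # expired
            return None
        return value

    def put(self, turn: dict, value) -> None:
        self._entries[self.key_for(turn)] = (self.clock(), value)


cache = TurnCache(ttl_seconds=86400)
turn = {"role": "user", "content": "I chose PostgreSQL over MongoDB"}
first_lookup = cache.get(turn)          # miss: nothing stored yet
cache.put(turn, {"nodes": 2})           # store a stand-in extraction result
second_lookup = cache.get(dict(turn))   # hit: same content, different object
print(first_lookup, second_lookup)
```

Because the key depends only on the turn's content, an agent that passes one turn at a time gets the same hit as one that replays the full history.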

## Documentation

- [GitHub Repository](https://github.com/Shivakoreddi/mcal-ai)
- [Design Document](https://github.com/Shivakoreddi/mcal-ai/blob/main/docs/MCAL_DESIGN.md)
- [LLM Selection Guide](https://github.com/Shivakoreddi/mcal-ai/blob/main/docs/LLM_SELECTION_GUIDE.md)
- [LangGraph Integration](https://github.com/Shivakoreddi/mcal-ai/blob/main/docs/integrations/langgraph.md)

## License

MIT License — see [LICENSE](https://github.com/Shivakoreddi/mcal-ai/blob/main/LICENSE) for details.

## Author

Created by [Shiva Koreddi](https://github.com/Shivakoreddi)
