Metadata-Version: 2.4
Name: cogniq
Version: 0.2.0
Summary: Next-generation hybrid RAG: Embeddings + Knowledge Graph. Zero-config, production-ready.
Author: cogniq contributors
License: MIT
Project-URL: Homepage, https://github.com/yourusername/cogniq
Project-URL: Repository, https://github.com/yourusername/cogniq
Project-URL: Issues, https://github.com/yourusername/cogniq/issues
Keywords: rag,embeddings,knowledge-graph,nlp,ai,retrieval,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Provides-Extra: sentence
Requires-Dist: sentence-transformers>=2.0; extra == "sentence"
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: faiss
Requires-Dist: faiss-cpu>=1.7; extra == "faiss"
Provides-Extra: full
Requires-Dist: sentence-transformers>=2.0; extra == "full"
Requires-Dist: openai>=1.0; extra == "full"
Requires-Dist: faiss-cpu>=1.7; extra == "full"
Requires-Dist: anthropic>=0.20; extra == "full"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Requires-Dist: twine>=4.0; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Dynamic: license-file

# CogniQ 🧠

**Next-generation hybrid RAG. 3 lines to start. Infinite to extend.**

```bash
pip install cogniq
```

---

## Why CogniQ?

| Feature | LangChain RAG | LlamaIndex | **CogniQ** |
|---|---|---|---|
| Lines to start | 30+ | 20+ | **3** |
| Minimal core (NumPy only) | ❌ | ❌ | **✅** |
| Knowledge graph layer | ❌ | partial | **✅** |
| Plugin system | complex | complex | **✅ simple** |
| Multiple LLM backends | yes | yes | **✅ built-in** |
| Search speed (100K docs) | ~50ms | ~40ms | **~1ms** |

---

## Quickstart (30 seconds)

```python
from cogniq import RAG

r = RAG()
r += "Banks must maintain LCR of 100% at all times per RBI guidelines."
r += "Failure to maintain LCR attracts penalty and supervisory action."

result = r("What happens if a bank doesn't maintain LCR?")
print(result)
# → "Failure to maintain LCR attracts penalty and supervisory action."
```

---

## Full API

### Add documents

```python
r = RAG()

r += "plain text string"                  # string
r.add("text", source="rbi_circular.txt") # with metadata
r.add("document.txt")                    # file path
r.add(["doc1", "doc2", "doc3"])          # batch
r.learn("text")                          # alias for add
```

### Search & Ask

```python
# Search (returns SearchResult list)
results = r.search("LCR requirements", top_k=5)
results = r["LCR requirements"]           # same, operator syntax

for result in results:
    print(result.score, result.text, result.source)

# Ask (returns AskResult)
answer = r.ask("What is LCR?")
answer = r("What is LCR?")               # same, operator syntax

print(answer)                            # prints the answer text
answer.show()                            # pretty print with sources
print(answer.sources)                    # list of SearchResult
print(answer.latency_ms)                 # query time
```

### Attach an LLM

```python
# Ollama (local, free)
r.use_ollama("llama3.2")

# OpenAI
r.use_openai("gpt-4o-mini")

# Anthropic
r.use_anthropic("claude-haiku-4-5-20251001")

# Any custom function
def my_llm(question, context, **kwargs):
    return call_my_model(question, context)
r.use_llm(my_llm)
```
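The custom-function hook above relies on an undefined `call_my_model`. As a minimal sketch of a callable with the same `(question, context, **kwargs)` signature, the function below simply returns the context sentence that shares the most words with the question. It assumes `context` arrives as a single string, which this README does not state, and `extractive_llm` is a hypothetical name rather than part of CogniQ.

```python
import re


def _tokens(text):
    """Lowercased alphanumeric word set for a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extractive_llm(question, context, **kwargs):
    """Return the context sentence with the largest word overlap with the question.

    Assumption: `context` is one string containing the retrieved chunks.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    if not sentences:
        return "No context available."
    q = _tokens(question)
    return max(sentences, key=lambda s: len(q & _tokens(s)))


context = (
    "Banks must maintain LCR of 100% at all times per RBI guidelines. "
    "Failure to maintain LCR attracts penalty and supervisory action."
)
print(extractive_llm("What penalty follows failure to maintain LCR?", context))
```

A function like this could be passed to `r.use_llm(...)` in place of `my_llm` for offline testing, since it needs no model server.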

### Choose embedder

```python
# Auto: tries sentence-transformers, then Ollama, then falls back to TF-IDF
r = RAG(embedder="auto")

# Force sentence-transformers (pip install sentence-transformers)
r = RAG(embedder="sentence", model="all-MiniLM-L6-v2")

# OpenAI
r = RAG(embedder="openai")

# Ollama
r = RAG(embedder="ollama", model="nomic-embed-text")

# Pure TF-IDF (no dependencies beyond NumPy; works well on domain-specific text)
r = RAG(embedder="tfidf")

# Your own function
from cogniq import CustomEmbedder
r = RAG(embedder=CustomEmbedder(my_fn, dim=768))
```
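The `my_fn` passed to `CustomEmbedder` above is left undefined. One possible shape for such a function is a feature-hashing embedder, sketched below. The name `hashing_embed` is hypothetical, and the sketch assumes the embedder is called with one string and should return a vector of length `dim`; whether `CustomEmbedder` passes single strings or batches is not stated here.

```python
import hashlib
import re

import numpy as np

DIM = 768  # matches dim=768 in the example above


def hashing_embed(text: str, dim: int = DIM) -> np.ndarray:
    """Map text to a unit-length vector using the signed hashing trick."""
    vec = np.zeros(dim, dtype=np.float32)
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # blake2b is deterministic across runs, unlike Python's built-in hash().
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        bucket = int.from_bytes(digest[:4], "little") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = float(np.linalg.norm(vec))
    # Empty or fully cancelled input stays a zero vector.
    return vec / norm if norm > 0 else vec


a = hashing_embed("LCR penalty for banks")
b = hashing_embed("penalty for banks under LCR")
print(a.shape, float(a @ b))
```

Because the output is already unit-length, the dot product of two vectors is their cosine similarity.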

### Choose store

```python
r = RAG(store="memory")   # default, numpy-optimized
r = RAG(store="faiss")    # pip install faiss-cpu, for 100K+ docs
```

### Plugins — extend anything

```python
from cogniq.plugins import Plugin, register_plugin

@register_plugin("my_reranker")
class MyReranker(Plugin):
    """Custom post-search reranker"""

    def post_search(self, results, query, **kw):
        # Your reranking logic
        return sorted(results, key=lambda r: my_score(r, query), reverse=True)

    def pre_add(self, text, meta):
        # Clean text before adding: collapse runs of whitespace
        text = " ".join(text.split())
        return text, meta

# Use it
r = RAG(plugins=["my_reranker"])

# Built-in plugins
r = RAG(plugins=["dedup"])              # remove near-duplicate results
r = RAG(plugins=["min_score"])          # filter below threshold
r = RAG(plugins=["text_cleaner"])       # strip HTML, normalize whitespace
r = RAG(plugins=[
    "text_cleaner",
    "dedup",
    MinScorePlugin(threshold=0.4),      # instance with params (the class must be imported first)
    my_custom_plugin,                   # your own plugin instance or callable
])
```
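To illustrate what a dedup-style `post_search` hook might do, here is a minimal sketch that drops near-duplicate results using Jaccard similarity over word sets. It is not CogniQ's built-in `dedup` implementation. `Hit`, `jaccard`, `drop_near_duplicates`, and the `threshold` default are hypothetical; the only assumption taken from this README is that results expose `.text` and `.score`.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical stand-in for a search result with .text and .score."""
    text: str
    score: float


def _tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection over size of the union of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_near_duplicates(results, threshold=0.8):
    """Keep the highest-scoring result from each group of near-identical texts."""
    kept, kept_tokens = [], []
    for hit in sorted(results, key=lambda h: h.score, reverse=True):
        tokens = _tokens(hit.text)
        # Keep the hit only if it is sufficiently different from everything kept so far.
        if all(jaccard(tokens, seen) < threshold for seen in kept_tokens):
            kept.append(hit)
            kept_tokens.append(tokens)
    return kept


hits = [
    Hit("Banks must maintain LCR of 100%.", 0.90),
    Hit("Banks must maintain  LCR of 100%", 0.85),
    Hit("Failure attracts penalty.", 0.70),
]
for hit in drop_near_duplicates(hits):
    print(hit.score, hit.text)
```

Inside a plugin, the body of `post_search` could return `drop_near_duplicates(results)` directly.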

### Knowledge Graph

```python
r = RAG(graph=True)   # enabled by default
r += "Risk causes liquidity stress. Stress triggers penalty."

# Graph automatically extracted — boosts relevant results
# Manual additions:
r._graph.add_entity("LCR", "regulation")
r._graph.add_relation("LCR_violation", "penalty", "triggers")

# Inspect
print(r._graph.stats())
paths = r._graph.find_paths("risk", "penalty")
```
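For intuition about what a path query such as `find_paths("risk", "penalty")` involves, the sketch below runs a breadth-first search over a toy relation graph. `TinyGraph` and `max_hops` are hypothetical and do not describe CogniQ's internal graph class; only the `add_relation(source, target, relation)` argument order mirrors the call shown above.

```python
from collections import defaultdict, deque


class TinyGraph:
    """Hypothetical stand-in for a knowledge graph; not CogniQ's internal class."""

    def __init__(self):
        self.edges = defaultdict(list)  # source -> [(relation, target), ...]

    def add_relation(self, source, target, relation):
        self.edges[source].append((relation, target))

    def find_paths(self, start, goal, max_hops=3):
        """Breadth-first search for simple paths of at most max_hops edges.

        Each path is a list alternating nodes and relations:
        [node, relation, node, relation, node, ...]
        """
        paths = []
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == goal and len(path) > 1:
                paths.append(path)
                continue
            if (len(path) - 1) // 2 >= max_hops:
                continue  # hop budget exhausted
            for relation, target in self.edges.get(node, []):
                if target not in path[::2]:  # avoid revisiting nodes (no cycles)
                    queue.append(path + [relation, target])
        return paths


g = TinyGraph()
g.add_relation("risk", "liquidity stress", "causes")
g.add_relation("liquidity stress", "penalty", "triggers")
print(g.find_paths("risk", "penalty"))
```

Breadth-first order means shorter paths are found first, which is usually what a relevance boost wants.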

### Save / Load

```python
r.save("my_rag.pkl")
r2 = RAG.load("my_rag.pkl")
r2.use_ollama("llama3.2")  # reattach LLM after load
```

### Global API (no class needed)

```python
import cogniq

cogniq.add("document text")
cogniq.add("more docs")
result = cogniq.ask("question?")
results = cogniq.search("keyword")
cogniq.reset()  # clear
```

### CLI

```bash
cogniq add "your text here"
cogniq add --file document.txt
cogniq ask "What is LCR?"
cogniq ask "What is LCR?" --sources    # show source chunks
cogniq search "liquidity"
cogniq info
cogniq reset
```

---

## ARJUNA / CCIL Integration Example

```python
import glob

from cogniq import RAG

# Build CCIL regulatory RAG
rag = RAG(
    embedder="sentence",
    store="faiss",          # large corpus
    chunker="smart",
    graph=True,
    plugins=["dedup", "min_score"],
)

# Load circulars
for path in glob.glob("circulars/*.txt"):
    rag.add(path)

# Attach local LLM
rag.use_ollama("llama3.2")

# Query
result = rag("What is the penalty for LCR violation?")
result.show()

# Save
rag.save("ccil_rag.pkl")
```

---

## Performance

| Scale | Search Time | Memory |
|---|---|---|
| 1K docs | ~0.01ms | ~5 MB |
| 10K docs | ~0.1ms | ~50 MB |
| 100K docs | ~1ms | ~500 MB |
| 1M docs | ~10ms (FAISS) | ~2 GB |

**Why so fast?**
- Pre-normalized vectors → cosine = dot product (no division)
- `np.argpartition` O(n) top-k (no full sort)
- Single BLAS call for all similarities
- LRU embedding cache
- FAISS ANN for large scale
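
The first three points can be sketched in plain NumPy. This is an illustration of the technique under stated assumptions, not CogniQ's actual store code; `normalize_rows` and `top_k_cosine` are hypothetical names.

```python
import numpy as np


def normalize_rows(matrix: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so cosine similarity becomes a dot product."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix / np.maximum(norms, 1e-12)


def top_k_cosine(index: np.ndarray, query: np.ndarray, k: int = 5):
    """Return (row_ids, scores) for the k best matches, best first.

    `index` must already be row-normalized.
    """
    q = query / max(float(np.linalg.norm(query)), 1e-12)
    scores = index @ q  # one matrix-vector product covers every document
    k = max(0, min(k, scores.shape[0]))
    if k == 0:
        return np.empty(0, dtype=np.intp), np.empty(0, dtype=scores.dtype)
    # argpartition moves the k largest scores into the last k slots in O(n).
    candidates = np.argpartition(scores, -k)[-k:]
    # Only those k candidates are fully sorted.
    order = candidates[np.argsort(scores[candidates])[::-1]]
    return order, scores[order]


rng = np.random.default_rng(0)
docs = normalize_rows(rng.standard_normal((10_000, 64)).astype(np.float32))
ids, scores = top_k_cosine(docs, docs[42], k=3)
print(ids, scores)
```

Normalizing once at insert time moves the division out of the query path, so each search costs one matrix-vector product plus a partial selection.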

---

## Installation Options

```bash
pip install cogniq                          # NumPy only (TF-IDF)
pip install "cogniq[sentence]"              # + sentence-transformers
pip install "cogniq[openai]"                # + OpenAI
pip install "cogniq[faiss]"                 # + FAISS
pip install "cogniq[full]"                  # all of the above + Anthropic
```

---

## License

MIT
