Metadata-Version: 2.4
Name: condensr
Version: 0.2.3
Summary: Summarize books chapter by chapter using AI
License-Expression: AGPL-3.0-or-later
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pymupdf>=1.24
Requires-Dist: mistralai<2.0,>=1.0
Requires-Dist: click>=8.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-mock; extra == "dev"
Requires-Dist: commitizen; extra == "dev"
Dynamic: license-file

# Condensr

Summarize books chapter by chapter using AI.

Condensr takes a PDF book as input and produces a structured Markdown summary. It detects chapters automatically and summarizes each one separately with Mistral AI, capping each summary at roughly 500 words.

![Condensr Screenshot](docs/Screenshot.png)

## Installation

```bash
pip install condensr
```

## Quick Start

### Python API

```python
import condensr

# Preview detected chapters (no API calls)
chapters = condensr.get_chapters("book.pdf")
print(chapters)
# ["Introduction", "Chapter 1: Origins", "Chapter 2: Growth"]

# Summarize chapter by chapter
for title, summary in condensr.summarize("book.pdf"):
    print(f"## {title}\n{summary}\n")
```

### CLI

```bash
condensr book.pdf
# Writes book-summary.md and prints progress while it runs
```

## API Reference

### `condensr.get_chapters(pdf_path)`

Detect and return chapter titles from a PDF. Uses heuristic detection only (no API calls). Returns an empty list if no chapters are found.

**Parameters:**
- `pdf_path` (str) — Path to the PDF file.

**Returns:** `list[str]` — Chapter titles.

### `condensr.summarize(pdf_path, *, model, api_key, on_chapter)`

Summarize a PDF book chapter by chapter. Returns a generator yielding `(title, summary_markdown)` tuples.

If no chapters are detected, summarizes the entire book as one unit.

**Parameters:**
- `pdf_path` (str) — Path to the PDF file.
- `model` (str) — Mistral model name. Default: `"mistral-small-latest"`.
- `api_key` (str | None) — Mistral API key. Falls back to `MISTRAL_API_KEY` env var.
- `on_chapter` (callable | None) — Optional `callback(title, summary)` fired before each yield.

**Yields:** `tuple[str, str]` — `(chapter_title, summary_markdown)`.
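
Since `summarize` yields plain `(title, summary)` tuples, the results can be assembled into a single Markdown document with a few lines of code. The sketch below is one way to do that and is not Condensr's own output logic. `render_markdown` and `fake_summaries` are hypothetical names introduced for illustration; `fake_summaries` stands in for `condensr.summarize("book.pdf")` so the example runs without a PDF or an API key.

```python
from collections.abc import Iterable, Iterator


def render_markdown(pairs: Iterable[tuple[str, str]]) -> str:
    """Join (title, summary) tuples into one Markdown document."""
    sections = [f"## {title}\n\n{summary.strip()}\n" for title, summary in pairs]
    return "\n".join(sections)


def fake_summaries() -> Iterator[tuple[str, str]]:
    # Hypothetical stand-in for condensr.summarize("book.pdf").
    yield "Introduction", "Sets out the book's argument."
    yield "Chapter 1: Origins", "Traces where the idea came from."


document = render_markdown(fake_summaries())
print(document)
```

With the real library, `render_markdown(condensr.summarize("book.pdf"))` should work the same way, assuming the generator behaves as documented above.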

#### Callback Example

```python
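import condensr

saved = {}  # stand-in for real storage such as a database

def on_chapter(title, summary):
    # Called with each result just before it is yielded.
    saved[title] = summary

for title, summary in condensr.summarize("book.pdf", on_chapter=on_chapter):
    print(f"## {title}\n{summary}\n")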
```
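
One possible use of the callback is to write each chapter to disk as soon as it is ready, so that finished summaries are kept even if a later chapter fails. The following is a minimal sketch under that assumption. `make_appender` is a hypothetical helper, not part of Condensr, and the callback is exercised here with made-up data rather than a real PDF.

```python
import tempfile
from collections.abc import Callable
from pathlib import Path


def make_appender(path: Path) -> Callable[[str, str], None]:
    """Return a callback(title, summary) that appends each chapter to `path`."""

    def on_chapter(title: str, summary: str) -> None:
        # Open in append mode so earlier chapters are never overwritten.
        with path.open("a", encoding="utf-8") as fh:
            fh.write(f"## {title}\n\n{summary}\n\n")

    return on_chapter


# Call the callback directly with sample data (no PDF or API key needed).
out_path = Path(tempfile.mkdtemp()) / "partial-summary.md"
on_chapter = make_appender(out_path)
on_chapter("Introduction", "Sets out the book's argument.")
on_chapter("Chapter 1: Origins", "Traces where the idea came from.")
saved_text = out_path.read_text(encoding="utf-8")
print(saved_text)
```

In real use this would be passed as `on_chapter=make_appender(Path("book-partial.md"))`. Because `summarize` is documented as returning a generator, the callback presumably only runs while the generator is being iterated.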

## CLI Reference

```
Usage: condensr [OPTIONS] PDF_PATH

  Summarize a PDF book chapter by chapter.

Options:
  -o, --output PATH  Output file path. Default: <book>-summary.md
  -m, --model TEXT   Mistral model name.
  --help             Show this message and exit.
```
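
The default output name `<book>-summary.md` appears to be derived from the PDF's file name. A rough sketch of that naming rule follows. `default_output_name` is a hypothetical helper, not Condensr's actual code, and it returns only a file name because the directory the CLI writes to is not stated here.

```python
from pathlib import Path


def default_output_name(pdf_path: str) -> str:
    """Build '<book>-summary.md' from the PDF's file name (illustrative only)."""
    # Path.stem drops the directory and the final extension.
    return f"{Path(pdf_path).stem}-summary.md"


print(default_output_name("book.pdf"))             # book-summary.md
print(default_output_name("library/my.book.pdf"))  # my.book-summary.md
```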

## Configuration

Set your Mistral API key as an environment variable:

```bash
export MISTRAL_API_KEY="your-api-key"
```

Or pass it programmatically:

```python
for title, summary in condensr.summarize("book.pdf", api_key="your-api-key"):
    ...
```

## Privacy Notice

Condensr sends the text content of your PDF to Mistral AI's API servers for summarization. Do not use Condensr with confidential or sensitive documents unless you are comfortable with this data being transmitted to a third-party service. Review [Mistral's privacy policy](https://mistral.ai/terms/) for details on how your data is handled.

## License

AGPL-3.0-or-later

## Release Process

This project uses [Commitizen](https://commitizen-tools.github.io/commitizen/) for automated versioning and [twine](https://twine.readthedocs.io/) for publishing to PyPI from a local machine.

### How to release a new version

1.  Ensure all your changes are committed on the `main` branch.
2.  Run the bump command to increment the version, update `CHANGELOG.md`, and create a new Git tag:
    ```bash
    cz bump
    ```
3.  Build and upload the package (this needs the `build` and `twine` packages, which are not part of the `dev` extra):
    ```bash
    rm -rf dist/ build/
    python -m build
    twine upload dist/*
    ```
4.  Push the changes and new tag to GitLab:
    ```bash
    git push origin main --tags
    ```

For detailed instructions, see [docs/RELEASE_PROCESS.md](docs/RELEASE_PROCESS.md).
