Metadata-Version: 2.4
Name: notion-bulk-export
Version: 0.1.0
Summary: Fast bulk export for Notion databases with full page content using asyncio parallel requests
Author: yunjeongiya
License-Expression: MIT
Project-URL: Homepage, https://github.com/yunjeongiya/notion-bulk-export
Project-URL: Repository, https://github.com/yunjeongiya/notion-bulk-export
Project-URL: Issues, https://github.com/yunjeongiya/notion-bulk-export/issues
Keywords: notion,export,asyncio,bulk,database
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiohttp>=3.9.0
Dynamic: license-file

# notion-bulk-export

Fast bulk export for Notion databases with full page content.

Exports all pages from a Notion database to JSON/CSV, including the **full page body** (not just properties). It issues requests in parallel with asyncio so that databases with thousands of pages export efficiently.

## Why?

Notion's official export often fails on large databases, and the API is rate-limited (Notion documents an average of three requests per second per integration). This tool handles both problems:

- **Parallel requests** (configurable concurrency) for speed
- **Exponential backoff** for automatic rate limit recovery
- **Checkpoint & resume** so you never lose progress on large exports
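
The first two points can be illustrated with a minimal sketch. It is not this tool's actual code: `RateLimited`, `fetch_with_backoff`, `fetch_all`, and the retry and delay values are hypothetical names and numbers chosen for illustration, and `fetch` stands in for whatever coroutine performs the real HTTP request.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 (rate limited) response."""


async def fetch_with_backoff(fetch, page_id, *, retries=5, base_delay=0.5):
    """Call fetch(page_id), doubling the wait after each rate-limit error."""
    for attempt in range(retries):
        try:
            return await fetch(page_id)
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries: let the caller see the error
            # base_delay, 2x, 4x, ... plus a little jitter so that
            # concurrent workers do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            await asyncio.sleep(delay + random.uniform(0, delay / 10))


async def fetch_all(fetch, page_ids, *, concurrency=5, base_delay=0.5):
    """Fetch every page with at most `concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(page_id):
        async with semaphore:
            return await fetch_with_backoff(fetch, page_id, base_delay=base_delay)

    # gather() returns results in the same order as page_ids.
    return await asyncio.gather(*(worker(pid) for pid in page_ids))


if __name__ == "__main__":
    attempts = {}

    async def fake_fetch(page_id):
        # Fails once per page to exercise the retry path.
        attempts[page_id] = attempts.get(page_id, 0) + 1
        if attempts[page_id] == 1:
            raise RateLimited()
        return {"id": page_id}

    print(asyncio.run(fetch_all(fake_fetch, ["a", "b", "c"], base_delay=0.01)))
```

The semaphore caps the number of simultaneous requests, which is the role the `--concurrency` option plays in the CLI; lowering it trades speed for fewer rate-limit retries.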

## Quick Start

```bash
pip install notion-bulk-export
```

```bash
notion-bulk-export --database-id YOUR_DATABASE_ID
```

You'll need a [Notion integration token](https://www.notion.so/my-integrations). Either pass it with `--token` or set the `NOTION_TOKEN` environment variable.

## Usage

```bash
# Basic export (JSON with page content)
notion-bulk-export --database-id abc123 --token ntn_xxx

# Use environment variable for token
export NOTION_TOKEN=ntn_xxx
notion-bulk-export --database-id abc123

# Export to specific directory
notion-bulk-export --database-id abc123 --output ./my-export

# Export as both JSON and CSV
notion-bulk-export --database-id abc123 --format json csv

# Metadata only (skip page content for faster export)
notion-bulk-export --database-id abc123 --no-content

# Adjust concurrency (default: 5)
notion-bulk-export --database-id abc123 --concurrency 3
```

## Options

| Option | Default | Description |
|--------|---------|-------------|
| `--database-id` | *required* | Notion database ID (from the URL) |
| `--token` | `$NOTION_TOKEN` | Notion integration token |
| `--output` | `./output` | Output directory |
| `--format` | `json` | Output format(s): `json`, `csv`, or both |
| `--concurrency` | `5` | Number of parallel API requests |
| `--no-content` | `false` | Skip page content (export properties only) |
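
For readers who want to wrap or mimic the CLI, the table above maps naturally onto an `argparse` definition. The following is a sketch under that assumption only; `build_parser` is a hypothetical name and the real entry point may be structured differently.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the options table above."""
    parser = argparse.ArgumentParser(prog="notion-bulk-export")
    parser.add_argument("--database-id", required=True,
                        help="Notion database ID (from the URL)")
    # Falls back to the NOTION_TOKEN environment variable when omitted.
    parser.add_argument("--token", default=os.environ.get("NOTION_TOKEN"),
                        help="Notion integration token")
    parser.add_argument("--output", default="./output",
                        help="Output directory")
    # nargs="+" allows `--format json csv` to request both formats.
    parser.add_argument("--format", nargs="+", choices=["json", "csv"],
                        default=["json"], help="Output format(s)")
    parser.add_argument("--concurrency", type=int, default=5,
                        help="Number of parallel API requests")
    parser.add_argument("--no-content", action="store_true",
                        help="Skip page content (export properties only)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--database-id", "abc123", "--format", "json", "csv"]
    )
    print(args.database_id, args.format, args.concurrency, args.no_content)
```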

## Output

### JSON

Each page is exported with all properties and full content:

```json
{
  "id": "page-uuid",
  "title": "Page Title",
  "properties": {
    "Status": "In Progress",
    "Tags": ["tag1", "tag2"],
    "Created": "2025-01-15"
  },
  "content": "# Heading\n\nPage body text...",
  "block_count": 12,
  "created_time": "2025-01-15T10:00:00.000Z",
  "last_edited_time": "2025-02-01T15:30:00.000Z"
}
```
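
Records in this shape are easy to post-process. The sketch below assumes you have already loaded the export into a list of such dictionaries (the output file name and whether it holds one array or one record per line are not specified here, so loading is left out); `count_by_property` and `edited_since` are hypothetical helper names.

```python
from collections import Counter
from datetime import datetime


def parse_time(stamp):
    """Parse timestamps like 2025-02-01T15:30:00.000Z into aware datetimes."""
    # Python 3.10's fromisoformat() does not accept a trailing "Z".
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def count_by_property(records, name):
    """Count how often each value of the property `name` occurs."""
    counts = Counter()
    for record in records:
        value = record["properties"].get(name)
        # Multi-value properties (lists) contribute one count per item.
        counts.update(value if isinstance(value, list) else [value])
    return counts


def edited_since(records, cutoff):
    """Return the records whose last_edited_time is at or after `cutoff`."""
    limit = parse_time(cutoff)
    return [r for r in records if parse_time(r["last_edited_time"]) >= limit]


if __name__ == "__main__":
    records = [{
        "id": "page-uuid",
        "title": "Page Title",
        "properties": {"Status": "In Progress", "Tags": ["tag1", "tag2"]},
        "last_edited_time": "2025-02-01T15:30:00.000Z",
    }]
    print(count_by_property(records, "Tags"))
    print(len(edited_since(records, "2025-01-01T00:00:00.000Z")))
```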

### CSV

All properties are auto-detected and become columns. Page content is included as a column with newlines escaped.
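
One way to picture this flattening is the following sketch, which takes records in the JSON shape shown earlier. It is an illustration, not the tool's implementation: `records_to_csv`, the column order, and the way lists are joined are all assumptions.

```python
import csv
import io


def records_to_csv(records, out):
    """Write records to `out`, one row per page, one column per property."""
    # Auto-detect columns: union of property names in first-seen order.
    columns = []
    for record in records:
        for name in record["properties"]:
            if name not in columns:
                columns.append(name)

    writer = csv.DictWriter(out, fieldnames=["id", "title", *columns, "content"])
    writer.writeheader()
    for record in records:
        row = {"id": record["id"], "title": record["title"]}
        for name in columns:
            value = record["properties"].get(name, "")
            # Lists (e.g. multi-select tags) are joined into a single cell.
            row[name] = ", ".join(map(str, value)) if isinstance(value, list) else value
        # Replace real newlines with the two characters "\n" so that
        # each page stays on a single CSV row.
        row["content"] = record.get("content", "").replace("\n", "\\n")
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    records_to_csv([{
        "id": "page-uuid",
        "title": "Page Title",
        "properties": {"Status": "In Progress", "Tags": ["tag1", "tag2"]},
        "content": "# Heading\n\nPage body text...",
    }], buffer)
    print(buffer.getvalue())
```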

## Checkpoint & Resume

For large databases, the tool saves progress every 50 pages. If the export is interrupted (Ctrl+C, network error, etc.), run the same command again: it skips already-exported pages and continues from where it left off.

```
$ notion-bulk-export --database-id abc123
[INFO] Fetching page list...
[OK] Total 3094 pages found
[INFO] Resuming from 1250 already processed   # <-- picks up where you left off
[INFO] 1844 pages remaining
```
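
The general pattern behind this behaviour can be sketched as follows. The checkpoint file format, the function names, and the atomic-rename step are assumptions made for illustration; only the 50-page interval comes from the description above.

```python
import json
import os
from pathlib import Path

CHECKPOINT_EVERY = 50  # interval described above


def load_checkpoint(path):
    """Return the set of page IDs already exported (empty on first run)."""
    path = Path(path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_checkpoint(path, done_ids):
    """Write the checkpoint via a temp file so a crash cannot corrupt it."""
    path = Path(path)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(sorted(done_ids)), encoding="utf-8")
    os.replace(tmp, path)  # atomic rename on the same filesystem


def export(page_ids, process, checkpoint_path):
    """Process pages not yet in the checkpoint; return how many were pending."""
    done = load_checkpoint(checkpoint_path)
    remaining = [pid for pid in page_ids if pid not in done]
    for count, pid in enumerate(remaining, start=1):
        process(pid)
        done.add(pid)
        if count % CHECKPOINT_EVERY == 0:
            save_checkpoint(checkpoint_path, done)
    save_checkpoint(checkpoint_path, done)  # final save after the last page
    return len(remaining)
```

With this layout, an interruption loses at most the pages processed since the last save (fewer than 50), and rerunning `export` with the same checkpoint path skips everything already recorded.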

## Performance

Tested with a 3,094-page Notion database:

| Method | Speed | Time for 3,094 pages |
|--------|-------|---------------------|
| Sequential API calls | ~10 pages/min | ~5 hours |
| **notion-bulk-export** (concurrency=5) | ~80 pages/min | **~40 minutes** |
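
The times in the table follow directly from the measured rates; this short calculation reproduces them (the helper name `minutes_needed` is only for illustration).

```python
PAGES = 3094


def minutes_needed(pages, pages_per_minute):
    """Total export time in minutes at a constant rate."""
    return pages / pages_per_minute


sequential = minutes_needed(PAGES, 10)  # about 309 minutes
parallel = minutes_needed(PAGES, 80)    # about 39 minutes

print(f"sequential: {sequential / 60:.1f} h")           # sequential: 5.2 h
print(f"parallel:   {parallel:.0f} min")                # parallel:   39 min
print(f"speedup:    {sequential / parallel:.0f}x")      # speedup:    8x
```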

## How to Get Your Database ID

1. Open your Notion database in a browser
2. The URL looks like: `https://www.notion.so/workspace/DATABASE_ID?v=...`
3. Copy the 32-character ID (with or without dashes)
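
If you collect database URLs programmatically, the ID can be pulled out with a small helper. This is a sketch that assumes the ID is the last 32-hex-digit (or dashed UUID) token in the URL path; `extract_database_id` is a hypothetical name and not part of this package.

```python
import re
import uuid

# Matches a bare 32-hex-digit ID or the dashed 8-4-4-4-12 UUID form.
_ID_PATTERN = re.compile(
    r"[0-9a-fA-F]{32}"
    r"|[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def extract_database_id(url):
    """Return the database ID from a Notion URL in dashed UUID form."""
    # Drop the query string first: the ?v=... view ID is also 32 hex digits.
    path = url.split("?", 1)[0]
    matches = _ID_PATTERN.findall(path)
    if not matches:
        raise ValueError(f"no database ID found in {url!r}")
    # uuid.UUID accepts both forms and normalises to lowercase with dashes.
    return str(uuid.UUID(matches[-1]))


if __name__ == "__main__":
    print(extract_database_id(
        "https://www.notion.so/workspace/0123456789abcdef0123456789abcdef?v=fedcba9876543210fedcba9876543210"
    ))
```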

## How to Get a Notion Token

1. Go to [notion.so/my-integrations](https://www.notion.so/my-integrations)
2. Create a new integration
3. Copy the token (starts with `ntn_`)
4. **Important**: Share your database with the integration (click "..." menu on the database > "Connections" > add your integration)

## Supported Property Types

The following Notion property types are supported:

title, rich_text, number, checkbox, select, multi_select, status, date, people, files, url, email, phone_number, formula, relation, rollup, created_time, last_edited_time, created_by, last_edited_by
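
To show what "supporting" a property type involves, here is a sketch that reduces a Notion API property value object to a plain Python value, as in the JSON output above. It covers only a subset of the types listed, `plain_value` is a hypothetical name, and the choices made (for example keeping only a date's `start`) are assumptions rather than this tool's documented behaviour.

```python
def plain_value(prop):
    """Reduce a Notion property value object to a plain Python value."""
    kind = prop["type"]
    value = prop.get(kind)  # the payload lives under a key named after the type
    if value is None:
        return None
    if kind in ("title", "rich_text"):
        # Rich text is a list of fragments; join their plain text.
        return "".join(part["plain_text"] for part in value)
    if kind in ("select", "status"):
        return value["name"]
    if kind == "multi_select":
        return [option["name"] for option in value]
    if kind == "date":
        return value["start"]  # assumption: the end of a range is dropped
    if kind == "relation":
        return [ref["id"] for ref in value]
    if kind == "formula":
        # A formula wraps a typed result, e.g. {"type": "number", "number": 42}.
        return value.get(value["type"])
    # number, checkbox, url, email, phone_number, created_time and
    # last_edited_time are already plain values; anything not handled
    # above is returned unchanged.
    return value


if __name__ == "__main__":
    print(plain_value({"type": "select", "select": {"name": "In Progress"}}))
    print(plain_value({"type": "multi_select",
                       "multi_select": [{"name": "tag1"}, {"name": "tag2"}]}))
```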

## Comparison with Other Tools

| Tool | Output Format | Async | Checkpoint & Resume | All Property Types |
|------|--------------|-------|--------------------|--------------------|
| [notion-exporter](https://pypi.org/project/notion-exporter/) | Markdown | Yes | No | No |
| [python-notion-exporter](https://github.com/Strvm/python-notion-exporter) | Markdown/HTML/PDF | No | No | No |
| [notion4ever](https://github.com/MerkulovDaniil/notion4ever) | Markdown + HTML | No | No | No |
| [notion-exporter (TS)](https://github.com/yannbolliger/notion-exporter) | Markdown + CSV | No | No | No |
| **notion-bulk-export** | **JSON + CSV** | **Yes** | **Yes** | **Yes** |

**Key differences:**

- **Structured data output**: Other tools focus on rendering pages to Markdown. This tool outputs structured JSON/CSV, which is better suited for data analysis, AI/ML pipelines, and migrations.
- **Checkpoint & resume**: None of the tools compared above saves progress during export. If your export of 3,000 pages gets interrupted at page 2,500, this tool picks up where it left off instead of starting over.
- **Database-focused**: Most tools are page-tree exporters. This tool is designed specifically for exporting Notion database records with all their properties.

## Requirements

- Python 3.10+
- A Notion integration token with access to the target database

## License

MIT
