Metadata-Version: 2.4
Name: xark
Version: 0.2.0
Summary: CLI tool to scrape bookmarked X Articles and export to offline Markdown
Project-URL: Homepage, https://github.com/A1M/xark
Project-URL: Repository, https://github.com/A1M/xark
Project-URL: Issues, https://github.com/A1M/xark/issues
Author: A1M
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Utilities
Requires-Python: >=3.11
Requires-Dist: beautifulsoup4
Requires-Dist: lxml
Requires-Dist: markdownify
Requires-Dist: playwright
Requires-Dist: python-slugify
Requires-Dist: pyyaml
Requires-Dist: typer
Description-Content-Type: text/markdown

# xark

CLI tool that scrapes your bookmarked X/Twitter Articles and exports them to offline Markdown files with images. Useful for archiving long-form content before bookmarks disappear.

## Install

```bash
pipx install xark
xark setup
```

`xark setup` installs the Chromium browser that Playwright needs for scraping (about 300 MB, one-time download).

## Quick start

```bash
xark export --headful
```

This opens a visible browser window, reuses your existing logged-in X session (via its cookies), scrolls through your bookmarks, and exports any Articles it finds as Markdown files.

## Commands

### `xark export`

Scrape bookmarked Articles and export to Markdown.

```bash
xark export --headful              # visible browser (recommended for first run)
xark export --limit 10             # scan only 10 bookmarks
xark export --output ./my-archive  # custom output directory
xark export --dry-run              # classify bookmarks without writing files
xark export --no-save-images       # skip image downloads
```

### `xark organize`

Reorganize previously exported articles into topic subfolders.

```bash
xark organize                      # auto-categorize articles
xark organize --dry-run            # preview without moving files
```

### `xark setup`

Install the required Chromium browser for scraping.

```bash
xark setup
```

## Prerequisites

- Python 3.11+
- An X/Twitter account with bookmarked Articles
- A logged-in X session in your browser (the tool uses your existing session)

## How it works

1. **Collect** -- scrolls your X bookmarks feed and captures bookmark cards
2. **Classify** -- identifies which bookmarks are Articles vs. regular posts
3. **Extract** -- navigates to each Article and extracts content, images, and metadata
4. **Export** -- writes Markdown files with YAML frontmatter and downloads images locally
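
To make the Export step concrete, here is a rough sketch of a Markdown file with YAML frontmatter. It is illustrative only: the function name `render_article`, the frontmatter keys (`title`, `source_url`, `published`), and the example URL are assumptions, and xark's actual output may use different fields. The sketch uses only the standard library; xark itself depends on `pyyaml`, which would be the more robust choice for serialization.

```python
import json
from datetime import date


def render_article(title: str, url: str, published: date, body_md: str) -> str:
    """Return a Markdown document with a YAML frontmatter header."""
    # Field names are illustrative; xark's real frontmatter keys may differ.
    meta = {
        "title": title,
        "source_url": url,
        "published": published.isoformat(),
    }
    # json.dumps yields double-quoted strings, which are also valid YAML scalars.
    header = "\n".join(f"{key}: {json.dumps(value)}" for key, value in meta.items())
    return f"---\n{header}\n---\n\n{body_md.strip()}\n"


doc = render_article(
    title="Some Article: Title",
    url="https://x.com/example/article/1",  # placeholder URL
    published=date(2026, 3, 14),
    body_md="# Some Article: Title\n\nBody text.\n",
)
print(doc)
```

Quoting every value keeps titles that contain colons or other YAML-significant characters from breaking the header.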

Output structure:

```
x-articles-archive/
  articles/
    tech/
      2026-03-14-some-article-title.md
    business/
      2026-03-13-another-article.md
  assets/
    2026-03-14-some-article-title/
      image-01.jpg
  raw/
    2026-03-14-some-article-title.html
```
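
The layout above implies that one date-prefixed slug ties together an article's Markdown file, its assets folder, and its raw HTML. The sketch below derives the three paths from that convention. `archive_paths` and the simplified `slugify` are hypothetical stand-ins (xark lists `python-slugify` as a dependency, which handles many more cases), and which date the prefix refers to is an assumption here.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    # Simplified stand-in for python-slugify: lowercase ASCII, hyphen-separated.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def archive_paths(
    root: str, topic: str, day: date, title: str
) -> dict[str, PurePosixPath]:
    """Map one article to its Markdown, assets, and raw HTML locations."""
    stem = f"{day.isoformat()}-{slugify(title)}"
    base = PurePosixPath(root)
    return {
        "article": base / "articles" / topic / f"{stem}.md",
        "assets": base / "assets" / stem,
        "raw": base / "raw" / f"{stem}.html",
    }


paths = archive_paths(
    "x-articles-archive", "tech", date(2026, 3, 14), "Some Article Title"
)
for kind, path in paths.items():
    print(f"{kind}: {path}")
```

Because only the Markdown file lives under a topic folder in this layout, moving an article between topics would not need to touch its assets or raw HTML.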

## License

MIT
