Metadata-Version: 2.4
Name: discord_fetch
Version: 0.0.11
Summary: Fetch message history from discord for LLMs
Home-page: https://github.com/hamelsmu/discord_fetch
Author: Hamel Husain
Author-email: hamel.husain@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastcore
Requires-Dist: dotenv
Requires-Dist: discord-py
Requires-Dist: typer
Provides-Extra: dev
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Discord Fetch


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Why Discord Fetch?

Discord conversations contain valuable knowledge, but Discord’s UI makes
it hard to:

- Export community discussions for analysis
- Archive important conversations before they’re lost
- Process messages with LLMs or data tools
- Search across multiple channels efficiently

Discord Fetch solves this by providing a simple interactive CLI that
exports Discord messages to clean JSON files, removing all the
complexity of Discord’s API.

## Installation

``` bash
pip install discord-fetch
```

Or install from source:

``` bash
git clone https://github.com/hamelsmu/discord_fetch.git
cd discord_fetch
pip install -e .
```

> [!NOTE]
>
> ### Discord Bot Setup
>
> Before using this tool, you need to create a Discord bot and obtain a
> token:
>
> ### 1. Create a Discord Application
>
> 1.  Go to the [Discord Developer
>     Portal](https://discord.com/developers/applications)
> 2.  Click “New Application” and give it a name
> 3.  Navigate to the “Bot” section in the sidebar
> 4.  Click “Add Bot”
> 5.  Under the “Token” section, click “Copy” to get your bot token
>
> ### 2. Required Bot Permissions
>
> Your bot needs the following permissions:
>
> - **View Channels** - To see the channels in the server
> - **Read Message History** - To fetch historical messages
> - **Read Messages/View Channels** - Basic read access
>
> ### 3. Bot Scopes and OAuth2
>
> When inviting your bot to a server, use these scopes:
>
> - `bot` - Basic bot permissions
> - `applications.commands` (optional) - If you plan to add slash
>   commands
>
> ### 4. Invite the Bot to Your Server
>
> 1.  In the Discord Developer Portal, go to “OAuth2” \> “URL Generator”
> 2.  Select the `bot` scope
> 3.  Select the required permissions listed above
> 4.  Copy the generated URL and open it in your browser
> 5.  Select the server you want to add the bot to
>
> ### 5. Environment Setup
>
> Create this environment variable:
>
> ``` env
> DISCORD_TOKEN=your_bot_token_here
> ```
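
Before launching the CLI, it can help to confirm that the variable is actually visible to Python. The following is a minimal sketch using only the standard library; `require_token` is a hypothetical helper written for illustration and is not part of discord_fetch.

```python
import os


def require_token(env=None, name="DISCORD_TOKEN"):
    """Return the bot token from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of discord_fetch.
    """
    env = os.environ if env is None else env
    token = (env.get(name) or "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; export it before running discord-fetch")
    return token


if __name__ == "__main__":
    try:
        require_token()
        print("DISCORD_TOKEN is set")
    except RuntimeError as err:
        print(err)
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.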

## Getting Started in 2 Minutes

Once you’ve set up your Discord bot (see the setup section above), using
Discord Fetch is simple:

``` bash
# Install the tool
pip install discord-fetch

# Set your bot token
export DISCORD_TOKEN=your_bot_token_here

# Run the interactive CLI
discord-fetch
```

The CLI will guide you through:

1.  **Selecting a Discord server** from your bot’s servers
2.  **Choosing channels** to export (one, multiple, or all)  
3.  **Picking output format** (single combined file or separate files)
4.  **Watching progress** with live progress bars

That’s it! Your Discord messages are now in clean JSON files ready for
processing.


## What You Can Do With Discord Fetch

### 🎯 For Quick Channel Exports

Use the interactive CLI when you want to:

- Export specific channels from a Discord server
- Archive channels before they’re deleted
- Get data for one-time analysis

**Example**: Export your project’s \#general and \#dev channels

``` bash
discord-fetch
# Select your server, choose specific channels, export to JSON
```

### 📊 For Bulk Server Exports

Perfect when you need to:

- Archive an entire Discord server
- Migrate community knowledge to another platform
- Create regular backups of all channels

**Example**: Export all 84 channels from a large community

``` bash
discord-fetch
# Select server, choose "ALL CHANNELS", save to directory
```

### 🔧 For Custom Integrations

Use the Python API when you need:

- Automated exports on a schedule
- Custom filtering or processing
- Integration with other Python tools

**Example**: Auto-export new messages daily

``` python
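from discord_fetch.core import fetch_messages_since_date
from datetime import datetime, timedelta

channel_id = 1369370266899185746  # example ID; replace with your channel's

# Fetch only messages from the last 24 hours.
# Top-level `await` works in Jupyter/IPython; in a script, run this
# inside an `async def` and start it with `asyncio.run(...)`.
yesterday = datetime.now() - timedelta(days=1)
messages = await fetch_messages_since_date(channel_id, yesterday)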
```
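
To turn a daily fetch into an archive, the result can be written to a dated file. This is a sketch under assumptions: `save_daily_export` is a hypothetical helper, not part of discord_fetch, and it assumes the fetched messages are JSON-serializable. The sample list stands in for a real fetch result.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def save_daily_export(messages, out_dir, day=None):
    """Write messages to <out_dir>/<YYYY-MM-DD>.json and return the path.

    Hypothetical helper; assumes `messages` is JSON-serializable.
    """
    day = day or date.today()
    out_path = Path(out_dir) / f"{day.isoformat()}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(
        json.dumps(messages, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return out_path


# Stand-in data; in practice this would be the result of the fetch above.
sample = [{"author": "alice", "content": "Has anyone tried the new API?"}]
export_dir = Path(tempfile.mkdtemp()) / "exports"
saved = save_daily_export(sample, export_dir, day=date(2024, 1, 15))
print(saved.name)  # 2024-01-15.json
```

One file per day keeps each run idempotent: re-running the export for the same date overwrites that day's file instead of appending duplicates.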

## See It In Action

Here’s what the interactive CLI looks like:

    $ discord-fetch

    Discord Channel Fetcher

    Available Guilds:
    1. AI Evals For Engineers & Technical PMs (84 channels)
    2. LangChain Community (126 channels)
    3. OpenAI DevDay (45 channels)

    Select guild number [1]: 1

    Channels in AI Evals For Engineers & Technical PMs:
    ┏━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
    ┃ #  ┃ Channel                        ┃ Category         ┃ ID                ┃
    ┡━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
    │ 0  │ ALL CHANNELS                   │ —                │ —                 │
    │ 1  │ general                        │ Text Channels    │ 136866639069682...│
    │ 2  │ announcements                  │ Text Channels    │ 137615200096996...│
    │ 3  │ introductions                  │ Text Channels    │ 137655886438386...│
    │ 4  │ course-discussions             │ Course           │ 138234567891234...│
    └────┴────────────────────────────────┴──────────────────┴───────────────────┘

    Select channel number (0 for all) [0]: 0

    Selected: All 84 channels

    Output Options:
    1. Concatenate to single file
    2. Separate files in directory

    Select output option [1]: 2

    Enter output directory [./discord_output]: ./ai-evals-export

    #general                      ⠸ ████████████████████████████████████████ 100/100 ✓ Complete
    #announcements                ⠼ ████████████████████████████████████████ 100/100 ✓ Complete  
    #introductions                ⠴ ████████████████████████████████████████ 100/100 ✓ Complete
    #course-discussions           ⠦ ██████████████████░░░░░░░░░░░░░░░░░░░░░  47/100 Fetching...
    Total                         ⠧ ██████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   4/84  4/84 completed

    ✓ Saved 84 files to ./ai-evals-export

    Successfully fetched 84 out of 84 channels!

## Which Tool Should I Use?

### Use `discord-fetch` (Interactive CLI) when:

✅ You want a guided experience with visual feedback  
✅ You’re exporting channels for analysis or archival  
✅ You need to browse and select from available channels  
✅ You’re not familiar with Discord channel IDs

### Use [`fetch_discord_msgs`](https://hamelsmu.github.io/discord_fetch/core.html#fetch_discord_msgs) (Direct CLI) when:

✅ You know the exact channel ID to export  
✅ You want to pipe output directly to another tool  
✅ You’re scripting or automating exports  
✅ You need stdout output for processing

### Use the Python API when:

✅ You need custom filtering or processing logic  
✅ You’re building a larger application  
✅ You want to fetch only new messages since a date  
✅ You need programmatic access to the data

**Quick decision**: If you’re unsure, start with `discord-fetch` - it’s
the easiest way to get started!

## Advanced Python API

For developers who need programmatic access:

### Basic Channel Export

``` python
from discord_fetch.core import fetch_discord_msgs
import asyncio  # only needed when running from a script

# Export a single channel (in a script, run this inside an
# `async def` started with `asyncio.run(...)`)
channel_id = 1369370266899185746
original, simplified = await fetch_discord_msgs(
    channel_id,
    save_original=False,    # Don't save to file
    save_simplified=False,  # Don't save to file
    print_summary=False     # Silent operation
)
```

### List All Available Channels

``` python
from discord_fetch.core import list_all_channels

# Get all channels your bot can see
channels = await list_all_channels()
for ch in channels:
    print(f"{ch['guild_name']} - #{ch['channel_name']} ({ch['channel_id']})")
```
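
Once the channel list is in hand, it is often easier to work with it grouped by server. The sketch below assumes only the dictionary keys used in the loop above (`guild_name`, `channel_name`, `channel_id`); `group_by_guild` is a hypothetical helper and the sample data is made up.

```python
from collections import defaultdict


def group_by_guild(channels):
    """Map each guild name to a sorted list of its channel names."""
    grouped = defaultdict(list)
    for ch in channels:
        grouped[ch["guild_name"]].append(ch["channel_name"])
    return {guild: sorted(names) for guild, names in grouped.items()}


# Stand-in data using the keys from the loop above; IDs are made up.
sample_channels = [
    {"guild_name": "AI Evals", "channel_name": "general", "channel_id": 1},
    {"guild_name": "AI Evals", "channel_name": "announcements", "channel_id": 2},
    {"guild_name": "LangChain Community", "channel_name": "help", "channel_id": 3},
]

for guild, names in group_by_guild(sample_channels).items():
    print(f"{guild}: {', '.join(names)}")
```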

### Fetch Only New Messages

``` python
from discord_fetch.core import fetch_messages_since_date
from datetime import datetime, timedelta

# Get messages from last 7 days
last_week = datetime.now() - timedelta(days=7)
recent = await fetch_messages_since_date(channel_id, last_week)
```

### Find Active Channels

``` python
from discord_fetch.core import list_channels_with_new_messages

# Find channels with activity since a date
active = await list_channels_with_new_messages("01-01-2024")
for guild, data in active.items():
    for ch in data['channels_with_new_messages']:
        print(f"{ch['channel_name']}: {ch['new_message_count']} new messages")
```

## Understanding the Output Format

Discord Fetch simplifies complex Discord data into clean JSON:

### Original Format (with metadata)

``` json
{
  "channel_info": {...},
  "messages": [
    {
      "id": "123456789",
      "author": {...},
      "content": "Message text",
      "timestamp": "2024-01-15T10:30:00",
      "attachments": [...],
      "reactions": [...],
      "reply_to": {...}
    }
  ],
  "threads": {...}
}
```
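
Because the original format keeps timestamps, an export can be filtered by date after the fact. This is a minimal sketch, assuming timestamps are ISO 8601 strings without a timezone offset, as in the sample above; if real exports carry offsets, the cutoff would need to be timezone-aware too. `messages_since` is a hypothetical helper.

```python
from datetime import datetime


def messages_since(export, cutoff):
    """Return messages whose ISO timestamp is at or after `cutoff`."""
    return [
        msg for msg in export.get("messages", [])
        if datetime.fromisoformat(msg["timestamp"]) >= cutoff
    ]


# Stand-in export containing only the fields this sketch reads.
sample_export = {
    "messages": [
        {"id": "1", "content": "Old message", "timestamp": "2024-01-10T09:00:00"},
        {"id": "2", "content": "Message text", "timestamp": "2024-01-15T10:30:00"},
    ]
}

recent = messages_since(sample_export, datetime(2024, 1, 15))
print([msg["id"] for msg in recent])  # ['2']
```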

### Simplified Format (for processing)

``` json
{
  "channel": "general",
  "conversations": [
    {
      "main_message": {
        "author": "alice",
        "content": "Has anyone tried the new API?"
      },
      "replies": [
        {
          "author": "bob",
          "content": "Yes! It's much faster now"
        },
        {
          "author": "charlie",
          "content": "Agreed, the response time improved by 50%"
        }
      ]
    }
  ]
}
```

The simplified format:

- Groups related messages into conversations
- Removes IDs, timestamps, and metadata
- Reduces file size by ~75%
- Is well suited to LLMs and analysis tools
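
As an illustration of how the simplified structure can feed an LLM prompt, the sketch below flattens it into a plain-text transcript. It relies only on the keys shown in the simplified example (`channel`, `conversations`, `main_message`, `replies`, `author`, `content`); the function names are hypothetical and not part of discord_fetch.

```python
def conversation_to_text(conv):
    """Render one conversation as 'author: content' lines, replies indented."""
    main = conv["main_message"]
    lines = [f"{main['author']}: {main['content']}"]
    for reply in conv.get("replies", []):
        lines.append(f"  {reply['author']}: {reply['content']}")
    return "\n".join(lines)


def export_to_text(simplified):
    """Render a simplified export as a plain-text transcript."""
    header = f"#{simplified['channel']}"
    body = "\n\n".join(
        conversation_to_text(conv) for conv in simplified["conversations"]
    )
    return f"{header}\n\n{body}"


sample = {
    "channel": "general",
    "conversations": [
        {
            "main_message": {"author": "alice", "content": "Has anyone tried the new API?"},
            "replies": [
                {"author": "bob", "content": "Yes! It's much faster now"},
            ],
        }
    ],
}

print(export_to_text(sample))
```

Indenting replies under the main message preserves the conversation grouping in a form that fits directly into a prompt.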
