Metadata-Version: 2.4
Name: meeto
Version: 0.3.2
Summary: Google Meet bot for meeting transcription, recording, and speech-to-text. Join meetings automatically, capture audio, and transcribe in real time with speaker attribution.
Author-email: Shivansh Vishwakarma <shivansh.vishwakarma@researchify.io>
License: MIT
Project-URL: Homepage, https://github.com/ResearchifyLabs/meeto
Project-URL: Repository, https://github.com/ResearchifyLabs/meeto
Project-URL: Documentation, https://github.com/ResearchifyLabs/meeto#readme
Project-URL: Changelog, https://github.com/ResearchifyLabs/meeto/blob/main/CHANGELOG.md
Project-URL: Issues, https://github.com/ResearchifyLabs/meeto/issues
Keywords: google-meet,meeting-bot,transcription,speech-to-text,meeting-transcription,meeting-recorder,stt,diarization,speaker-attribution,playwright,automation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Communications :: Conferencing
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Multimedia :: Video :: Capture
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: playwright>=1.40
Requires-Dist: websockets>=12.0
Provides-Extra: dev
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: build; extra == "dev"
Dynamic: license-file

# meeto

[![CI](https://github.com/ResearchifyLabs/meeto/actions/workflows/ci.yml/badge.svg)](https://github.com/ResearchifyLabs/meeto/actions/workflows/ci.yml)
[![PyPI](https://img.shields.io/pypi/v/meeto)](https://pypi.org/project/meeto/)
[![Python](https://img.shields.io/pypi/pyversions/meeto)](https://pypi.org/project/meeto/)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

Open-source Google Meet bot. Join meetings, capture audio, and transcribe in real time using Playwright and pluggable STT providers.

## Installation

```bash
pip install meeto
playwright install chromium
```

## Quick Start

The simplest way to use meeto is guest mode — no Google account needed. The bot joins the meeting as a guest and waits for the host to admit it.

```python
import asyncio
from meeto import run_meeting_worker
from meeto.config import WorkerConfig, JoinConfig, SttConfig

config = WorkerConfig(
    meeting_id="my-meeting-001",
    meet_url="https://meet.google.com/abc-defg-hij",
    join=JoinConfig(
        bot_name="Meeto Bot",
    ),
    stt=SttConfig(
        provider="deepgram",
        api_key="YOUR_DEEPGRAM_API_KEY",
    ),
)

asyncio.run(run_meeting_worker(config))
```

Or run the example script from a checkout of the repository:

```bash
PYTHONPATH=src python scripts/example.py https://meet.google.com/abc-defg-hij --bot-name "Meeto Bot" --no-headless
```

Audio dumps are saved to `./audio/` and transcripts to `./transcripts/` by default.
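
Transcripts are written as JSONL, so they can be read back one JSON object per line. The following is a minimal sketch rather than part of meeto: `read_transcript` and `read_all_transcripts` are hypothetical helpers, the `.jsonl` file extension is an assumption, and the records are returned as plain dicts because the segment fields are not documented here.

```python
import json
from pathlib import Path


def read_transcript(path):
    """Return the JSON objects in a JSONL file, one per non-empty line."""
    segments = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                segments.append(json.loads(line))
    return segments


def read_all_transcripts(directory="./transcripts"):
    """Map each *.jsonl file name in `directory` to its parsed records."""
    # The .jsonl extension is an assumption; adjust the pattern if needed.
    files = sorted(Path(directory).glob("*.jsonl"))
    return {p.name: read_transcript(p) for p in files}
```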

> **Note:** Guest mode requires a display — see [Deployment](#deployment) for running on servers and containers.

## Authenticated Mode

To join meetings with a signed-in Google account (no waiting room), generate a browser session first:

```bash
meeto-auth --output storage_state.json
```

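This opens a Chromium window. Log in to Google, then press Enter in the terminal to save the session to `storage_state.json`. Pass that file to `JoinConfig` via `storage_state_path`: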

```python
import asyncio

from meeto import run_meeting_worker
from meeto.config import WorkerConfig, JoinConfig, SttConfig

config = WorkerConfig(
    meeting_id="my-meeting-001",
    meet_url="https://meet.google.com/abc-defg-hij",
    duration_seconds=3600,
    join=JoinConfig(
        headless=True,
        storage_state_path="storage_state.json",
    ),
    stt=SttConfig(
        provider="deepgram",
        api_key="YOUR_DEEPGRAM_API_KEY",
    ),
)

asyncio.run(run_meeting_worker(config))
```

## Deployment

### Guest Mode on Servers / Containers

Google's server-side bot detection blocks headless browsers from joining meetings. To run guest mode on a machine without a display, use [Xvfb](https://www.x.org/releases/X11R7.6/doc/man/man1/Xvfb.1.xhtml) to provide a virtual one. meeto automatically detects the `DISPLAY` environment variable and switches to headed mode.

**Quick option** — wrap your process with `xvfb-run`:

```bash
apt-get install -y xvfb
xvfb-run python your_bot_script.py
```

**Docker** — start Xvfb in your entrypoint:

```dockerfile
RUN apt-get update && apt-get install -y xvfb
ENV DISPLAY=:99
```

```bash
#!/bin/bash
Xvfb :99 -screen 0 1920x1080x24 &
sleep 1
exec python your_bot_script.py
```
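
Because guest mode depends on a display, a startup check can fail fast when neither a real display nor Xvfb is available. This is a sketch under stated assumptions, not meeto's own detection logic: `has_display` and `require_display` are hypothetical helpers, and checking only `DISPLAY` assumes a Linux/X11 environment.

```python
import os


def has_display(environ=None):
    """Return True when DISPLAY is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return bool(env.get("DISPLAY", "").strip())


def require_display(environ=None):
    """Raise a clear error before launching the browser if no display is set."""
    if not has_display(environ):
        raise RuntimeError(
            "No DISPLAY found; start Xvfb or run the script under xvfb-run."
        )
```

Calling `require_display()` at the top of `your_bot_script.py` surfaces a misconfigured container immediately, before the bot tries to join.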

### Authenticated Mode on Servers / Containers

Authenticated mode uses a saved Google session (`storage_state.json`), which Google trusts as a real user. This means it works with `headless=True` out of the box — no Xvfb, no virtual display, no extra system dependencies.

```python
config = WorkerConfig(
    meeting_id="my-meeting-001",
    meet_url="https://meet.google.com/abc-defg-hij",
    join=JoinConfig(
        headless=True,
        storage_state_path="storage_state.json",
    ),
)
```

Generate `storage_state.json` once on a machine with a display (for example, your laptop), then copy it to the server or store it in your deployment secrets. The file contains live session cookies, so treat it like a password.
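
One way to keep the session in deployment secrets is to store the file base64-encoded in an environment variable and write it to disk at startup. The sketch below assumes that approach: `write_storage_state` is a hypothetical helper and `STORAGE_STATE_B64` is a hypothetical variable name that meeto itself does not read.

```python
import base64
import json
import os
from pathlib import Path


def write_storage_state(encoded, dest="storage_state.json"):
    """Decode a base64 secret and write it to `dest` with owner-only permissions."""
    raw = base64.b64decode(encoded)
    json.loads(raw)  # fail early if the secret is not valid JSON
    path = Path(dest)
    path.write_bytes(raw)
    path.chmod(0o600)  # restrict access: the file holds session cookies
    return path


# Hypothetical variable name; set it in your secret manager.
encoded_state = os.environ.get("STORAGE_STATE_B64")
if encoded_state:
    write_storage_state(encoded_state)
```

The returned path can then be passed as `storage_state_path` in `JoinConfig`.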

### Environment Variable Config

For container/job deployments, build config from env vars:

```python
from meeto.config import worker_config_from_env

config = worker_config_from_env()
```

Required: `MEETING_ID`, `MEET_URL`. See `meeto/config/env_config.py` for the full list.
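
For jobs that should fail with a readable message when a required variable is missing, a small pre-flight check can run before `worker_config_from_env()`. This is a sketch, not meeto's own validation: `missing_env_vars` is a hypothetical helper, and it covers only the two variables named above.

```python
import os

REQUIRED_VARS = ("MEETING_ID", "MEET_URL")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

A job entrypoint could call `missing_env_vars()` and exit with the list of missing names before building the config.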

## Configuration

| Field | Type | Default | Description |
|---|---|---|---|
| `meeting_id` | `str` | required | Unique identifier for the meeting |
| `meet_url` | `str` | required | Google Meet URL |
| `duration_seconds` | `int` | `3600` | Maximum recording duration, in seconds |
| `audio` | `AudioConfig` | defaults | Audio capture settings |
| `stt` | `SttConfig` | defaults | Speech-to-text settings |
| `join` | `JoinConfig` | defaults | Browser join settings |

## Extending

### Custom Storage Adapter

By default, artifacts stay local. To upload to cloud storage (S3, GCS, etc.), implement `ArtifactStorageAdapter`:

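```python
import asyncio

from meeto import run_meeting_worker
from meeto.storage import ArtifactStorageAdapter

class S3StorageAdapter(ArtifactStorageAdapter):
    def upload(self, local_path, content_type="application/octet-stream"):
        # Placeholder: upload the file with your S3 client here,
        # then return the location it was stored at.
        return f"s3://my-bucket/{local_path}"

# `config` is a WorkerConfig built as in Quick Start.
asyncio.run(run_meeting_worker(config, storage_adapter=S3StorageAdapter()))
```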
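
To try the adapter hook without cloud credentials, the same `upload` signature can be backed by a plain directory copy. The class below is a standalone sketch so that it runs without meeto installed: `DirectoryUploader` is a hypothetical name, and returning a `file://` URI is an assumption modelled on the S3 example above. In real use it would subclass `ArtifactStorageAdapter`.

```python
import shutil
from pathlib import Path


class DirectoryUploader:
    """Copies artifacts into a target directory and returns a file:// URI."""

    def __init__(self, target_dir):
        self.target_dir = Path(target_dir)
        self.target_dir.mkdir(parents=True, exist_ok=True)

    def upload(self, local_path, content_type="application/octet-stream"):
        # content_type is accepted for signature compatibility; a copy ignores it.
        destination = self.target_dir / Path(local_path).name
        shutil.copy2(local_path, destination)
        return destination.resolve().as_uri()
```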

### Custom Meeting State Store

By default, meeting lifecycle state is kept in memory. To persist state (e.g. to a database), implement `MeetingLifecycleStore`:

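```python
import asyncio

from meeto import run_meeting_worker
from meeto.state_store import MeetingLifecycleStore

class PostgresMeetingStore(MeetingLifecycleStore):
    def update_status(self, meeting_id, *, status, ended_at=None, transcription_path=None):
        ...  # write the new status to your database

    def heartbeat(self, meeting_id):
        ...  # record that the worker is still alive

# `config` is a WorkerConfig built as in Quick Start.
asyncio.run(run_meeting_worker(config, state_store=PostgresMeetingStore()))
```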
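
As an illustration of what the two methods might persist, here is a standalone sketch backed by SQLite from the standard library. It is not meeto's implementation: `SqliteMeetingState`, the table layout, and the `get` helper are hypothetical, values are stored as text because the types of `status` and `ended_at` are not documented here, and the upsert syntax needs SQLite 3.24 or newer. In real use the class would subclass `MeetingLifecycleStore`.

```python
import sqlite3
import time


def _text(value):
    """Convert enums, datetimes, or paths to text; keep None as NULL."""
    return None if value is None else str(value)


class SqliteMeetingState:
    """Stores one row per meeting: status, end time, transcript path, heartbeat."""

    def __init__(self, db_path=":memory:"):
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS meetings ("
            "meeting_id TEXT PRIMARY KEY, status TEXT, ended_at TEXT, "
            "transcription_path TEXT, last_heartbeat REAL)"
        )

    def update_status(self, meeting_id, *, status, ended_at=None, transcription_path=None):
        # Upsert; COALESCE keeps earlier values when an argument is omitted.
        self._db.execute(
            "INSERT INTO meetings (meeting_id, status, ended_at, transcription_path) "
            "VALUES (?, ?, ?, ?) "
            "ON CONFLICT(meeting_id) DO UPDATE SET "
            "status = excluded.status, "
            "ended_at = COALESCE(excluded.ended_at, ended_at), "
            "transcription_path = COALESCE(excluded.transcription_path, transcription_path)",
            (meeting_id, _text(status), _text(ended_at), _text(transcription_path)),
        )
        self._db.commit()

    def heartbeat(self, meeting_id):
        # Record the time of the latest liveness signal.
        self._db.execute(
            "INSERT INTO meetings (meeting_id, last_heartbeat) VALUES (?, ?) "
            "ON CONFLICT(meeting_id) DO UPDATE SET last_heartbeat = excluded.last_heartbeat",
            (meeting_id, time.time()),
        )
        self._db.commit()

    def get(self, meeting_id):
        """Return (status, ended_at, transcription_path, last_heartbeat) or None."""
        return self._db.execute(
            "SELECT status, ended_at, transcription_path, last_heartbeat "
            "FROM meetings WHERE meeting_id = ?",
            (meeting_id,),
        ).fetchone()
```

A SQLite connection is bound to the thread that created it by default, so this sketch assumes both methods are called from a single thread.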

### Custom STT Provider

meeto ships with Deepgram support. To add your own STT provider, implement `STTStreamingAdapter`:

```python
from meeto.stt import STTStreamingAdapter, register_stt

class MySTTAdapter(STTStreamingAdapter):
    async def connect(self): ...
    async def send_audio(self, pcm_bytes): ...
    async def start(self, on_segment): ...
    async def close(self): ...

register_stt("my_provider", MySTTAdapter)
```

Then set `SttConfig(provider="my_provider")`.

## Logging

meeto uses Python's standard `logging` module and does not configure any handlers itself. To see logs, configure logging in your application:

```python
import logging
logging.basicConfig(level=logging.INFO)
```

For finer control:

```python
logging.getLogger("meeto").setLevel(logging.DEBUG)         # all meeto logs
logging.getLogger("meeto.stt").setLevel(logging.WARNING)   # quieter STT logs
```
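
In container deployments it can help to give meeto's logs their own handler and format instead of relying on the root logger. The sketch below uses only the standard `logging` module; `attach_meeto_handler` is a hypothetical helper, and the format string is just one reasonable choice.

```python
import logging


def attach_meeto_handler(stream=None, level=logging.INFO):
    """Attach a formatted stream handler to the "meeto" logger and return it."""
    # StreamHandler writes to sys.stderr when stream is None.
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("meeto")
    logger.setLevel(level)
    logger.addHandler(handler)
    return handler
```

Because child loggers such as `meeto.stt` propagate to `meeto`, a single handler on the parent receives their records too.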

## Architecture

```
meeto/
├── runtime.py              # Main entrypoint: run_meeting_worker()
├── pipeline.py             # Wires audio capture, STT, and transcript writing
├── storage.py              # ArtifactStorageAdapter ABC + LocalStorageAdapter
├── audio_writer.py         # PCM audio dump writer
├── transcript_writer.py    # JSONL transcript writer
├── auth/                   # Google login session generator
├── config/                 # Typed config models + env var parser
├── meet/                   # Playwright-based meeting join, end detection, speaker tracking
├── state_store/            # Meeting lifecycle state management
└── stt/                    # STT adapter interface + Deepgram implementation
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, code style, and PR guidelines.

## License

MIT
