Metadata-Version: 2.4
Name: vspell
Version: 0.1.4
Summary: A simple, command-line voice spelling and transcription tool.
Author-email: Dan Higgins <daniel.higgins@gatech.edu>
License: MIT
Project-URL: Homepage, https://github.com/DanHUMassMed/vspell.git
Requires-Python: >=3.13
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: build>=1.3.0
Requires-Dist: faster-whisper>=1.2.1
Requires-Dist: pyperclip>=1.11.0
Requires-Dist: scipy>=1.16.3
Requires-Dist: sounddevice>=0.5.3
Requires-Dist: twine>=6.2.0
Dynamic: license-file

# VSpell

A simple, command-line voice spelling and transcription tool. VSpell records a short audio clip, transcribes it with Whisper via the `faster-whisper` library, and copies the resulting text directly to your clipboard, streamlining voice-to-text workflows.

## Features

- **Fast Transcription**: Quickly record and transcribe audio using the highly efficient `faster-whisper` library.
- **Clipboard Integration**: Transcribed text is automatically copied to the clipboard for immediate pasting.
- **Silence Detection**: Avoids processing and transcribing empty audio clips, saving time and resources.
- **Noise Calibration**: Includes a one-time calibration step to accurately distinguish speech from ambient noise.
- **Model Selection**: Choose from different Whisper model sizes (`tiny`, `base`, `small`, `medium`, `large`) to balance speed and accuracy.
- **Audio Playback**: Listen to your last recording to verify what was captured.

## Installation

Before installing, ensure you have the necessary system dependencies for audio recording.

**For macOS:**

```bash
brew install ffmpeg
```

**For Debian/Ubuntu:**

```bash
sudo apt-get install ffmpeg
```

---

To install VSpell, clone this repository and install the package using pip.

```bash
git clone https://github.com/DanHUMassMed/vspell.git
cd vspell
pip install .
```

This will install the necessary dependencies and make the `vspell` command available in your terminal.

## First-Time Setup: Calibration

For VSpell to work effectively, it needs to know what "silence" sounds like in your environment. Run the calibration command once before you start using it.

Find a quiet moment and run:

```bash
vspell --calibrate
```

Remain silent for the 2-second recording. VSpell measures your ambient noise level and derives a silence-detection threshold from it, which is saved in `~/.config/vspell/vspell_config.json`. Re-run calibration whenever your environment changes (e.g., you get a new microphone or move to a noisier room).
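
The calibration step can be pictured roughly as follows. This is a minimal sketch, not VSpell's actual code: it assumes the noise level is measured as RMS amplitude and that the threshold is that level times a safety margin. The function names and the `margin` parameter are hypothetical; only the `silence_threshold` key and the config path come from this README.

```python
import json
import math
from pathlib import Path


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate(ambient_samples, config_path, margin=1.5):
    """Derive a silence threshold from ambient noise and save it as JSON.

    `margin` is a hypothetical safety factor, not a documented VSpell setting.
    """
    threshold = rms(ambient_samples) * margin
    config_path = Path(config_path)
    # Create ~/.config/vspell (or whichever directory is given) if missing.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps({"silence_threshold": threshold}))
    return threshold
```

A margin above 1.0 keeps ordinary room noise from being mistaken for speech; a louder room yields a higher stored threshold.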

## Usage

Once calibrated, using VSpell is simple.

### Main Command

Just run the `vspell` command. It will listen for 2 seconds, transcribe what it hears, and copy the result to your clipboard.

```bash
vspell
```

```
Listening for 2 seconds...
Transcribing…
Transcribed: Hello, world.
Text copied to clipboard.
```

If you say nothing, it will detect the silence and stop.

```bash
vspell
```

```
Listening for 2 seconds...
No speech detected — nothing transcribed.
```

### Command-Line Options

```
usage: vspell [-h] [--calibrate] [--punctuate] [--playback [PLAYBACK]]
              [--duration DURATION] [--model MODEL]

VSpell - Voice spelling tool

options:
  -h, --help            show this help message and exit
  --calibrate           Calibrate ambient noise threshold
  --punctuate           Retain punctuation and original casing in transcribed
                        text (default is to remove punctuation and lowercase)
  --playback [PLAYBACK]
                        Playback recorded audio with optional volume
                        multiplier (default=1.0)
  --duration DURATION   Recording duration in seconds
  --model MODEL         Whisper model size [tiny, base, small, medium, large]
                        (default=medium)
```
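
For readers curious how such an interface is typically declared, here is a hedged `argparse` sketch that accepts the same options. It is an illustration rather than VSpell's source: the `build_parser` name, the argument types, and the use of `choices` for `--model` are assumptions, and the 2-second default duration is taken from the "How It Works" section.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options listed in the help text above."""
    parser = argparse.ArgumentParser(
        prog="vspell", description="VSpell - Voice spelling tool"
    )
    parser.add_argument("--calibrate", action="store_true",
                        help="Calibrate ambient noise threshold")
    parser.add_argument("--punctuate", action="store_true",
                        help="Retain punctuation and original casing")
    # nargs="?" allows a bare --playback (uses const) or --playback 1.5.
    parser.add_argument("--playback", nargs="?", type=float, const=1.0,
                        default=None,
                        help="Playback recorded audio with optional volume "
                             "multiplier (default=1.0)")
    # Assumption: duration is parsed as a float number of seconds.
    parser.add_argument("--duration", type=float, default=2.0,
                        help="Recording duration in seconds")
    parser.add_argument("--model", default="medium", metavar="MODEL",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model size (default=medium)")
    return parser
```

With this sketch, `build_parser().parse_args(["--playback"])` yields a volume multiplier of `1.0`, while omitting the flag leaves `playback` as `None`, so the two cases can be told apart.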

**Examples:**

- **Record for 5 seconds:**

  ```bash
  vspell --duration 5
  ```
- **Use a different model for higher accuracy (e.g., `large`):**

  ```bash
  vspell --model large
  ```
- **Play back the last recording at 1.5x volume:**

  ```bash
  vspell --playback 1.5
  ```
- **Transcribe with punctuation and original casing retained:**

  ```bash
  vspell --punctuate
  ```
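
The difference between the default output and `--punctuate` can be sketched as a small text-normalization step. This is an assumption about how such cleanup might be done, not VSpell's actual code: `format_transcript` is a hypothetical helper, and it strips only ASCII punctuation (so apostrophes are removed as well).

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def format_transcript(text, punctuate=False):
    """Return text as-is when punctuate=True, else lowercase without punctuation."""
    text = text.strip()
    if punctuate:
        return text
    cleaned = text.translate(_STRIP_PUNCTUATION).lower()
    # Collapse any whitespace runs left behind by removed characters.
    return " ".join(cleaned.split())
```

For the sample transcript shown earlier, the default path turns `Hello, world.` into `hello world`, while `punctuate=True` leaves it unchanged.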

## How It Works

1. **Record**: When you run `vspell`, it records audio from your default microphone for a set duration (default is 2 seconds) into a temporary `.wav` file.
2. **Analyze**: It checks the audio's amplitude against the calibrated silence threshold. If the amplitude is below the threshold, the clip is treated as silence and the program exits without transcribing.
3. **Transcribe**: If speech is detected, the audio is passed to the `faster-whisper` model for transcription. The first time you use a model, it will be downloaded and cached locally in `~/.cache/huggingface/hub`.
4. **Copy**: The resulting text is copied to your system's clipboard.
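
The four steps above can be sketched as a single function. This is a simplified illustration under stated assumptions, not the tool's implementation: the stages are passed in as plain callables (`record`, `transcribe`, `copy` are hypothetical parameter names) so the flow can be followed without a microphone or a model, and silence is judged by RMS amplitude, which may differ from the measure VSpell uses.

```python
import math


def is_silent(samples, threshold):
    """True when the clip's RMS amplitude does not exceed the threshold."""
    if not samples:
        return True
    level = math.sqrt(sum(s * s for s in samples) / len(samples))
    return level <= threshold


def run_once(record, transcribe, copy, threshold, duration=2.0):
    """Record, skip silent clips, otherwise transcribe and copy the text.

    `record`, `transcribe`, and `copy` are injected stand-ins for the
    microphone, the Whisper model, and the clipboard (assumed interfaces).
    """
    samples = record(duration)           # 1. Record
    if is_silent(samples, threshold):    # 2. Analyze
        return None                      # nothing to transcribe
    text = transcribe(samples).strip()   # 3. Transcribe
    copy(text)                           # 4. Copy
    return text
```

In a real run, the three callables would presumably wrap the dependencies this package declares (`sounddevice` for recording, `faster-whisper` for transcription, `pyperclip` for the clipboard); injecting them keeps the control flow easy to test in isolation.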

## Configuration

VSpell creates a configuration directory at `~/.config/vspell`.

- `~/.config/vspell/vspell_config.json`: Stores the `silence_threshold` determined during calibration.
- `~/.config/vspell/input.wav`: The temporary audio file of your last recording.
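
If you want to inspect the stored threshold from your own scripts, a small reader like the one below may help. It is a sketch under assumptions: only the file path and the `silence_threshold` key come from this README, while the `load_silence_threshold` helper and its `fallback` behavior are hypothetical.

```python
import json
from pathlib import Path

DEFAULT_CONFIG_PATH = Path.home() / ".config" / "vspell" / "vspell_config.json"


def load_silence_threshold(config_path=DEFAULT_CONFIG_PATH, fallback=None):
    """Return the stored silence_threshold, or `fallback` if unavailable."""
    try:
        data = json.loads(Path(config_path).read_text())
        return float(data["silence_threshold"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, malformed JSON, or an unexpected structure.
        return fallback
```

A `None` result is a reasonable cue that `vspell --calibrate` has not been run yet on this machine.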
