Metadata-Version: 2.3
Name: helpfulgremlin
Version: 0.1.3
Summary: Lightweight, zero-config CLI that scans a codebase for secrets and bad security practices before you git push
Author-email: udit.ramawat@gmail.com
License: MIT License
         
         Copyright (c) 2026 Udit Ramawat
         
         Permission is hereby granted, free of charge, to any person obtaining a copy
         of this software and associated documentation files (the "Software"), to deal
         in the Software without restriction, including without limitation the rights
         to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
         copies of the Software, and to permit persons to whom the Software is
         furnished to do so, subject to the following conditions:
         
         The above copyright notice and this permission notice shall be included in all
         copies or substantial portions of the Software.
         
         THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
         IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
         FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
         AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
         LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
         OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
         SOFTWARE.
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Requires-Dist: pathspec>=1.0.4
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: rich>=14.3.1
Requires-Dist: typer>=0.21.1
Requires-Python: >=3.13
Description-Content-Type: text/markdown

# 👾 helpfulGremlin

**Sanity check your vibes before you git push.**

![Build Status](https://github.com/uramawat/helpfulGremlin/actions/workflows/release.yml/badge.svg)

I built `helpfulGremlin` because I wanted a lightweight, zero-config CLI utility to scan my codebase for sensitive artifacts—API keys, secrets, tokens, and private keys—before they are accidentally exposed. It's designed for "vibe-coding" where velocity is high, acting as a friendly guardrail.

Recently, I extended it to also check for **bad security practices** (like `eval()`, `pickle.load()`, or disabling SSL verification), making it more than just a secret scanner.

## 🚀 Quick Start

Run it instantly using `uv` (no installation required):

```bash
# Run in the current directory
uvx helpfulGremlin
```

Or install it globally:

```bash
uv tool install helpfulGremlin
helpfulGremlin .
```

## 🛠 Usage

```bash
# Scan the current directory
helpfulGremlin

# Scan a specific directory or file
helpfulGremlin ./src/my_script.py

# Verbose mode (see every file checked)
helpfulGremlin . --verbose

# Run with multiple worker processes (for large repos)
helpfulGremlin . --workers 4
```

## 🏗 Architecture & Design Decisions

### 1. **Python & `uv` First**
I chose **Python** for its rich ecosystem of text-processing and regex libraries. Python tools are often awkward to distribute, but with **`uv`**, `helpfulGremlin` can be run ephemerally (`uvx`) without touching your system Python.

### 2. **Hybrid Detection Engine**
I implemented a two-layer detection strategy:
- **Layer 1: Regex Signatures**: Fast pattern matching for known secrets (AWS, OpenAI, Stripe, etc.). Patterns are externalized in `src/helpfulgremlin/patterns.yaml`.
- **Layer 2: Entropy Analysis**: Uses Shannon entropy to detect high-randomness strings (like passwords or unknown API keys) that don't match any specific regex. This catches custom or unlisted secrets that signature-only scanning can miss.
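
The two layers can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the single AWS-style regex stands in for the signatures in `patterns.yaml`, and the names `scan_line`, `shannon_entropy`, and `TOKEN` (plus the 20-character minimum token length) are assumptions made for the example. Only the 4.2 threshold comes from this README.

```python
import math
import re
from collections import Counter

# Illustrative signature only; the project's real patterns live in patterns.yaml.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# Hypothetical candidate tokens for the entropy pass: long runs of key-like characters.
TOKEN = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")
ENTROPY_THRESHOLD = 4.2  # threshold quoted in this README, read as bits per character


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_line(line: str) -> list[tuple[str, str]]:
    """Return (rule, match) pairs: regex signatures first, entropy as a fallback."""
    findings = [("aws-access-key-id", m.group()) for m in AWS_ACCESS_KEY_ID.finditer(line)]
    known = {match for _, match in findings}
    for m in TOKEN.finditer(line):
        token = m.group()
        # Layer 2 only reports tokens that Layer 1 did not already flag.
        if token not in known and shannon_entropy(token) > ENTROPY_THRESHOLD:
            findings.append(("high-entropy-string", token))
    return findings
```

Running the regex layer first keeps findings specific (a named rule beats a generic "high entropy" label), while the entropy pass acts as a safety net for anything without a signature.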

### 3. **Smart Context Awareness**
I designed the scanner to be intelligent about *where* it looks:
- **Context-Aware Scanning**: Security checks are scoped to file types (e.g., Python-specific checks like `eval()` only run on `.py` files). This keeps performance high.
- **Gitignore Support**: Automatically parses your `.gitignore` to avoid scanning `node_modules`, `venv`, etc.
- **Binary Skipping**: Detects and skips binary files to save CPU.
- **Large File Protection**: Skips files larger than 5MB to prevent memory exhaustion.
- **Remediation**: It doesn't just say "Error"; it suggests *how* to fix it (e.g., "Move this hardcoded key to an environment variable").
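
As a rough sketch of how these filters could fit together (not the actual implementation): the function names, the NUL-byte heuristic for binary detection, the 8 KiB sample size, and the rule table below are all assumptions made for illustration. `.gitignore` handling is left out here; the package depends on `pathspec` for that.

```python
import re
from pathlib import Path

MAX_FILE_SIZE = 5 * 1024 * 1024  # the 5MB limit above, interpreted here as 5 MiB (assumption)
SNIFF_BYTES = 8192  # hypothetical sample size for the binary check

# Hypothetical rule table: file suffix -> list of (pattern, remediation hint).
SCOPED_CHECKS = {
    ".py": [
        (re.compile(r"\beval\("), "Avoid eval(); parse input explicitly instead."),
        (re.compile(r"\bpickle\.load\("), "Only unpickle trusted data, or use a safer format such as JSON."),
        (re.compile(r"\bverify\s*=\s*False\b"), "Keep TLS certificate verification enabled."),
    ],
}


def looks_binary(path: Path) -> bool:
    """Heuristic: a NUL byte in the first chunk suggests a binary file."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(SNIFF_BYTES)


def should_scan(path: Path) -> bool:
    """Skip oversized and binary files before any pattern matching runs."""
    return path.stat().st_size <= MAX_FILE_SIZE and not looks_binary(path)


def check_file(path: Path) -> list[tuple[int, str]]:
    """Return (line number, remediation hint) pairs for checks scoped to this file type."""
    checks = SCOPED_CHECKS.get(path.suffix, [])
    if not checks or not should_scan(path):
        return []
    findings = []
    text = path.read_text(encoding="utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern, hint in checks:
            if pattern.search(line):
                findings.append((number, hint))
    return findings
```

Looking up checks by suffix before reading the file means a `.js` file is never tested against Python-only rules, and every finding carries its own fix suggestion rather than a bare error.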

### 4. **Modern UX (`rich`)**
I used the `rich` library to provide beautiful, emoji-enriched terminal output, progress bars, and tables. Security tools shouldn't be boring 1990s walls of text.

## 🕵️ Detected Patterns

`helpfulGremlin` currently detects:

- **Cloud Providers**: AWS (Access/Secret Keys), Google Cloud API Keys, Azure Storage Keys (opt-in).
- **AI/ML**: OpenAI, Anthropic, Gemini, HuggingFace, Replicate.
- **Services**: Stripe, Slack, Twilio, Salesforce, Facebook.
- **Security Best Practices**:
    - Unsafe Code Execution: `eval()`, `exec()`
    - Unsafe Deserialization: `pickle.load()`
    - Insecure SSL: `verify=False`
    - Weak Hashing: `MD5`
    - Insecure Network Binding: `0.0.0.0`
- **Generic**: PEM Private Keys, Generic "api_key" variable assignments.
- **Unknowns**: High-entropy strings (Shannon entropy above 4.2 bits per character).

## ⚙️ Configuration

You can customize the detection rules by editing the `patterns.yaml` file that ships inside the package (`src/helpfulgremlin/patterns.yaml` in the source tree).

## 📦 License

MIT
