Metadata-Version: 2.4
Name: joyfuljay
Version: 0.1.0
Summary: Python library for extracting ML-ready features from encrypted network traffic
Project-URL: Homepage, https://github.com/cenab/joyfuljay
Project-URL: Documentation, https://joyfuljay.readthedocs.io
Project-URL: Repository, https://github.com/cenab/joyfuljay
Project-URL: Issues, https://github.com/cenab/joyfuljay/issues
Project-URL: Changelog, https://github.com/cenab/joyfuljay/blob/main/CHANGELOG.md
Author: JoyfulJay Contributors
License-Expression: MIT
License-File: LICENSE
Keywords: QUIC,TLS,encrypted,features,machine-learning,network,pcap,security,traffic
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Security
Classifier: Topic :: System :: Networking :: Monitoring
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: click>=8.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: scapy>=2.5.0
Provides-Extra: accelerated
Requires-Dist: cython>=3.0.0; extra == 'accelerated'
Provides-Extra: all
Requires-Dist: cython>=3.0.0; extra == 'all'
Requires-Dist: dpkt>=1.9.8; extra == 'all'
Requires-Dist: kafka-python>=2.0; extra == 'all'
Requires-Dist: msgpack>=1.0.0; extra == 'all'
Requires-Dist: networkx>=3.0; extra == 'all'
Requires-Dist: prometheus-client>=0.17.0; extra == 'all'
Requires-Dist: psycopg>=3.1; extra == 'all'
Requires-Dist: websockets>=12.0; extra == 'all'
Requires-Dist: zeroconf>=0.131.0; extra == 'all'
Provides-Extra: db
Requires-Dist: psycopg>=3.1; extra == 'db'
Provides-Extra: dev
Requires-Dist: hypothesis>=6.0; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: pandas-stubs>=2.0; extra == 'dev'
Requires-Dist: prometheus-client>=0.17.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: types-click>=7.1; extra == 'dev'
Provides-Extra: discovery
Requires-Dist: zeroconf>=0.131.0; extra == 'discovery'
Provides-Extra: docs
Requires-Dist: mkdocs-git-revision-date-localized-plugin>=1.2; extra == 'docs'
Requires-Dist: mkdocs-glightbox>=0.3; extra == 'docs'
Requires-Dist: mkdocs-material>=9.0; extra == 'docs'
Requires-Dist: mkdocs-minify-plugin>=0.7; extra == 'docs'
Requires-Dist: mkdocs>=1.5; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24; extra == 'docs'
Provides-Extra: dpkt
Requires-Dist: dpkt>=1.9.8; extra == 'dpkt'
Provides-Extra: fast
Requires-Dist: dpkt>=1.9.8; extra == 'fast'
Provides-Extra: graphs
Requires-Dist: networkx>=3.0; extra == 'graphs'
Provides-Extra: kafka
Requires-Dist: kafka-python>=2.0; extra == 'kafka'
Provides-Extra: libpcap
Requires-Dist: python-libpcap>=0.4.0; extra == 'libpcap'
Provides-Extra: monitoring
Requires-Dist: prometheus-client>=0.17.0; extra == 'monitoring'
Provides-Extra: postgres
Requires-Dist: psycopg>=3.1; extra == 'postgres'
Provides-Extra: remote
Requires-Dist: msgpack>=1.0.0; extra == 'remote'
Requires-Dist: websockets>=12.0; extra == 'remote'
Provides-Extra: sqlite
Description-Content-Type: text/markdown

<div align="center">

<img src="docs/assets/images/logo.png" alt="JoyfulJay Logo" width="200">

# JoyfulJay - Encrypted Traffic Feature Extraction

[![CI](https://github.com/cenab/joyfuljay/actions/workflows/ci.yml/badge.svg)](https://github.com/cenab/joyfuljay/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/joyfuljay.svg)](https://badge.fury.io/py/joyfuljay)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

![JoyfulJay](https://img.shields.io/badge/JoyfulJay-387%20Features-blue?style=flat-square)
![ML Ready](https://img.shields.io/badge/ML-Research%20Ready-22D3EE?style=flat-square)
![Encrypted Traffic](https://img.shields.io/badge/Encrypted-TLS%20%2F%20QUIC-success?style=flat-square)
![Research Tool](https://img.shields.io/badge/Use-Academic%20Research-informational?style=flat-square)

</div>

**JoyfulJay** is a Python library for extracting standardized, ML-ready features from encrypted network traffic. It operates on PCAP files and live network interfaces, producing feature vectors that capture timing, size, and protocol metadata patterns - all without decrypting any traffic.

## Features

- **Encrypted Traffic Focus**: Extract features proven effective for classifying TLS, QUIC, VPN, and Tor traffic
- **ML-Ready Output**: Pandas DataFrames, NumPy arrays, CSV, JSON, or Parquet - ready for scikit-learn, PyTorch, etc.
- **Streaming Architecture**: Process multi-GB PCAPs without loading them into memory
- **Live Capture**: Real-time feature extraction from network interfaces
- **Remote Capture**: Stream packets from remote devices over secure WebSocket (TLS/WSS)
- **Protocol Metadata**: TLS handshake parsing, JA3/JA3S fingerprints, QUIC metadata
- **Traffic Fingerprinting**: Detect Tor, VPN, and DoH traffic patterns
- **Tranalyzer Compatible**: 387 features across 21 extractors, matching research-grade tools
- **Enterprise Ready**: Kafka streaming, Prometheus metrics, mDNS discovery
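
As background on the JA3 fingerprints listed above: a JA3 value is the MD5 hash of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, and point formats) written as decimal numbers, with GREASE values dropped. The sketch below follows that public definition; it is not JoyfulJay's own implementation, and the helper names `is_grease`, `ja3_string`, and `ja3_hash` are hypothetical.

```python
import hashlib


def is_grease(value: int) -> bool:
    """Return True for GREASE placeholders (0x0A0A, 0x1A1A, ..., 0xFAFA)."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma-separated JA3 string from already-parsed ClientHello fields."""

    def join(values):
        # Values inside one field are dash-separated; GREASE entries are skipped.
        return "-".join(str(v) for v in values if not is_grease(v))

    fields = [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    return ",".join(fields)


def ja3_hash(ja3: str) -> str:
    """Return the 32-character hex MD5 digest of a JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Hypothetical ClientHello values, including one GREASE cipher (0x0A0A).
example = ja3_string(769, [0x0A0A, 47, 53], [0, 10, 11], [23, 24, 25], [0])
print(example)
print(ja3_hash(example))
```

Because the hash depends only on cleartext handshake fields, it can be computed without decrypting any traffic.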

## Installation

```bash
pip install joyfuljay
# or
uv pip install joyfuljay
```

For optional features (same syntax works with `uv pip`):

```bash
# Fast parsing with dpkt
pip install joyfuljay[fast]

# High-speed capture with libpcap
pip install joyfuljay[libpcap]

# Kafka streaming output
pip install joyfuljay[kafka]

# Prometheus metrics
pip install joyfuljay[monitoring]

# mDNS server discovery
pip install joyfuljay[discovery]

# Connection graph analysis
pip install joyfuljay[graphs]

# All optional features bundled in the `all` extra (libpcap is installed separately)
pip install joyfuljay[all]
```

## Quick Start

### Python API

```python
from joyfuljay import extract_features_from_pcap

# Extract features from a PCAP file
features_df = extract_features_from_pcap("capture.pcap")

print(features_df.shape)
print(features_df.columns.tolist())
print(features_df.head())
```

### Command Line

```bash
# Extract features to CSV
jj extract capture.pcap -o features.csv

# Live capture for 60 seconds
jj live eth0 --duration 60 -o live_features.csv

# Output as JSON
jj extract capture.pcap -o features.json --format json
```

## Feature Groups

| Group | Features |
|-------|----------|
| **Flow Metadata** | 5-tuple, duration, packet/byte counts |
| **Timing** | Inter-arrival time statistics, burst metrics |
| **Size** | Packet length statistics, payload bytes |
| **TLS** | Version, cipher suite, SNI, JA3/JA3S fingerprints |
| **QUIC** | Version, ALPN, connection IDs |
| **Padding** | Fixed-size detection, constant-rate detection |
| **Fingerprint** | Tor/VPN/DoH classification |
| **TCP Analysis** | Flags, handshake, sequence/window analysis |
| **MAC/Layer 2** | Source/dest MAC, VLAN, Ethernet type |
| **ICMP** | Type/code, echo success ratio |
| **Connection Graphs** | Fan-out, communities, centrality (requires `[graphs]`) |
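
To make the Timing row concrete, here is a minimal sketch of inter-arrival time (IAT) statistics for one flow, computed from packet timestamps alone. It only illustrates the idea; the function name `iat_stats` and the output keys are assumptions, not JoyfulJay's actual column names or code.

```python
import statistics


def iat_stats(timestamps: list[float]) -> dict[str, float]:
    """Summarize the gaps between consecutive packet timestamps (in seconds)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # A flow with fewer than two packets has no inter-arrival times.
        return {"iat_min": 0.0, "iat_max": 0.0, "iat_mean": 0.0, "iat_std": 0.0}
    return {
        "iat_min": min(gaps),
        "iat_max": max(gaps),
        "iat_mean": statistics.fmean(gaps),
        "iat_std": statistics.pstdev(gaps),  # population standard deviation
    }


# Hypothetical packet arrival times for a single flow.
print(iat_stats([0.0, 1.0, 3.0]))
```

Statistics like these depend only on when packets arrive, so they remain available for fully encrypted flows.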

## Remote Capture

Stream packets from a remote device (e.g., an Android phone or Raspberry Pi) to your analysis machine. The `remote` extra (`pip install joyfuljay[remote]`) pulls in the WebSocket dependencies:

```bash
# On the capture device - start server with TLS
jj serve wlan0 --tls-cert server.crt --tls-key server.key --announce

# On your machine - discover and connect
jj discover                    # Find servers on LAN
jj connect "jj://192.168.1.50:8765?token=xxx&tls=1" -o features.csv
```
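
The `jj://` address passed to `jj connect` packs the host, port, and options into a single URL. Since `?` and `&` are special characters in most shells, it is safest to quote the URL on the command line. The sketch below shows how such an address could be taken apart with the standard library; the helper `parse_jj_url` is hypothetical, and reading `tls=1` as "TLS enabled" is an assumption based on the example above, not a documented contract.

```python
from urllib.parse import parse_qs, urlsplit


def parse_jj_url(url: str) -> dict:
    """Split a jj:// connection URL into host, port, token, and TLS flag."""
    parts = urlsplit(url)
    if parts.scheme != "jj":
        raise ValueError(f"expected a jj:// URL, got scheme {parts.scheme!r}")
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # parse_qs returns a list per key; take the first value if present.
        "token": query.get("token", [None])[0],
        # Assumption: tls=1 means the connection should use TLS.
        "tls": query.get("tls", ["0"])[0] == "1",
    }


print(parse_jj_url("jj://192.168.1.50:8765?token=xxx&tls=1"))
```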

## Kafka Streaming

Stream features directly to Kafka for real-time pipelines:

```python
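# Import path for extract_features_streaming is assumed; check the API docs.
from joyfuljay import extract_features_streaming
from joyfuljay.output.kafka import KafkaWriter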

with KafkaWriter("localhost:9092", topic="network-features") as writer:
    for features in extract_features_streaming("capture.pcap"):
        writer.write(features)
```

## Prometheus Metrics

Export processing metrics for monitoring:

```python
from joyfuljay.monitoring import PrometheusMetrics, start_prometheus_server

metrics = PrometheusMetrics()
start_prometheus_server(9090)  # Scrape at http://localhost:9090/metrics
```

## Requirements

- Python 3.10+
- click >= 8.0.0
- numpy >= 1.24.0
- pandas >= 2.0.0
- scapy >= 2.5.0

## Cross-Platform Support

| Feature | Linux | macOS | Windows |
|---------|-------|-------|---------|
| PCAP file processing | ✅ | ✅ | ✅ |
| Live capture | ✅ | ✅ | ✅ (requires [Npcap](https://npcap.com/)) |

Check your system status with:

```bash
jj status
```

## Documentation

Full documentation: [https://joyfuljay.readthedocs.io](https://joyfuljay.readthedocs.io)

## Citation

If you use JoyfulJay in academic research, please cite:

```bibtex
@software{joyfuljay2025,
  title = {{JoyfulJay}: Encrypted Traffic Feature Extraction Library},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/cenab/joyfuljay}
}
```

## License

MIT License - see [LICENSE](LICENSE) for details.
