Metadata-Version: 2.4
Name: patterndb_yaml
Version: 0.2.0
Summary: YAML-based pattern matching with multi-line capabilities for log normalization using syslog-ng patterndb
Project-URL: Homepage, https://github.com/JeffreyUrban/patterndb-yaml
Project-URL: Repository, https://github.com/JeffreyUrban/patterndb-yaml
Project-URL: Issues, https://github.com/JeffreyUrban/patterndb-yaml/issues
Author: Jeffrey Urban
License: MIT License
        
        Copyright (c) 2025 Jeffrey Urban
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: terminal
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: System :: Logging
Classifier: Topic :: Terminals
Classifier: Topic :: Text Processing :: Filters
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Requires-Dist: packaging>=23.0
Requires-Dist: pyyaml>=6.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.9.0
Provides-Extra: dev
Requires-Dist: mypy>=1.8.0; extra == 'dev'
Requires-Dist: pre-commit>=3.6.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.9; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5.0; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.25.0; extra == 'docs'
Requires-Dist: pymdown-extensions>=10.0.0; extra == 'docs'
Requires-Dist: sybil>=6.0.0; extra == 'docs'
Requires-Dist: termynal>=0.12.0; extra == 'docs'
Description-Content-Type: text/markdown

# patterndb-yaml

**YAML-based pattern matching with multi-line capabilities for log normalization using syslog-ng patterndb**

[![PyPI version](https://img.shields.io/pypi/v/patterndb-yaml.svg)](https://pypi.org/project/patterndb-yaml/)
[![Tests](https://github.com/JeffreyUrban/patterndb-yaml/actions/workflows/test.yml/badge.svg)](https://github.com/JeffreyUrban/patterndb-yaml/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/JeffreyUrban/patterndb-yaml/branch/main/graph/badge.svg)](https://codecov.io/gh/JeffreyUrban/patterndb-yaml)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Documentation](https://img.shields.io/readthedocs/patterndb-yaml)](https://patterndb-yaml.readthedocs.io/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## What is patterndb-yaml?

`patterndb-yaml` brings intuitive YAML pattern definitions to [syslog-ng's proven patterndb engine](https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.38/administration-guide/56#TOPIC-1829130). Instead of writing complex XML patterns, you define rules in readable YAML and let patterndb-yaml handle the translation to syslog-ng's efficient pattern matcher.

This makes it easier to normalize heterogeneous logs - transforming different log formats into standardized output for comparison, analysis, or filtering.

## Features

- **YAML rules** - Readable pattern definitions instead of syslog-ng XML
- **Field extraction** - Pull specific data (table names, IDs, etc.) from matched lines
- **Pattern matching** - Powered by syslog-ng's efficient C implementation
- **Multi-line sequences** - Handle log entries spanning multiple lines
- **Explain mode** - Debug which patterns matched and why
- **CLI and Python API** - Use as a command-line tool or library

## Installation

**Requirements:** Python 3.9+, syslog-ng 4.10.1+

> **⚠️ Important:** `patterndb-yaml` requires syslog-ng to be installed from **official repositories** (distro defaults may be incompatible).
>
> See **[SYSLOG_NG_INSTALLATION.md](docs/SYSLOG_NG_INSTALLATION.md)** for platform-specific instructions.

### Via Homebrew (macOS + Linux) - Recommended

```bash
brew tap JeffreyUrban/patterndb-yaml && brew install patterndb-yaml
```

**✅ Automatically installs syslog-ng** as a dependency. Homebrew manages all dependencies and provides easy updates via `brew upgrade`.

### Via pipx (Alternative)

> **⚠️ Manual Setup Required:** You must install syslog-ng separately before using pipx.

```bash
# STEP 1: Install syslog-ng from official repos (REQUIRED)
# See docs/SYSLOG_NG_INSTALLATION.md for your platform

# STEP 2: Install patterndb-yaml
pipx install patterndb-yaml
```

[pipx](https://pipx.pypa.io/) installs in an isolated environment with global CLI access. Update with `pipx upgrade patterndb-yaml`.

### Via pip

> **⚠️ Manual Setup Required:** You must install syslog-ng separately before using pip.

```bash
# STEP 1: Install syslog-ng from official repos (REQUIRED)
# See docs/SYSLOG_NG_INSTALLATION.md for your platform

# STEP 2: Install patterndb-yaml
pip install patterndb-yaml
```

Use `pip` if you want to use patterndb-yaml as a library in your Python projects.

### From Source

```bash
# Development installation
git clone https://github.com/JeffreyUrban/patterndb-yaml
cd patterndb-yaml
pip install -e ".[dev]"
```

### Windows

Windows is not currently supported. Consider using WSL2 (Windows Subsystem for Linux) and following the Linux installation instructions.

**Requirements:** Python 3.9+, syslog-ng (installed automatically with Homebrew)

## Quick Start

### Command Line

Create a YAML rules file (`rules.yaml`):

```yaml
rules:
  - name: log_info
    pattern:
      - text: "["
      - text: "INFO"
      - text: "] "
      - field: message
    output: "[info:{message}]"

  - name: log_error
    pattern:
      - text: "["
      - text: "ERROR"
      - text: "] "
      - field: message
    output: "[error:{message}]"
```

Process your logs:

```bash
# Process from stdin
cat app.log | patterndb-yaml --rules rules.yaml

# Process a file
patterndb-yaml --rules rules.yaml --input app.log

# Get statistics
patterndb-yaml --rules rules.yaml --input app.log --stats
```

### Python API

```python
from patterndb_yaml import PatterndbYaml
from pathlib import Path

# Initialize with rules
processor = PatterndbYaml(rules_path=Path("rules.yaml"))

# Process logs
with open("app.log") as infile, open("clean.log", "w") as outfile:
    processor.process(infile, outfile)

# Get statistics
stats = processor.get_stats()
print(f"Matched {stats['lines_matched']} of {stats['lines_processed']} lines")
print(f"Match rate: {stats['match_rate']:.1%}")
```

## Use Cases

- **Log Normalization** - Transform heterogeneous log formats into standardized output
- **Data Extraction** - Pull structured data from unstructured log lines
- **Log Filtering** - Identify and process specific log patterns
- **Format Standardization** - Convert legacy log formats to modern structured formats
- **Compliance** - Normalize logs for security analysis and auditing

## How It Works

`patterndb-yaml` uses syslog-ng's patterndb engine for efficient pattern matching:

1. **YAML → XML** - Converts your readable YAML rules into syslog-ng's XML patterndb format
2. **Pattern Matching** - Uses syslog-ng's C implementation for fast, memory-efficient matching
3. **Field Extraction** - Pulls named fields from matched patterns
4. **Output Transformation** - Applies output templates to normalize log format

The system processes logs line-by-line with constant memory usage, making it suitable for large files and streaming data.

## Documentation

**[Read the full documentation at patterndb-yaml.readthedocs.io](https://patterndb-yaml.readthedocs.io/)**

Key sections:
- **Getting Started** - Installation and quick start guide
- **Use Cases** - Real-world examples across different domains
- **Guides** - Pattern design, performance tips, common patterns
- **Reference** - Complete CLI and Python API documentation

## Development

```bash
# Clone repository
git clone https://github.com/JeffreyUrban/patterndb-yaml.git
cd patterndb-yaml

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=patterndb_yaml --cov-report=html

# Build documentation
cd docs && mkdocs build
```

## Performance

- **Time complexity:** O(n) where n is number of log lines
- **Space complexity:** O(1) constant memory for processing
- **Throughput:** Processes logs line-by-line with streaming support
- **Memory:** Minimal memory footprint, suitable for large files

Performance is determined by syslog-ng's patterndb engine, which uses efficient C implementations for pattern matching.

## License

MIT License - See [LICENSE](LICENSE) file for details

## Author

Jeffrey Urban

---

**[Star on GitHub](https://github.com/JeffreyUrban/patterndb-yaml)** | **[Report Issues](https://github.com/JeffreyUrban/patterndb-yaml/issues)** | **[Documentation](https://patterndb-yaml.readthedocs.io/)**
