Metadata-Version: 2.4
Name: logsentinelai
Version: 0.1.0
Summary: AI-Powered Log Analyzer - Leverages LLM to analyze log files and detect security events
Project-URL: Homepage, https://github.com/call518/LogSentinelAI
Project-URL: Repository, https://github.com/call518/LogSentinelAI.git
Project-URL: Issues, https://github.com/call518/LogSentinelAI/issues
Project-URL: Documentation, https://github.com/call518/LogSentinelAI#readme
Author-email: JungJungIn <call518@gmail.com>
License: MIT
License-File: LICENSE
Keywords: ai,cybersecurity,elasticsearch,llm,log-analysis,security,threat-detection
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Logging
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.11
Requires-Dist: aiohttp>=3.12.14
Requires-Dist: anyio>=4.9.0
Requires-Dist: distro>=1.9.0
Requires-Dist: elastic-transport>=8.17.1
Requires-Dist: elasticsearch>=9.0.2
Requires-Dist: geoip2>=5.1.0
Requires-Dist: httpx>=0.28.1
Requires-Dist: interegular>=0.3.3
Requires-Dist: iso3166>=2.1.1
Requires-Dist: jinja2>=3.1.6
Requires-Dist: jiter>=0.10.0
Requires-Dist: jsonpath-ng>=1.7.0
Requires-Dist: maxminddb>=2.7.0
Requires-Dist: numpy>=2.3.2
Requires-Dist: ollama>=0.5.1
Requires-Dist: openai>=1.97.1
Requires-Dist: outlines-core>=0.1.26
Requires-Dist: outlines>=1.1.1
Requires-Dist: paramiko>=3.5.1
Requires-Dist: pydantic-core>=2.33.2
Requires-Dist: pydantic>=2.11.7
Requires-Dist: python-dateutil>=2.9.0.post0
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: requests>=2.32.4
Requires-Dist: sniffio>=1.3.1
Requires-Dist: tqdm>=4.67.1
Requires-Dist: typing-extensions>=4.14.1
Provides-Extra: dev
Requires-Dist: black>=23.0; extra == 'dev'
Requires-Dist: flake8>=6.0; extra == 'dev'
Requires-Dist: isort>=5.12; extra == 'dev'
Requires-Dist: mypy>=1.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Description-Content-Type: text/markdown

# LogSentinelAI - AI-Powered Log Analyzer

LogSentinelAI is a modern Python package that leverages LLM (Large Language Model) to analyze various log files and detect security events. It automatically analyzes Apache HTTP logs, Linux system logs, and other log types to identify security threats and stores them as structured data in Elasticsearch for visualization and analysis.

## 🚀 Key Features

### 🧠 AI-Powered Log Analysis
- **Multi-provider LLM support**: OpenAI API, local Ollama, or GPU-accelerated vLLM
- **Comprehensive log types**: HTTP access, Apache error, Linux system, and TCPDump packet analysis
- **Intelligent threat detection**: SQL injection, XSS, brute force, system intrusions, and network anomalies
- **Structured validation**: Pydantic schemas ensure consistent, reliable analysis results

### 🔄 Flexible Processing Modes
- **Batch analysis**: Complete historical log file processing for forensics and compliance
- **Real-time monitoring**: Live log analysis with intelligent sampling for high-volume environments
- **Dual access methods**: Local file processing or secure SSH remote monitoring

### 🏗️ Enhanced Data Enrichment
- **GeoIP integration**: Automatic IP geolocation with MaxMind GeoLite2 database
- **Statistics calculation**: Complete IP counts, response codes, and security metrics
- **Multi-language output**: Analysis results in English or Korean

### 🏗️ Enterprise Integration
- **Elasticsearch/Kibana**: Automatic indexing, dashboards, and visualization
- **Docker deployment**: Consistent, scalable infrastructure with ILM policies
- **Unified CLI**: Simplified command-line interface with SSH remote access support

### 🏗️ GeoIP Enrichment
- **Automatic IP geolocation**: Enriches source IPs with country information using MaxMind GeoLite2 database
- **Intelligent IP handling**: Automatically detects and handles private IPs, invalid IPs, and lookup failures
- **Performance optimized**: Built-in LRU cache for repeated IP lookups with configurable cache size
- **Non-blocking processing**: GeoIP enrichment happens after LLM analysis, ensuring zero impact on analysis performance

### 🏗️ Enterprise-Ready Architecture
- **Elasticsearch integration** with automatic indexing, ILM policies, and data lifecycle management
- **Kibana dashboards** for visualization, alerting, and security analytics
- **Docker-based deployment** for consistent, scalable infrastructure





## System Architecture

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Log Sources   │───>│ LogSentinelAI   │───>│ Elasticsearch   │
│                 │    │   Analysis      │    │                 │
│ • Local Files   │    │                 │    │ • Security      │
│ • Remote SSH    │    │ • LLM Analysis  │    │   Events        │
│ • HTTP Access   │    │ • Outlines      │    │ • Raw Logs      │
│ • Apache Error  │    │ • Pydantic      │    │ • Metadata      │
│ • System Logs   │    │   Validation    │    │                 │
│ • TCPDump       │    │ • Multi-format  │    │                 │
│   (Auto-detect) │    │   Support       │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
                                              ┌─────────────────┐
                                              │     Kibana      │
                                              │   Dashboard     │
                                              │                 │
                                              │ • Visualization │
                                              │ • Alerts        │
                                              │ • Analytics     │
                                              └─────────────────┘
```

## � Dashboard Example

![Kibana Dashboard](img/ex-dashboard.png)

## 📋 JSON Output Example

![JSON Output](img/ex-json.png)

## �🚀 Quick Start: Installation & Setup

### 📦 Package Installation (Recommended)

#### Option 1: Install from PyPI (Coming Soon)
```bash
# When published to PyPI
pip install logsentinelai
```

#### Option 2: Install with uv (Modern Python Package Manager)
```bash
# Install uv first (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install the package
git clone https://github.com/call518/LogSentinelAI.git
cd LogSentinelAI
uv pip install --editable .

# Alternative: Install from source
uv pip install git+https://github.com/call518/LogSentinelAI.git
```

#### Option 3: Traditional pip Installation
```bash
# Clone repository and install
git clone https://github.com/call518/LogSentinelAI.git
cd LogSentinelAI
pip install --editable .
```

### ⚙️ Configuration Setup

```bash
# 1. Setup configuration file
cp config.template config
# Edit config file and set OPENAI_API_KEY value (obtained from OpenAI account)

# 2. GeoIP Database Setup (Optional but Recommended)
logsentinelai-geoip-download
# This will download GeoLite2-Country.mmdb to current directory

# 3. Enable GeoIP in config file:
#   GEOIP_ENABLED=true  (default)
#   GEOIP_DATABASE_PATH=./GeoLite2-Country.mmdb  (default)
```

### 🚀 Quick Usage Examples

```bash
# View available commands
logsentinelai --help

# HTTP Access Log Analysis
logsentinelai-httpd-access --log-path /var/log/apache2/access.log

# Real-time monitoring  
logsentinelai-linux-system --mode realtime

# Remote SSH analysis
logsentinelai-tcpdump --remote --ssh admin@server.com --ssh-key ~/.ssh/id_rsa

# Download GeoIP database
logsentinelai-geoip-download
```

### 🚀 Elasticsearch & Kibana Setup (Optional)

For advanced visualization and analytics, you can set up Elasticsearch and Kibana:

> [!IMPORTANT]
> [Platinum features](https://www.elastic.co/subscriptions) are enabled by default for a [trial](https://www.elastic.co/docs/deploy-manage/license/manage-your-license-in-self-managed-cluster) duration of 30 days. After this evaluation period, you will retain access to all the free features included in the Open Basic license seamlessly, without manual intervention required, and without losing any data. Refer to the [How to disable paid features](https://github.com/deviantony/docker-elk#how-to-disable-paid-features) section to opt out of this behaviour.

```bash
# 1. Clone ELK stack repository and navigate to directory
# (Origin Repo) https://github.com/deviantony/docker-elk
git clone https://github.com/call518/Docker-ELK.git
cd Docker-ELK

# 2. Initialize and run ELK stack
# One-time initialization
docker compose up setup
# Generate Kibana encryption keys (recommended)
docker compose up kibana-genkeys
# Copy generated keys to kibana/config/kibana.yml
# Start ELK stack
docker compose up -d

# 3. Access Kibana: http://localhost:5601
# Default credentials: elastic / changeme
```

### 📊 Elasticsearch Index/Policy Setup

If using Elasticsearch integration:

```bash
# 1. Create ILM policy (7-day retention, 10GB/1-day rollover)
curl -X PUT "localhost:9200/_ilm/policy/logsentinelai-analysis-policy" \
-H "Content-Type: application/json" \
-u elastic:changeme \
-d '{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "10gb",
            "max_age": "1d"
          }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}'

# 2. Create index template
curl -X PUT "localhost:9200/_index_template/logsentinelai-analysis-template" \
-H "Content-Type: application/json" \
-u elastic:changeme \
-d '{
  "index_patterns": ["logsentinelai-analysis-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "index.lifecycle.name": "logsentinelai-analysis-policy",
      "index.lifecycle.rollover_alias": "logsentinelai-analysis"
    },
    "mappings": {
      "properties": {
        "@log_raw_data": {
          "type": "object",
          "dynamic": false
        }
      }
    }
  }
}'

# 3. Create initial index and write alias
curl -X PUT "localhost:9200/logsentinelai-analysis-000001" \
-H "Content-Type: application/json" \
-u elastic:changeme \
-d '{
  "aliases": {
    "logsentinelai-analysis": {
      "is_write_index": true
    }
  }
}'
```

### 📈 Advanced Usage Examples

#### Universal Command-Line Interface
All analysis commands support the same simplified command-line arguments:

```bash
# View available options for any command
logsentinelai-httpd-access --help
logsentinelai-linux-system --help
logsentinelai-tcpdump --help

# Core options: --mode, --chunk-size, --log-path, --remote, --ssh, --ssh-key
```

#### Local File Analysis (Default Mode)
```bash
# Batch analysis with default config settings
logsentinelai-linux-system

# Override log path and chunk size
logsentinelai-linux-system --log-path /var/log/messages --chunk-size 15

# Real-time monitoring
logsentinelai-linux-system --mode realtime
logsentinelai-httpd-access --mode realtime --processing-mode sampling
```

#### SSH Remote Access (Simplified Syntax)
```bash
# SSH key authentication (recommended)
logsentinelai-linux-system \
  --remote \
  --ssh admin@192.168.1.100 \
  --ssh-key ~/.ssh/id_rsa \
  --log-path /var/log/messages

# SSH with custom port
logsentinelai-httpd-access \
  --remote \
  --ssh webuser@web.company.com:8022 \
  --ssh-key ~/.ssh/web_key \
  --log-path /var/log/apache2/access.log
```

#### Multi-Server Monitoring
```bash
# Terminal 1: Web server logs
logsentinelai-httpd-access --remote --ssh web@web1.com --ssh-key ~/.ssh/web1 --log-path /var/log/apache2/access.log

# Terminal 2: Database server logs  
logsentinelai-linux-system --remote --ssh db@db1.com --ssh-key ~/.ssh/db1 --log-path /var/log/messages
```

**CLI Options Override Config Settings:**
- `--chunk-size`: Overrides `CHUNK_SIZE_*` settings in config file
- `--log-path`: Overrides `LOG_PATH_*` settings in config file  
- `--processing-mode`: Overrides `REALTIME_PROCESSING_MODE` setting
- `--sampling-threshold`: Overrides `REALTIME_SAMPLING_THRESHOLD` setting

### 📊 Import Kibana Dashboard/Settings

If using Kibana visualization:

```bash
# 1. Access Kibana: http://localhost:5601
# 2. Login: elastic / changeme
# 3. Stack Management > Saved Objects > Import
#    - Kibana-9.0.3-Advanced-Settings.ndjson (first)
#    - Kibana-9.0.3-Dashboard-LogSentinelAI.ndjson (second)
# 4. Check results at Analytics > Dashboard > LogSentinelAI Dashboard
```

---
## 🔄 Change LLM Provider/Advanced Options (Optional)

To change from OpenAI API to Ollama (local), vLLM (local/GPU), etc., please refer to the guide below.

### LLM Provider & Model Configuration (`config` file modification)

LogSentinelAI centrally manages LLM Provider and model in the `config` file.

#### OpenAI API Configuration (Default)
```bash
# Configure in config file
LLM_PROVIDER=openai
LLM_MODEL_OPENAI=gpt-4o-mini

# API key configuration required
OPENAI_API_KEY=your_openai_api_key_here
```

#### Ollama (Local LLM) Configuration
```bash
# 1. Install Ollama and download model
ollama pull qwen2.5-coder:3b
ollama serve
```

```bash
# Change configuration in config file
LLM_PROVIDER=ollama
LLM_MODEL_OLLAMA=qwen2.5-coder:3b
```

#### vLLM (Local GPU) Configuration
```bash
# Option A: Clone and use vLLM-Tutorial (recommended)
git clone https://github.com/call518/vLLM-Tutorial.git
cd vLLM-Tutorial

# Install Hugging Face CLI for model download
pip install huggingface_hub

# Download model (Default)
huggingface-cli download lmstudio-community/Qwen2.5-3B-Instruct-GGUF Qwen2.5-3B-Instruct-Q4_K_M.gguf --local-dir ./models/Qwen2.5-3B-Instruct/
huggingface-cli download Qwen/Qwen2.5-3B-Instruct generation_config.json --local-dir ./config/Qwen2.5-3B-Instruct

# Download model (Optional)
huggingface-cli download lmstudio-community/Qwen2.5-1.5B-Instruct-GGUF Qwen2.5-1.5B-Instruct-Q4_K_M.gguf --local-dir ./models/Qwen2.5-1.5B-Instruct/
huggingface-cli download Qwen/Qwen2.5-1.5B-Instruct generation_config.json --local-dir ./config/Qwen2.5-1.5B-Instruct

# It is recommended to set the temperature to 0.1 and top_p to 0.5.
cat config/Qwen2.5-3B-Instruct/generation_config.json
{
  "bos_token_id": 151643,
  "pad_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "repetition_penalty": 1.05,
  "temperature": 0.1,
  "top_p": 0.5,
  "top_k": 20,
  "transformers_version": "4.37.0"
}

# Run vLLM with Docker
./run-docker-vllm---Qwen2.5-3B-Instruct.sh

# Verify API is working
curl -s -X GET http://localhost:5000/v1/models | jq

# Option B: Simple vLLM setup (without Docker)
pip install vllm
python -m vllm.entrypoints.openai.api_server --model qwen2.5-coder:3b
```

```bash
# Change configuration in config file
LLM_PROVIDER=vllm
LLM_MODEL_VLLM=Qwen/Qwen2.5-1.5B-Instruct
```

### Additional Configuration Options (`config` file)

#### Response Language Configuration
```bash
# Configure analysis result language
RESPONSE_LANGUAGE=english   # English
# RESPONSE_LANGUAGE=korean  # Korean (default)
```

#### Analysis Mode Configuration
```bash
# Configure analysis mode
ANALYSIS_MODE=batch         # Batch mode - analyze complete files (default)
# ANALYSIS_MODE=realtime    # Real-time mode - monitor live logs
```

#### Log File Path and Chunk Size Configuration
```bash
# Batch mode log file paths
LOG_PATH_HTTPD_ACCESS=sample-logs/access-10k.log      # 10k entries (default)
LOG_PATH_HTTPD_APACHE_ERROR=sample-logs/apache-10k.log
LOG_PATH_LINUX_SYSTEM=sample-logs/linux-2k.log
LOG_PATH_TCPDUMP_PACKET=sample-logs/tcpdump-packet-2k.log

# Real-time mode log file paths (live logs)
LOG_PATH_REALTIME_HTTPD_ACCESS=/var/log/apache2/access.log
LOG_PATH_REALTIME_HTTPD_APACHE_ERROR=/var/log/apache2/error.log
LOG_PATH_REALTIME_LINUX_SYSTEM=/var/log/messages
LOG_PATH_REALTIME_TCPDUMP_PACKET=/var/log/tcpdump.log

# Configure chunk sizes (number of log entries to process at once)
CHUNK_SIZE_HTTPD_ACCESS=10        # HTTP access logs
CHUNK_SIZE_HTTPD_APACHE_ERROR=10  # Apache error logs
CHUNK_SIZE_LINUX_SYSTEM=10       # Linux system logs
CHUNK_SIZE_TCPDUMP_PACKET=5       # Network packets (smaller chunks recommended)
```

#### Real-time Monitoring Configuration
```bash
# Polling interval for checking new log entries (seconds)
REALTIME_POLLING_INTERVAL=5

# Maximum number of new lines to process at once
REALTIME_MAX_LINES_PER_BATCH=50

# Position file directory for tracking file read positions
REALTIME_POSITION_FILE_DIR=.positions

# Buffer time to wait for complete log lines (seconds)
REALTIME_BUFFER_TIME=2

# Processing mode for real-time monitoring
REALTIME_PROCESSING_MODE=full     # full, sampling, or auto-sampling

# Sampling threshold for auto-sampling mode (number of lines)
REALTIME_SAMPLING_THRESHOLD=100   # When exceeded, triggers sampling in 'full' mode
```

### Verify Configuration Changes
```bash
# Run analysis after configuration changes to verify operation
logsentinelai-httpd-access
```

---
## 🌍 GeoIP Enrichment

LogSentinelAI automatically enriches IP addresses in analysis results with country information using MaxMind GeoLite2 database.

### 🚀 Quick Setup

```bash
# 1. Download GeoIP database
python download_geoip_database.py

# 2. Enable in configuration
# In config file:
GEOIP_ENABLED=true
GEOIP_DATABASE_PATH=./GeoLite2-Country.mmdb

# 3. Test functionality
python test_geoip.py
```

### 📊 Feature Overview

- **Country identification**: Automatically appends country information to IP addresses
- **Text-based format**: Uses format like "192.168.1.1 (US - United States)" for Elasticsearch compatibility
- **Private IP handling**: Marks internal IPs as "(Private)" without database lookup
- **Statistics enrichment**: Enhances IP counts and frequency data with geographic context

### ⚙️ Configuration Options

```bash
# Enable/disable GeoIP enrichment
GEOIP_ENABLED=true

# Path to MaxMind database file
GEOIP_DATABASE_PATH=./GeoLite2-Country.mmdb

# Fallback country for unknown IPs
GEOIP_FALLBACK_COUNTRY=Unknown

# Include private IPs in GeoIP processing
GEOIP_INCLUDE_PRIVATE_IPS=false

# Cache size for IP lookups (performance optimization)
GEOIP_CACHE_SIZE=1000
```

### 🔧 Manual Database Download

If automatic download fails, manually download from MaxMind:

1. Visit: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data
2. Download "GeoLite2 Country" in MaxMind DB format (.mmdb)
3. Extract and place as `GeoLite2-Country.mmdb` in project directory
4. Update `GEOIP_DATABASE_PATH` in config if using different location

### 📈 Performance Impact

- **Zero impact on LLM processing**: GeoIP enrichment happens after LLM analysis
- **Cached lookups**: Repeated IP addresses are cached for better performance
- **Graceful degradation**: Analysis continues normally if GeoIP is unavailable
- **Private IP optimization**: Private IPs are handled without database queries

---
## 🔧 Configuration Options

### Change LLM Provider

You can change the LLM provider in each analysis script:

```bash
# In config file
LLM_PROVIDER=vllm  # Choose from "ollama", "vllm", "openai"
LLM_MODEL_VLLM=Qwen/Qwen2.5-1.5B-Instruct
```

Available providers:
- **Ollama**: Local model execution with models like `qwen2.5-coder:3b`
- **vLLM**: GPU-accelerated local inference with OpenAI-compatible API
- **OpenAI**: Cloud-based API using models like `gpt-4o-mini`

### Position Tracking for Real-time Monitoring

Real-time monitoring uses position files to track reading progress:

```bash
# Position files are stored in .positions/ directory
.positions/
├── linux_system_position.txt    # Tracks position for Linux system logs
├── httpd_access_position.txt     # Tracks position for HTTP access logs
└── ...
```

- Position files are automatically created and maintained
- Delete position files to restart monitoring from beginning
- Position files prevent duplicate processing during restarts

### Log File Rotation Handling

Real-time monitoring handles log rotation gracefully:

1. **Detection**: Monitors file size and inode changes
2. **Reset**: Automatically resets to beginning of new log file
3. **Continuation**: Seamless processing without data loss
4. **Position Update**: Updates position tracking for new file

### Adjust Chunk Size

You can adjust chunk size for log processing performance:

```bash
# Method 1: Edit config file (persistent setting)
# Edit CHUNK_SIZE_* values in config file
CHUNK_SIZE_HTTPD_ACCESS=20
CHUNK_SIZE_LINUX_SYSTEM=15

# Method 2: Use CLI override (temporary setting)
python analysis-linux-system-log.py --chunk-size 5
python analysis-httpd-access-log.py --chunk-size 25
```

**Recommended values**: 5-50 depending on log complexity and LLM capacity

## 📊 Output Data Schema

### Elasticsearch Document Structure

```json
{
  "@chunk_analysis_start_utc": "2025-07-25T10:00:00Z",
  "@chunk_analysis_end_utc": "2025-07-25T10:00:05Z", 
  "@processing_result": "success",
  "@processing_mode": "realtime",
  "@access_mode": "ssh",
  "@sampling_threshold": 100,
  "@log_count": 15,
  "@timestamp": "2025-07-25T10:00:05.123Z",
  "@log_type": "httpd_access",
  "@document_id": "httpd_access_20250725_100005_123456_chunk_1",
  "@llm_provider": "vllm",
  "@llm_model": "Qwen/Qwen2.5-1.5B-Instruct",
  "@log_path": "/var/log/apache2/access.log",
  "@log_raw_data": {
    "LOGID-7DD17B008706AC22C60AD6DF9AC5E2E9": "203.0.113.45 - - [25/Jul/2025:10:00:01 +0000] \"GET /api/users?id=1' OR '1'='1 HTTP/1.1\" 403 2847",
    "LOGID-F3B6E3F03EC9E5BC1F65624EB65C6C51": "198.51.100.23 - - [25/Jul/2025:10:00:02 +0000] \"POST /api/login HTTP/1.1\" 200 1205"
  },
  "summary": "Analysis detected SQL injection attempts and suspicious authentication patterns from multiple international sources. Immediate review recommended.",
  "events": [
    {
      "event_type": "SQL_INJECTION",
      "severity": "HIGH", 
      "description": "SQL injection attack attempt detected in GET parameter from US-based IP",
      "confidence_score": 0.92,
      "source_ips": ["203.0.113.45 (US - United States)"],
      "url_pattern": "/api/users",
      "http_method": "GET",
      "response_codes": ["403"],
      "attack_patterns": ["SQL_INJECTION", "PARAMETER_MANIPULATION"],
      "recommended_actions": ["Block IP immediately", "Add WAF rule", "Review user account security"],
      "requires_human_review": true,
      "related_log_ids": ["LOGID-7DD17B008706AC22C60AD6DF9AC5E2E9"]
    },
    {
      "event_type": "SUSPICIOUS_LOGIN",
      "severity": "MEDIUM",
      "description": "Multiple authentication attempts from France-based IP within short timeframe",
      "confidence_score": 0.75,
      "source_ips": ["198.51.100.23 (FR - France)"],
      "url_pattern": "/api/login",
      "http_method": "POST", 
      "response_codes": ["200"],
      "attack_patterns": ["BRUTE_FORCE", "CREDENTIAL_STUFFING"],
      "recommended_actions": ["Monitor IP", "Enable 2FA", "Check user account activity"],
      "requires_human_review": false,
      "related_log_ids": ["LOGID-F3B6E3F03EC9E5BC1F65624EB65C6C51"]
    }
  ],
  "statistics": {
    "total_requests": 15,
    "unique_ips": 8,
    "error_rate": 0.13,
    "top_source_ips": {
      "203.0.113.45 (US - United States)": 3,
      "198.51.100.23 (FR - France)": 2,
      "192.168.1.100 (Private)": 5,
      "10.0.0.50 (Private)": 3,
      "172.16.0.25 (Private)": 2
    },
    "response_code_dist": {
      "200": 11,
      "403": 2,
      "404": 1,
      "500": 1
    },
    "event_by_type": {
      "SQL_INJECTION": 1,
      "SUSPICIOUS_LOGIN": 1,
      "INFO": 2
    }
  },
  "highest_severity": "HIGH",
  "requires_immediate_attention": true
}
```

---

## Roadmap

LogSentinelAI is continuously evolving to provide more comprehensive security analysis capabilities. Here are our planned enhancements:

### 🔮 Phase 1: Enhanced Intelligence & Automation

#### 🤖 Automated Response Action Chain
Implement intelligent automated response capabilities that trigger immediate security actions based on analysis results:

**Threat Detection & Response Flow:**
```python
# Critical security event detected → Automated response chain
async def execute_security_response(critical_event):
    # 1. Immediate IP blocking via firewall
    await firewall_block_ip(
        ip=event.source_ip, 
        duration="1h", 
        reason="LogSentinelAI: Critical threat detected"
    )
    
    # 2. Real-time team notification
    await send_alert(
        channels=["#security", "#ops"],
        severity="CRITICAL",
        event_summary=event.description,
        recommended_actions=event.recommended_actions
    )
    
    # 3. SOAR platform integration
    await trigger_playbook(
        playbook="incident_response",
        event_data=event.to_dict(),
        auto_escalate=True
    )
    
    # 4. Forensic data collection
    await collect_additional_logs(
        source_ip=event.source_ip,
        timerange="2h",
        log_types=["system", "network", "application"]
    )
```

**Planned Integrations:**
- **Firewall Management**: Automatic IP blocking/unblocking via pfSense, iptables, cloud firewalls
- **SOAR Platforms**: Phantom, Demisto, TheHive integration for automated playbook execution
- **Communication**: Slack, Teams, email, SMS alerts with severity-based routing
- **Threat Intelligence**: Real-time IOC feeds, IP reputation services, CVE database correlation
- **Log Collection**: Automated forensic log gathering from multiple sources during incidents

**Benefits:**
- ⚡ **Instant Response**: Sub-second reaction time to critical threats
- 🎯 **Precision Blocking**: Context-aware IP blocking with automatic expiration
- 📊 **Forensic Readiness**: Automatic evidence collection for incident investigation
- 🔄 **Workflow Integration**: Seamless integration with existing security operations workflows
- 📈 **Response Analytics**: Detailed metrics on response effectiveness and timing

### 🔮 Phase 2: Advanced Correlation & Intelligence

#### 🧠 Multi-Source Log Correlation
- Cross-platform log analysis combining multiple security tools
- Timeline correlation across different log sources
- Advanced pattern recognition using historical data

#### 🌐 Threat Intelligence Integration
- Real-time IOC (Indicators of Compromise) matching
- IP reputation and geolocation enrichment
- CVE database correlation for vulnerability context

#### 📊 Predictive Security Analytics
- Machine learning models for anomaly detection
- Behavioral baseline establishment and deviation alerts
- Proactive threat hunting capabilities

### 🔮 Phase 3: Enterprise-Scale Deployment

#### ☁️ Cloud-Native Architecture
- Kubernetes deployment with auto-scaling
- Multi-region distributed processing
- High-availability Elasticsearch clusters

#### 🔌 Enterprise Integrations
- SIEM platform connectors (Splunk, QRadar, ArcSight)
- Identity provider integration (Active Directory, LDAP, SAML)
- Compliance reporting for SOC 2, PCI DSS, GDPR

#### 🎛️ Advanced Management Console
- Web-based administration interface
- Role-based access control and audit logging
- Real-time monitoring and performance dashboards

---

## �🙏 Acknowledgments

We would like to express our sincere gratitude to the following projects and communities that provided inspiration, guidance, and foundational technologies for LogSentinelAI:

### 🔧 Core Technologies & Frameworks
- **[Outlines](https://dottxt-ai.github.io/outlines/latest/)** - Structured LLM output generation framework that powers our reliable AI analysis
- **[dottxt-ai Demos](https://github.com/dottxt-ai/demos/tree/main/logs)** - Excellent log analysis examples and implementation patterns
- **[Docker ELK Stack](https://github.com/deviantony/docker-elk)** - Comprehensive Elasticsearch, Logstash, and Kibana Docker setup

### 🤖 LLM Infrastructure & Deployment
- **[vLLM](https://github.com/vllm-project/vllm)** - High-performance LLM inference engine for GPU-accelerated local deployment
- **[Ollama](https://ollama.com/)** - Simplified local LLM deployment and management platform

### 🌟 Open Source Community
We are deeply grateful to the broader open source community and the countless projects that have contributed to making AI-powered log analysis accessible and practical. This project stands on the shoulders of many innovative open source initiatives that continue to push the boundaries of what's possible.