Metadata-Version: 2.4
Name: dinoscan
Version: 2.0.2
Summary: Comprehensive AST-based Python code analysis toolkit
License: MIT
Keywords: static-analysis,code-quality,security,ast,linting
Author: DinoScan Development Team
Author-email: dev@dinoair.com
Requires-Python: >=3.9,<4.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Security
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Requires-Dist: colorama (>=0.4.6,<0.5.0) ; sys_platform == "win32"
Requires-Dist: pyyaml (>=6.0,<7.0)
Project-URL: Documentation, https://github.com/DinoPitStudiosllc/DinoAir/blob/main/DinoScan/README.md
Project-URL: Homepage, https://github.com/DinoPitStudiosllc/DinoAir
Project-URL: Repository, https://github.com/DinoPitStudiosllc/DinoAir
Description-Content-Type: text/markdown

# DinoScan

DinoScan is a comprehensive Python code analysis toolkit that leverages AST (Abstract Syntax Tree) parsing for precise, semantic analysis of your codebase. Built with modern Python 3.9+ features, it provides enhanced accuracy over traditional regex-based tools.

## 🚀 Features

### **AST-Based Analyzers**
- **Security Analysis**: Detect vulnerabilities using AST parsing and entropy analysis for secrets
- **Circular Import Detection**: Find dependency cycles using Tarjan's strongly connected components algorithm
- **Dead Code Detection**: Identify unused code with cross-file usage analysis and framework-aware patterns
- **Documentation Quality**: Validate docstrings with multi-style support (Google, Sphinx, NumPy)
- **Duplicate Code Detection**: Find duplicates using winnowing algorithms and structural similarity

### **Modern Python Architecture**
- **AST-based parsing** for semantic accuracy vs. simple text matching
- **Modern Python 3.9+** with union types, dataclasses, and pathlib
- **Comprehensive configuration** with JSON/YAML support and hierarchical configs
- **Multi-format output** supporting Console, JSON, XML, and SARIF formats
- **Performance optimizations** including AST caching and efficient algorithms
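
To make the difference between AST-based parsing and text matching concrete, here is a minimal sketch using only the standard library. It is not DinoScan's implementation; `SOURCE`, `regex_hits`, and `ast_hits` are illustrative names.

```python
import ast
import re

# Sample input: "eval(" appears in a comment, in a string, and in one real call.
SOURCE = '''
# eval(user_input) would be dangerous here
message = "never call eval(data) directly"
result = eval(expression)
'''


def regex_hits(source: str) -> list[int]:
    """Line numbers where the text 'eval(' appears, including comments and strings."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if re.search(r"\beval\(", line)
    ]


def ast_hits(source: str) -> list[int]:
    """Line numbers of actual calls to the name eval."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "eval"
    ]


print(regex_hits(SOURCE))  # [2, 3, 4]
print(ast_hits(SOURCE))    # [4]
```

The text search reports the comment and the string literal as well as the call, while the AST walk reports only the call on line 4.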

## 📦 Installation

### Requirements
- Python 3.9 or later
- PyYAML (for YAML configuration support)

### Setup
```bash
# Clone the repository
git clone <repository-url>
cd DinoScan

# Install dependencies (optional, for YAML support)
pip install pyyaml

# The analyzers are ready to use immediately
```

## 🔧 Quick Start

### Individual Analyzers

```bash
# Security analysis
python advanced_security_analyzer.py /path/to/project

# Circular import detection
python circular_import_analyzer.py /path/to/project

# Dead code detection
python dead_code_analyzer.py /path/to/project

# Documentation quality analysis
python doc_quality_analyzer.py /path/to/project

# Duplicate code detection
python duplicate_code_analyzer.py /path/to/project
```

### With Configuration

```bash
# Use custom configuration
python advanced_security_analyzer.py /path/to/project --config config.json

# Output to file in JSON format
python dead_code_analyzer.py /path/to/project --output-format json --output-file results.json

# Verbose mode with specific settings
python doc_quality_analyzer.py /path/to/project --style google --enforce-style --verbose
```

## ⚙️ Configuration

### Unified Configuration File
DinoScan uses a single `config.json` file for all analyzers:

```json
{
  "analyzers": {
    "security": {
      "enabled": true,
      "entropy_threshold": 4.5,
      "min_secret_length": 8,
      "severity_levels": {
        "high": ["sql_injection", "code_injection", "path_traversal"],
        "medium": ["hardcoded_secrets", "weak_crypto"]
      }
    },
    "dead_code": {
      "enabled": true,
      "min_lines": 6,
      "include_tests": false,
      "exclude_public_api": true,
      "framework_patterns": {
        "django": ["admin", "models", "views", "urls"],
        "flask": ["app", "route", "blueprint"],
        "fastapi": ["router", "endpoint", "dependency"]
      }
    },
    "documentation": {
      "enabled": true,
      "preferred_style": "google",
      "require_parameter_docs": true,
      "require_return_docs": true,
      "min_docstring_length": 10
    },
    "duplicate_code": {
      "enabled": true,
      "min_lines": 6,
      "similarity_threshold": 0.8,
      "detect_structural": true,
      "winnow_k": 17
    }
  },
  "global": {
    "output_format": "console",
    "use_colors": true,
    "exclude_dirs": ["__pycache__", ".git", ".pytest_cache", "node_modules"]
  }
}
```
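
One way a project-level `config.json` could be layered over built-in defaults is a recursive dictionary merge. The sketch below is an assumption about how hierarchical configuration might work, not the actual `config_manager.py` logic; `DEFAULTS`, `deep_merge`, and `load_config` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical built-in defaults, reusing a few keys from the example above.
DEFAULTS: dict[str, Any] = {
    "analyzers": {
        "security": {"enabled": True, "entropy_threshold": 4.5, "min_secret_length": 8},
    },
    "global": {"output_format": "console", "use_colors": True},
}


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where values in override win, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: Path) -> dict[str, Any]:
    """Load a JSON config file and layer it over DEFAULTS; a missing file yields the defaults."""
    if not path.exists():
        return deep_merge(DEFAULTS, {})
    return deep_merge(DEFAULTS, json.loads(path.read_text(encoding="utf-8")))


project = {"analyzers": {"security": {"entropy_threshold": 5.0}}}
print(deep_merge(DEFAULTS, project)["analyzers"]["security"])
```

With this approach a project file only needs to list the settings it changes; every other key keeps its default value.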

## 📊 Analyzers Overview

### 1. Advanced Security Analyzer (`advanced_security_analyzer.py`)
**AST-based security vulnerability detection with entropy analysis**

```bash
# Basic security scan
python advanced_security_analyzer.py /path/to/project

# High-severity vulnerabilities only
python advanced_security_analyzer.py /path/to/project --min-severity high

# Custom patterns and entropy threshold
python advanced_security_analyzer.py /path/to/project --config security-config.json --verbose
```

**Key Features:**
- Entropy analysis for secret detection (API keys, tokens)
- AST-based SQL injection and code injection detection
- Cryptographic vulnerability analysis
- Framework-aware security patterns
- Configurable severity levels and custom patterns
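
Entropy-based secret detection can be illustrated with Shannon entropy over the characters of a string literal. This is a minimal sketch, not the analyzer's code. The parameter names mirror the `entropy_threshold` and `min_secret_length` settings from the configuration example, and `looks_like_secret` is a hypothetical helper.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(text).values()
    )


def looks_like_secret(
    value: str, entropy_threshold: float = 4.5, min_secret_length: int = 8
) -> bool:
    """Flag strings that are long enough and random-looking enough."""
    return len(value) >= min_secret_length and shannon_entropy(value) >= entropy_threshold


print(shannon_entropy("aaaaaaaa"))     # repeated character, no uncertainty
print(looks_like_secret("password"))   # low entropy, not flagged
print(looks_like_secret("abcdefghijklmnopqrstuvwxyz012345"))  # 32 distinct characters
```

Because the entropy of a string can never exceed log2 of its length, a threshold of 4.5 bits per character can only be reached by strings of at least 23 characters under this per-character measure. A real analyzer would likely combine the score with other signals such as variable names or known key prefixes.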

### 2. Circular Import Analyzer (`circular_import_analyzer.py`)
**Graph-based circular dependency detection using Tarjan's algorithm**

```bash
# Detect circular imports
python circular_import_analyzer.py /path/to/project

# Include test files and detailed output
python circular_import_analyzer.py /path/to/project --include-tests --verbose

# Export to SARIF for CI integration
python circular_import_analyzer.py /path/to/project --output-format sarif --output-file imports.sarif
```

**Key Features:**
- Tarjan's strongly connected components algorithm
- AST-based import parsing (handles complex import patterns)
- Cross-file dependency analysis
- Support for relative and absolute imports
- Framework-aware exclusions
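
The core of Tarjan's algorithm on an import graph can be sketched as follows. The graph maps each module name to the modules it imports. The function names and the sample module names are illustrative and are not taken from `circular_import_analyzer.py`.

```python
def strongly_connected_components(graph: dict[str, list[str]]) -> list[list[str]]:
    """Tarjan's algorithm: group nodes that can all reach each other."""
    index: dict[str, int] = {}
    lowlink: dict[str, int] = {}
    stack: list[str] = []
    on_stack: set[str] = set()
    components: list[list[str]] = []

    def visit(node: str) -> None:
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, []):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


def import_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Keep only components that form a cycle (several modules, or a self-import)."""
    return [
        sorted(component)
        for component in strongly_connected_components(graph)
        if len(component) > 1 or component[0] in graph.get(component[0], [])
    ]


imports = {
    "app.models": ["app.utils"],
    "app.utils": ["app.services"],
    "app.services": ["app.models"],
    "app.cli": ["app.models"],
}
print(import_cycles(imports))  # [['app.models', 'app.services', 'app.utils']]
```

This recursive form is the easiest to read, but very deep import chains could hit Python's recursion limit, so a production version would probably use an explicit stack.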

### 3. Dead Code Analyzer (`dead_code_analyzer.py`)
**Project-wide unused code detection with cross-reference analysis**

```bash
# Find unused code
python dead_code_analyzer.py /path/to/project

# Aggressive mode with private methods
python dead_code_analyzer.py /path/to/project --aggressive --include-tests

# Exclude public API functions
python dead_code_analyzer.py /path/to/project --exclude-public-api
```

**Key Features:**
- Cross-file usage tracking for accurate detection
- Framework-aware patterns (Django, Flask, FastAPI, etc.)
- Public API preservation with `__all__` support
- Symbol type analysis (functions, classes, variables, imports)
- Entry point detection and smart exclusions
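
Cross-file usage tracking can be approximated by collecting every definition and every referenced name across all files, then reporting definitions that are never referenced. The sketch below is a deliberately simplified assumption of that idea. It ignores scoping, `__all__`, and framework patterns, and `unused_definitions` is a hypothetical name.

```python
import ast


def unused_definitions(sources: dict[str, str]) -> list[tuple[str, str]]:
    """Return (filename, name) pairs for functions and classes never referenced in any file."""
    defined: list[tuple[str, str]] = []
    used: set[str] = set()
    for filename, source in sources.items():
        for node in ast.walk(ast.parse(source, filename=filename)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                defined.append((filename, node.name))
            elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                used.add(node.id)    # plain references such as slugify(...)
            elif isinstance(node, ast.Attribute):
                used.add(node.attr)  # attribute references such as helpers.slugify
    return sorted(entry for entry in defined if entry[1] not in used)


project = {
    "helpers.py": "def slugify(text):\n    return text.lower()\n\n"
                  "def legacy_slug(text):\n    return text\n",
    "main.py": "from helpers import slugify\n\nprint(slugify('Hello'))\n",
}
print(unused_definitions(project))  # [('helpers.py', 'legacy_slug')]
```

Matching on bare names produces false negatives when two unrelated symbols share a name, which is one reason a real analyzer needs scope and import resolution on top of this.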

### 4. Documentation Quality Analyzer (`doc_quality_analyzer.py`)
**Multi-style docstring validation with type consistency checking**

```bash
# Check documentation quality
python doc_quality_analyzer.py /path/to/project

# Enforce Google-style docstrings
python doc_quality_analyzer.py /path/to/project --style google --enforce-style

# Validate examples and type consistency
python doc_quality_analyzer.py /path/to/project --no-private --verbose
```

**Key Features:**
- Multi-style support (Google, Sphinx, NumPy, plain)
- Parameter documentation completeness checking
- Type annotation consistency validation
- Code example syntax validation
- Configurable requirements for different symbol types
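
Parameter documentation completeness can be checked by comparing a function's signature with the `Args:` section of its Google-style docstring. This is a rough sketch under that single-style assumption; `documented_args` and `undocumented_parameters` are illustrative names, not the analyzer's API.

```python
import ast
import re

SAMPLE = '''
def scale(value, factor, clamp=False):
    """Scale a value.

    Args:
        value: Number to scale.
        factor (float): Multiplier.

    Returns:
        The scaled number.
    """
    return value * factor
'''


def documented_args(docstring: str) -> set[str]:
    """Collect parameter names listed under a Google-style 'Args:' section."""
    names: set[str] = set()
    in_args = False
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped in ("Args:", "Arguments:"):
            in_args = True
        elif in_args and not stripped:
            in_args = False  # a blank line ends the section
        elif in_args:
            match = re.match(r"(\w+)\s*(?:\([^)]*\))?:", stripped)
            if match:
                names.add(match.group(1))
    return names


def undocumented_parameters(source: str) -> dict[str, list[str]]:
    """Map each function name to the parameters missing from its docstring."""
    missing: dict[str, list[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            documented = documented_args(ast.get_docstring(node) or "")
            params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
            gaps = [p.arg for p in params if p.arg not in ("self", "cls") and p.arg not in documented]
            if gaps:
                missing[node.name] = gaps
    return missing


print(undocumented_parameters(SAMPLE))  # {'scale': ['clamp']}
```

Sphinx and NumPy styles would need their own section parsers. The signature side of the comparison stays the same.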

### 5. Duplicate Code Analyzer (`duplicate_code_analyzer.py`)
**Winnowing-based similarity detection with structural analysis**

```bash
# Find duplicate code
python duplicate_code_analyzer.py /path/to/project

# Adjust sensitivity and detection types
python duplicate_code_analyzer.py /path/to/project --threshold 0.9 --no-structural

# Minimum size and enable partial matching
python duplicate_code_analyzer.py /path/to/project --min-lines 10 --enable-partial
```

**Key Features:**
- Winnowing algorithm for efficient similarity detection
- AST-based structural comparison beyond text matching
- Configurable similarity thresholds and block sizes
- Exact, structural, and partial duplicate detection
- Smart filtering of framework boilerplate
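
Winnowing fingerprints a text by hashing every k-character window (k-gram) and keeping only the minimum hash from each sliding window of consecutive hashes. The sketch below assumes whitespace-insensitive comparison and a window of 4 hashes; the default `k` follows the `winnow_k` setting in the configuration example. The function names are illustrative, and the real analyzer may normalize and hash differently.

```python
import hashlib


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-character substring of the whitespace-stripped, lowercased text."""
    normalized = "".join(text.split()).lower()
    return [
        # hashlib gives stable values across runs, unlike the built-in hash() for str.
        int.from_bytes(hashlib.sha1(normalized[i:i + k].encode()).digest()[:4], "big")
        for i in range(len(normalized) - k + 1)
    ]


def winnow(text: str, k: int = 17, window: int = 4) -> set[int]:
    """Fingerprint: the minimum hash of each sliding window over the k-gram hashes."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def similarity(a: str, b: str, k: int = 17, window: int = 4) -> float:
    """Jaccard similarity of the two fingerprint sets, from 0.0 to 1.0."""
    fa, fb = winnow(a, k, window), winnow(b, k, window)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


original = "def total(items):\n    return sum(item.price for item in items)\n"
reindented = original.replace("    ", "\t")
print(similarity(original, reindented))  # 1.0, whitespace is normalized away
```

Recomputing `min` for every window costs O(n * window). A linear-time version would track the minimum incrementally, for example with a monotonic deque.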

## 🔄 Output Formats

DinoScan supports multiple output formats for different use cases:

### Console (Default)
Colored, human-readable output with context and suggestions:
```bash
python advanced_security_analyzer.py /path/to/project
```

### JSON
Machine-readable format for tooling integration:
```bash
python dead_code_analyzer.py /path/to/project --output-format json --output-file results.json
```

### SARIF (Static Analysis Results Interchange Format)
Industry standard for CI/CD integration:
```bash
python duplicate_code_analyzer.py /path/to/project --output-format sarif --output-file duplicates.sarif
```

### XML
Structured format for enterprise tools:
```bash
python doc_quality_analyzer.py /path/to/project --output-format xml --output-file docs.xml
```
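
For readers unfamiliar with SARIF, the sketch below builds a minimal SARIF 2.1.0 log from a list of findings. The finding keys (`rule_id`, `level`, `message`, `path`, `line`) and the `to_sarif` helper are assumptions made for illustration; DinoScan's actual SARIF output may carry more fields.

```python
import json


def to_sarif(findings: list[dict], tool_name: str = "DinoScan") -> dict:
    """Wrap findings in the minimal SARIF 2.1.0 structure: one run, one tool, many results."""
    rule_ids = sorted({finding["rule_id"] for finding in findings})
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name, "rules": [{"id": rid} for rid in rule_ids]}},
                "results": [
                    {
                        "ruleId": finding["rule_id"],
                        "level": finding["level"],  # "error", "warning", or "note"
                        "message": {"text": finding["message"]},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {"uri": finding["path"]},
                                    "region": {"startLine": finding["line"]},
                                }
                            }
                        ],
                    }
                    for finding in findings
                ],
            }
        ],
    }


example = [
    {
        "rule_id": "hardcoded_secrets",
        "level": "warning",
        "message": "Possible hardcoded secret",
        "path": "src/settings.py",
        "line": 12,
    }
]
print(json.dumps(to_sarif(example), indent=2))
```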

## 🚀 CI/CD Integration

### Exit Codes
- `0`: No issues or only low-severity issues found
- `1`: Medium or high-severity issues found
- `2`: Critical issues found (security analyzer only)
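
The exit-code rules above can be expressed as a small function. This is a sketch of the documented behaviour, not the analyzers' code. `Severity` and `exit_code` are hypothetical names, and treating a critical finding from a non-security analyzer as exit code `1` is an assumption.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def exit_code(severities: list[Severity], is_security_analyzer: bool = False) -> int:
    """Map the worst finding to the documented exit codes 0, 1, or 2."""
    worst = max(severities, default=None)
    if worst is None or worst == Severity.LOW:
        return 0
    if worst == Severity.CRITICAL and is_security_analyzer:
        return 2
    return 1


print(exit_code([]))                                        # 0
print(exit_code([Severity.LOW, Severity.MEDIUM]))           # 1
print(exit_code([Severity.CRITICAL], is_security_analyzer=True))  # 2
```

In a CI script, any non-zero code can simply block the build, which is what the `$? -ne 0` checks in the pre-commit hook example do.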

### GitHub Actions Example
```yaml
name: DinoScan Analysis
on: [push, pull_request]

jobs:
  analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.9'

      - name: Security Analysis
        run: |
          python advanced_security_analyzer.py . --output-format sarif --output-file security.sarif

      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: security.sarif
```

### Pre-commit Hook Example
```bash
#!/bin/bash
# Run critical security checks
python advanced_security_analyzer.py . --min-severity high
if [ $? -ne 0 ]; then
    echo "Critical security issues found. Commit blocked."
    exit 1
fi

# Check for circular imports
python circular_import_analyzer.py .
if [ $? -ne 0 ]; then
    echo "Circular imports detected. Commit blocked."  
    exit 1
fi
```

## 🎯 Best Practices

### 1. **Security-First Workflow**
```bash
# Always run security analysis first
python advanced_security_analyzer.py /path/to/project --min-severity medium
```

### 2. **Incremental Analysis**  
```bash
# Focus on specific areas during development
python dead_code_analyzer.py /path/to/specific/module
python doc_quality_analyzer.py /path/to/new/feature --style google
```

### 3. **Configuration Management**
- Use project-specific `config.json` files
- Version control your configuration
- Start with default settings and customize gradually

### 4. **Performance Optimization**
- Exclude unnecessary directories in configuration
- Use specific analyzers rather than running all tools
- Cache results when possible (built-in AST caching)

### 5. **Integration Strategy**  
- Start with security and circular import analysis (critical issues)
- Add documentation quality checks for new code
- Use dead code analysis for cleanup phases
- Run duplicate detection periodically for refactoring opportunities

## 🏗️ Architecture

### Core Framework (`core/` directory)
- **base_analyzer.py**: Abstract base classes and common interfaces
- **ast_analyzer.py**: AST parsing utilities and visitor patterns  
- **config_manager.py**: Hierarchical configuration management
- **file_scanner.py**: Intelligent file discovery and filtering
- **reporter.py**: Multi-format output generation

### Modern Python Features
- **Type hints** with PEP 604 union syntax (`str | None`; on Python 3.9 this form requires `from __future__ import annotations`, and it is native from 3.10)
- **Dataclasses** for clean data structures
- **Pathlib** for cross-platform file handling
- **AST module** for semantic code analysis
- **Enum classes** for better categorization

### Performance Optimizations
- **AST caching** to avoid reparsing files
- **Efficient algorithms** (Tarjan's, winnowing) for complex analysis
- **Parallel processing** support for large codebases
- **Memory-efficient** streaming for large files
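
AST caching can be as simple as remembering the parsed tree for each file and reusing it until the file changes. The sketch below keys the cache on modification time and size. `ASTCache` is a hypothetical class, and DinoScan's built-in cache may use a different invalidation strategy.

```python
import ast
from pathlib import Path


class ASTCache:
    """Parse each file once and reuse the tree until the file changes on disk."""

    def __init__(self) -> None:
        self._entries: dict[Path, tuple[tuple[int, int], ast.Module]] = {}
        self.misses = 0  # number of real parses, handy for measuring cache benefit

    def parse(self, path: Path) -> ast.Module:
        stat = path.stat()
        stamp = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]
        self.misses += 1
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        self._entries[path] = (stamp, tree)
        return tree
```

Sharing one cache instance between analyzers means a file read by the security analyzer does not have to be parsed again by the dead code analyzer in the same run.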

## 🤝 Contributing

DinoScan follows modern Python development practices:
- **Type hints** throughout the codebase
- **Comprehensive docstrings** with parameter documentation  
- **Modular architecture** for easy extension
- **Unit testing** with pytest
- **Linting** with the tools it analyzes

To add a new analyzer:
1. Extend `ASTAnalyzer` from `core.base_analyzer`
2. Implement the `analyze_file()` method
3. Add configuration schema to `config.json`
4. Include comprehensive tests and documentation
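
The steps above might look like the following for a small rule that flags bare `except:` clauses. The `ASTAnalyzer` class here is a stand-in defined only so the sketch runs on its own. The real base class in `core.base_analyzer` may have a different interface and return type, and `BareExceptAnalyzer` with its `analyze_source` helper is hypothetical.

```python
import ast
from abc import ABC, abstractmethod
from pathlib import Path


class ASTAnalyzer(ABC):
    """Stand-in for the base class in core.base_analyzer (assumed interface)."""

    @abstractmethod
    def analyze_file(self, path: Path) -> list[str]:
        """Return human-readable findings for one file."""


class BareExceptAnalyzer(ASTAnalyzer):
    """Example analyzer: report 'except:' clauses that catch everything."""

    def analyze_source(self, source: str, filename: str = "<string>") -> list[str]:
        tree = ast.parse(source, filename=filename)
        return [
            f"{filename}:{node.lineno}: bare 'except:' hides errors"
            for node in ast.walk(tree)
            if isinstance(node, ast.ExceptHandler) and node.type is None
        ]

    def analyze_file(self, path: Path) -> list[str]:
        return self.analyze_source(path.read_text(encoding="utf-8"), filename=str(path))


code = "try:\n    risky()\nexcept:\n    pass\n"
print(BareExceptAnalyzer().analyze_source(code, "demo.py"))
```

Keeping the parsing logic in `analyze_source` makes the rule easy to unit test with pytest, since tests can pass source strings without touching the file system.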

## 📈 Advantages Over PowerShell Version

### **Accuracy Improvements**
- **AST-based analysis** vs. regex patterns (90%+ reduction in false positives)
- **Semantic understanding** of code structure and relationships
- **Cross-file analysis** for accurate dependency tracking
- **Framework-aware patterns** reduce irrelevant findings

### **Performance Enhancements**  
- **AST caching** for repeated analysis
- **Efficient algorithms** (Tarjan's O(V+E), winnowing O(n))
- **Parallel processing** support for large codebases
- **Memory optimization** for handling large files

### **Integration Benefits**
- **SARIF output** for modern CI/CD platforms
- **JSON/XML export** for tooling integration  
- **Exit codes** for automated workflows
- **Configurable output** for different use cases

### **Maintainability**
- **Modern Python architecture** with type hints and dataclasses
- **Comprehensive configuration system** with validation
- **Modular design** for easy extension and testing
- **Self-documenting code** with extensive docstrings

The new Python-based DinoScan provides enterprise-grade static analysis capabilities with the accuracy and performance needed for modern development workflows.
