Metadata-Version: 2.4
Name: ai-compat
Version: 0.3.1
Summary: GPU and TPU compatibility toolkit for AI frameworks
Author: PyLib Contributors
License: MIT
Project-URL: Homepage, https://github.com/upendra-manike/PyLib
Project-URL: Repository, https://github.com/upendra-manike/PyLib
Keywords: gpu,tpu,cuda,ai,pytorch,tensorflow,onnx,compatibility,cloud-tpu,edge-tpu
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Hardware :: Hardware Drivers
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# ai-compat

A GPU and TPU compatibility toolkit for major AI frameworks. It inspects the system, runs diagnostics, and suggests (or optionally applies) fixes for CUDA/driver mismatches and TPU configuration problems.

## Features

### GPU Support
- GPU + CUDA detection (`nvidia-smi`, CUDA paths, cuDNN)
- Framework scanner for PyTorch, TensorFlow, ONNX Runtime, diffusers, transformers
- Compatibility checker with JSON rules
- Auto-fix suggestions + optional pip installs
- GPU diagnostics (PyTorch/TensorFlow/ONNX/VRAM tests)

### TPU Support
- **Cloud TPU** detection via `gcloud` CLI
- **Edge TPU** detection (USB/PCIe devices, pycoral)
- TensorFlow TPU compatibility checking
- TPU diagnostics (Cloud TPU, Edge TPU, TensorFlow TPU)
- Auto-fix suggestions for TPU setup

### System Resources
- **RAM detection**: Total and available system memory
- **Disk usage**: Total, used, and free disk space
- Resource snapshot included automatically in every `scan`, for AI workload planning
- Uses `psutil` when available (falls back to system calls)
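
The "`psutil` when available, otherwise system calls" pattern can be sketched as follows. This is a minimal illustration, not the code `ai-compat` ships: the function name `total_ram_gb` is hypothetical, and the choice of 1024³ bytes per GB is an assumption about units.

```python
import os


def total_ram_gb():
    """Return total physical RAM in GB, or None if it cannot be determined.

    Hypothetical helper: illustrates the fallback pattern only.
    """
    try:
        import psutil  # optional dependency
        return psutil.virtual_memory().total / 1024**3
    except ImportError:
        pass

    # Fallback: POSIX sysconf. os.sysconf does not exist on Windows
    # (AttributeError), and unknown names raise ValueError or OSError.
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size / 1024**3


ram = total_ram_gb()
print("Total RAM:", "unknown" if ram is None else f"{ram:.1f} GB")
```

Returning `None` instead of raising keeps the rest of a scan usable on platforms where neither source is available.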

### General
- Environment file exporter (`gpu-env.txt` or `tpu-env.txt`)
- CLI entry point: `ai-compat`
- Works with both GPU and TPU simultaneously

## Quickstart

```bash
pip install ai-compat
ai-compat scan          # Scan system (GPU, TPU, RAM, disk)
ai-compat check         # Check compatibility issues
ai-compat fix --apply   # Apply suggested fixes (runs pip installs; use with caution)
ai-compat test          # Run all tests (GPU + TPU)
ai-compat test --gpu-only  # Run only GPU tests
ai-compat test --tpu-only  # Run only TPU tests
ai-compat export --output env.txt
```

## System Resources (RAM & Disk Usage)

### View System Resources

The `scan` command automatically includes RAM and disk information:

```bash
ai-compat scan
```

To extract just the resources section:

```bash
# On Linux/macOS
ai-compat scan | grep -A 6 '"resources"'

# Or use jq (if installed)
ai-compat scan | jq '.resources'
```

### RAM and Disk Usage

Check your system's RAM and disk capacity and availability. The `#` annotations below are explanatory only; the actual output is plain JSON:

```bash
$ ai-compat scan | jq '.resources'
{
  "ram_total_gb": 32.0,      # Total system RAM
  "ram_available_gb": 24.5,  # Available RAM for use
  "disk_total_gb": 500.0,    # Total disk space
  "disk_used_gb": 150.0,     # Used disk space
  "disk_free_gb": 350.0      # Free disk space
}
```

### Python API for RAM Usage

```python
from ai_compat import scan_system

snapshot = scan_system()
resources = snapshot.resources

# Percentage of total RAM currently in use
ram_used_pct = (
    (resources.ram_total_gb - resources.ram_available_gb)
    / resources.ram_total_gb * 100
)

print(f"Total RAM: {resources.ram_total_gb} GB")
print(f"Available RAM: {resources.ram_available_gb} GB")
print(f"RAM Usage: {ram_used_pct:.1f}%")
print(f"Free Disk: {resources.disk_free_gb} GB")
```

### Use Cases

- **Model Loading**: Check if you have enough RAM before loading large models
- **Batch Size Planning**: Determine optimal batch sizes based on available memory
- **Disk Space**: Verify sufficient space for model downloads and checkpoints
- **Resource Monitoring**: Track system resources in CI/CD pipelines
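
As a rough sketch of the model-loading use case, the check below compares a model's size against available RAM. The helper `fits_in_ram` and its 20% default headroom are assumptions made for illustration; they are not part of the `ai-compat` API.

```python
def fits_in_ram(model_size_gb, ram_available_gb, headroom=0.2):
    """Return True if the model fits in available RAM with some headroom.

    headroom is the fraction of available RAM to keep free (an assumed
    default of 20%; tune it for your workload).
    """
    if model_size_gb < 0 or ram_available_gb < 0:
        raise ValueError("sizes must be non-negative")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_gb = ram_available_gb * (1 - headroom)
    return model_size_gb <= usable_gb


# Sample value from this README: 24.5 GB available.
# With ai-compat installed, pass scan_system().resources.ram_available_gb instead.
print(fits_in_ram(13.0, 24.5))  # True
print(fits_in_ram(22.0, 24.5))  # False
```

The same comparison works for disk space by passing `disk_free_gb` and the expected download size.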

## Example Output

```
$ ai-compat check
{
  "issues": [
    {
      "framework": "PyTorch",
      "message": "PyTorch 2.2.1 requires CUDA ['12.1', '12.2'] but system has 11.8",
      "severity": "error",
      "suggestion": "Install CUDA 12.1/12.2 or install PyTorch wheel matching CUDA 11.8"
    }
  ],
  "summary": "Detected 1 issue(s)",
  "metadata": {
    "gpu_count": 1,
    "cuda_version": "11.8",
    "driver_version": "535.104"
  }
}
```
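
A report shaped like the one above can gate a CI job. The sketch below assumes that `ai-compat check` writes only this JSON to stdout and that `"error"` is the severity that should fail a build; the function name `exit_code_for` is hypothetical.

```python
import json

# Sample report copied from the example output above.
SAMPLE_REPORT = """
{
  "issues": [
    {
      "framework": "PyTorch",
      "message": "PyTorch 2.2.1 requires CUDA ['12.1', '12.2'] but system has 11.8",
      "severity": "error",
      "suggestion": "Install CUDA 12.1/12.2 or install PyTorch wheel matching CUDA 11.8"
    }
  ],
  "summary": "Detected 1 issue(s)"
}
"""


def exit_code_for(report_text):
    """Return 1 if the report has any issue with severity "error", else 0."""
    report = json.loads(report_text)
    errors = [i for i in report.get("issues", []) if i.get("severity") == "error"]
    for issue in errors:
        print(f"[{issue.get('framework', '?')}] {issue.get('message', '')}")
    return 1 if errors else 0


print(exit_code_for(SAMPLE_REPORT))  # 1
```

In a pipeline, the same function could be fed `sys.stdin.read()` and its result passed to `sys.exit()`.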

## Architecture

```
ai_compat/
  cli.py        # command-line interface
  scanner.py    # system + framework inspection
  gpu.py        # low-level GPU detection
  tpu.py        # TPU detection (Cloud + Edge)
  checker.py    # rules-based compatibility engine
  fixer.py      # auto-fix planner
  tester.py     # GPU + TPU diagnostics
  exporter.py   # environment generator
  rules/
    cuda_rules.json
    pytorch_rules.json
    tensorflow_rules.json
    tpu_rules.json
```
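
To give a feel for what a rules-based check might look like, here is a small sketch. The rule schema (framework version mapped to a list of supported CUDA versions) and the function `check_cuda` are hypothetical; the actual layout of the files under `rules/` is not documented here and may differ.

```python
# Hypothetical rule table: framework version -> supported CUDA versions.
HYPOTHETICAL_PYTORCH_RULES = {
    "2.2.1": ["12.1", "12.2"],
}


def check_cuda(framework, version, system_cuda, rules):
    """Return an issue dict if system_cuda is unsupported, else None.

    Versions missing from the rule table are treated as "no opinion".
    """
    supported = rules.get(version)
    if supported is None or system_cuda in supported:
        return None
    return {
        "framework": framework,
        "message": (
            f"{framework} {version} requires CUDA {supported} "
            f"but system has {system_cuda}"
        ),
        "severity": "error",
    }


issue = check_cuda("PyTorch", "2.2.1", "11.8", HYPOTHETICAL_PYTORCH_RULES)
print(issue["message"])
```

Keeping the mappings in data rather than code is what makes it practical to update them as new framework releases appear.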

## TPU Detection

### Cloud TPU
- Requires `gcloud` CLI installed and configured
- Detects TPU via `gcloud compute tpus list`
- Checks connectivity and TensorFlow TPUClusterResolver access

### Edge TPU
- Detects USB/PCIe Edge TPU devices
- Checks for `/dev/apex_0` device
- Requires `pycoral` for full functionality
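
The device-node check can be approximated in a few lines. This is an illustrative sketch only: `find_edge_tpu_nodes` is a hypothetical name, and the sketch covers just the `/dev/apex_0` path mentioned above, so USB-attached devices would need a separate check.

```python
import os


def find_edge_tpu_nodes(candidates=("/dev/apex_0",)):
    """Return the candidate device paths that exist on this machine."""
    return [path for path in candidates if os.path.exists(path)]


nodes = find_edge_tpu_nodes()
if nodes:
    print("Edge TPU device node(s) found:", ", ".join(nodes))
else:
    print("No Edge TPU device node found")
```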

## Limitations

- Requires `nvidia-smi` for NVIDIA GPU detection
- Cloud TPU detection requires `gcloud` CLI
- Edge TPU detection requires `pycoral` for full functionality
- System resource snapshot uses `psutil` when available (falls back to `/proc`/sysconf)
- Auto-fix commands run via `pip`; `--apply` executes them (use with caution)
- VRAM stress test relies on PyTorch
- Rules JSON provides conservative reference mappings; update as needed
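
Before running a scan, it can help to see which of these optional prerequisites are present. The sketch below uses only the Python standard library; `check_prerequisites` is a hypothetical helper, and the list of tools and modules is taken from the limitations above.

```python
import importlib.util
import shutil

# External CLIs and optional Python modules named in the limitations above.
TOOLS = ("nvidia-smi", "gcloud")
MODULES = ("psutil", "pycoral", "torch")


def check_prerequisites():
    """Map each tool or module name to True if it is available."""
    status = {tool: shutil.which(tool) is not None for tool in TOOLS}
    for module in MODULES:
        status[module] = importlib.util.find_spec(module) is not None
    return status


for name, available in check_prerequisites().items():
    print(f"{name:12s} {'found' if available else 'missing'}")
```

A missing entry does not necessarily mean the toolkit fails; it indicates which detection paths or tests are likely to be skipped or degraded.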

## Example: Full System Scan (GPU + TPU + RAM + Disk)

```bash
$ ai-compat scan
{
  "platform": "Linux 5.15.0",
  "python_version": "3.10.12",
  "resources": {
    "ram_total_gb": 32.0,
    "ram_available_gb": 24.5,
    "disk_total_gb": 500.0,
    "disk_used_gb": 150.0,
    "disk_free_gb": 350.0
  },
  "gpu": {
    "gpu_count": 1,
    "gpus": [{"name": "NVIDIA RTX 4090", "memory_total_gb": 24.0}],
    "cuda": {"version": "12.1", "cudnn_version": "8.9"}
  },
  "tpu": {
    "tpu_count": 1,
    "has_cloud_tpu": true,
    "has_edge_tpu": false,
    "cloud_tpu_available": true,
    "tpus": [{"type": "cloud", "accelerator_type": "v2-8"}]
  },
  "frameworks": {
    "tensorflow": {
      "version": "2.16.0",
      "gpu_available": true,
      "tpu_available": true
    }
  }
}
```
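
A snapshot like this is easy to post-process. As a sketch, the function below lists the detected accelerators; it assumes the field names shown in the example output, and `list_accelerators` is a hypothetical helper rather than part of the package.

```python
import json

# Trimmed copy of the example scan output above.
SAMPLE_SCAN = """
{
  "gpu": {
    "gpu_count": 1,
    "gpus": [{"name": "NVIDIA RTX 4090", "memory_total_gb": 24.0}]
  },
  "tpu": {
    "tpu_count": 1,
    "tpus": [{"type": "cloud", "accelerator_type": "v2-8"}]
  }
}
"""


def list_accelerators(snapshot):
    """Return one human-readable line per GPU and TPU in the snapshot."""
    lines = []
    for gpu in snapshot.get("gpu", {}).get("gpus", []):
        lines.append(f"GPU: {gpu.get('name')} ({gpu.get('memory_total_gb')} GB)")
    for tpu in snapshot.get("tpu", {}).get("tpus", []):
        lines.append(f"TPU: {tpu.get('type')} {tpu.get('accelerator_type')}")
    return lines


for line in list_accelerators(json.loads(SAMPLE_SCAN)):
    print(line)
```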

### Quick RAM Check Command

For a quick RAM check, you can use:

```bash
# View only RAM information
ai-compat scan | jq '.resources | {ram_total_gb, ram_available_gb, ram_usage_percent: ((.ram_total_gb - .ram_available_gb) / .ram_total_gb * 100)}'

# Or on systems without jq
ai-compat scan | python3 -c "import sys, json; d=json.load(sys.stdin); r=d['resources']; print(f\"RAM: {r['ram_available_gb']:.1f}GB / {r['ram_total_gb']:.1f}GB available\")"
```

Contributions welcome!
