Metadata-Version: 2.4
Name: aiwebexplorer
Version: 0.0.4
Summary: An agent for agents to explore the web
Author-email: Francesco Vertemati <verte.fra@gmail.com>
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: agno>=2.0.3
Requires-Dist: aiohttp>=3.12.15
Requires-Dist: bs4>=0.0.2
Requires-Dist: cachetools>=6.2.0
Requires-Dist: openai>=1.107.1
Requires-Dist: playwright>=1.55.0
Requires-Dist: playwright-stealth>=2.0.0
Requires-Dist: pydantic>=2.11.7
Requires-Dist: pydantic-settings>=2.10.1
Requires-Dist: structlog>=25.4.0

[![WebExplorer CI/CD](https://github.com/thinktwiceco/webexplorer/actions/workflows/ci.yml/badge.svg)](https://github.com/thinktwiceco/webexplorer/actions/workflows/ci.yml)

[![New Version Deployed](https://github.com/thinktwiceco/webexplorer/actions/workflows/version.yml/badge.svg?branch=master)](https://github.com/thinktwiceco/webexplorer/actions/workflows/version.yml)


# AIWebExplorer 🌐

An agent for agents to explore the web

## 📦 Installation

This project uses `uv` for dependency management.

```bash
# Clone the repository
git clone <repository-url>
cd webexplorer

# Install dependencies
uv sync

# Activate virtual environment
source .venv/bin/activate
```

## 🛠️ Development

```bash
# Run linting
uv run ruff check .

# Run formatting
uv run ruff format .

# Check import sorting (ruff's isort rules; ruff does not perform type checking)
uv run ruff check --select I
```

## ⚙️ Environment Variables

Copy `.env.example` to `.env` and adjust the values:

```env
# Environment setting (DEV, TEST, CI, PROD)
AWE_ENV=DEV

# Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
AWE_LOG_LEVEL=INFO

# LLM Provider configuration (REQUIRED)
# Supported providers: openai, togetherai, deepseek
AWE_LLM_PROVIDER=openai

# LLM Model to use (REQUIRED)
# Examples:
# - OpenAI: gpt-4, gpt-4-turbo, gpt-3.5-turbo
# - TogetherAI: meta-llama/Llama-2-70b-chat-hf, mistralai/Mixtral-8x7B-Instruct-v0.1
# - DeepSeek: deepseek-chat, deepseek-coder
AWE_LLM_MODEL=gpt-4

# API Key for the selected provider (REQUIRED)
AWE_LLM_API_KEY=your-api-key-here
```

### Configuration Options

- **`AWE_ENV`**: Application environment (default: `DEV`)
  - Options: `DEV`, `TEST`, `CI`, `PROD`
- **`AWE_LOG_LEVEL`**: Logging verbosity level (default: `INFO`)
  - Options: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`
- **`AWE_LLM_PROVIDER`**: LLM provider selection (**REQUIRED**)
  - Supported providers: `openai`, `togetherai`, `deepseek`
  - Can be overridden when creating agents programmatically
- **`AWE_LLM_MODEL`**: Model identifier to use (**REQUIRED**)
  - Must be compatible with the selected provider
  - Can be overridden when creating agents programmatically
- **`AWE_LLM_API_KEY`**: API key for authentication (**REQUIRED**)
  - Must be valid for the selected provider
  - Can be overridden when creating agents programmatically
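
The rules above (two optional variables with defaults, three required ones) can be sketched with the standard library alone. This is an illustration only, not the package's actual settings code: `AweConfig` and `load_config` are hypothetical names, and the real project may well load these values through `pydantic-settings` instead.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

# Allowed values, taken from the options listed in this README.
ENVIRONMENTS = {"DEV", "TEST", "CI", "PROD"}
LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}
SUPPORTED_PROVIDERS = {"openai", "togetherai", "deepseek"}
REQUIRED = ("AWE_LLM_PROVIDER", "AWE_LLM_MODEL", "AWE_LLM_API_KEY")


@dataclass(frozen=True)
class AweConfig:  # hypothetical container, not part of aiwebexplorer
    env: str
    log_level: str
    llm_provider: str
    llm_model: str
    llm_api_key: str


def load_config(environ: Mapping[str, str] = os.environ) -> AweConfig:
    """Read AWE_* variables, apply defaults, and reject invalid values."""
    env = environ.get("AWE_ENV", "DEV")
    log_level = environ.get("AWE_LOG_LEVEL", "INFO")
    if env not in ENVIRONMENTS:
        raise ValueError(f"AWE_ENV must be one of {sorted(ENVIRONMENTS)}")
    if log_level not in LOG_LEVELS:
        raise ValueError(f"AWE_LOG_LEVEL must be one of {sorted(LOG_LEVELS)}")

    # The three LLM variables have no default and must be set.
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise ValueError(f"Missing required variables: {', '.join(missing)}")

    provider = environ["AWE_LLM_PROVIDER"]
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported provider: {provider!r}")

    return AweConfig(
        env=env,
        log_level=log_level,
        llm_provider=provider,
        llm_model=environ["AWE_LLM_MODEL"],
        llm_api_key=environ["AWE_LLM_API_KEY"],
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.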

### Supported Providers

#### 🤖 OpenAI
- **Provider**: `openai`
- **Models**: `gpt-4`, `gpt-4-turbo`, `gpt-3.5-turbo`, and more
- **API Key**: Get from [OpenAI Platform](https://platform.openai.com/)

#### 🔗 TogetherAI
- **Provider**: `togetherai`
- **Models**: Various open-source models including Llama, Mixtral, etc.
- **API Key**: Get from [Together.ai](https://together.ai/)

#### 🔍 DeepSeek
- **Provider**: `deepseek`
- **Models**: `deepseek-chat`, `deepseek-coder`
- **API Key**: Get from [DeepSeek Platform](https://platform.deepseek.com/)

## 🧪 Testing

For comprehensive testing documentation, including how to run tests, use dependency injection for mocking, and write new tests, see the [Tests README](tests/README.md).

### Running Tests

```bash
# Run all tests
pytest

# Run with verbose output
pytest -v

# Run specific test file
pytest tests/test_webexplorer_integration.py
```

### 📊 Evaluation Reports

Performance evaluation reports are available in the [`tests/reports/`](tests/reports/) directory:

- [**Amazon Extraction Report**](tests/reports/amazon_extraction_report.md) - Evaluation of product information extraction from Amazon
- [**Wikipedia Extraction Report**](tests/reports/wikipedia_extraction_report.md) - Evaluation of information extraction from Wikipedia

These reports track the accuracy and performance of the WebExplorer across different types of websites and extraction tasks.

## ✨ New Features

To develop a new feature:

1. **Create a feature branch from `develop`:**
   ```bash
   git checkout develop
   git pull origin develop
   git checkout -b feature/your-feature-name
   ```

2. **Work on your feature and commit changes:**
   ```bash
   git add .
   git commit -m "feat: add your new feature"
   git push origin feature/your-feature-name
   ```

3. **Create a Pull Request to `develop` branch**
4. **After review and merge, delete the feature branch**

## 🚀 New Versions

### Option 1: Automated Release (Recommended)

For an automated release, commit with a message of the form `chore: release vX.Y.Z` and push to `master`:

```bash
git commit -m "chore: release v1.2.0"
git push origin master
```

This will automatically:
- Create the version tag
- Publish to PyPI
- Create a GitHub release

### Option 2: Manual Release

1. **Create a release branch from `develop`:**
   ```bash
   git checkout develop
   git pull origin develop
   git checkout -b release/v1.2.0
   ```

2. **Update CHANGELOG.md with your changes**

3. **Merge to master and create version tag:**
   ```bash
   git checkout master
   git merge release/v1.2.0
   git tag v1.2.0
   git push origin master --tags
   ```

4. **Merge back to develop:**
   ```bash
   git checkout develop
   git merge release/v1.2.0
   git push origin develop
   ```

The CI/CD pipeline will automatically:
- Run tests and linting
- Build and publish to PyPI when version tags are pushed
- Create GitHub releases

**Version numbering:**
- **Patch** (1.0.0 → 1.0.1): Bug fixes
- **Minor** (1.0.0 → 1.1.0): New features
- **Major** (1.0.0 → 2.0.0): Breaking changes
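
The numbering rules can be expressed as a small helper. This is a minimal sketch for illustration; `bump_version` is a hypothetical function and not part of the project's tooling.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version for a 'patch', 'minor', or 'major' change."""
    # Accept an optional leading "v", as used in the git tags above.
    major, minor, patch = (int(piece) for piece in version.removeprefix("v").split("."))
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if part == "minor":
        # A minor bump resets the patch number.
        return f"{major}.{minor + 1}.0"
    if part == "major":
        # A major bump resets both minor and patch numbers.
        return f"{major + 1}.0.0"
    raise ValueError(f"Unknown version part: {part!r}")


if __name__ == "__main__":
    for part in ("patch", "minor", "major"):
        print(part, bump_version("1.0.0", part))
```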

## License

[Add your license here]
