Metadata-Version: 2.4
Name: howlongtobeat-scraper
Version: 1.1.0
Summary: A scraper for retrieving game completion times from HowLongToBeat.com.
Author-email: Sermodi <sermodsoftware@gmail.com>
Project-URL: Homepage, https://github.com/Sermodi/HowLongToBeat_scraper
Project-URL: Bug Tracker, https://github.com/Sermodi/HowLongToBeat_scraper/issues
Project-URL: Repository, https://github.com/Sermodi/HowLongToBeat_scraper
Project-URL: Documentation, https://github.com/Sermodi/HowLongToBeat_scraper#readme
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Games/Entertainment
Classifier: Topic :: Internet :: WWW/HTTP :: Browsers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: playwright>=1.40.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: lxml>=4.9.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: pre-commit>=3.4.0; extra == "dev"
Requires-Dist: bandit>=1.7.5; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest>=7.4.0; extra == "test"
Requires-Dist: pytest-cov>=4.1.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test"
Dynamic: license-file

# HowLongToBeat Scraper

[![PyPI version](https://badge.fury.io/py/howlongtobeat-scraper.svg)](https://badge.fury.io/py/howlongtobeat-scraper)
[![Python versions](https://img.shields.io/pypi/pyversions/howlongtobeat-scraper.svg)](https://pypi.org/project/howlongtobeat-scraper/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://pepy.tech/badge/howlongtobeat-scraper)](https://pepy.tech/project/howlongtobeat-scraper)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

A Python package to get game completion times from [HowLongToBeat](https://howlongtobeat.com).

This package provides both a command-line tool and a Python API to look up a game and retrieve its estimated times for main story, extras, and 100% completion.

## Features

-   **Command-Line Interface (CLI)**: Get game times directly from your terminal.
-   **Python API**: Easily integrate HowLongToBeat functionality into your own Python scripts.
-   **Asynchronous**: Built on `asyncio` and `playwright` for efficient performance.
-   **Structured Data**: Returns data in a `dataclass` for easy access.

## Installation

### From PyPI (Official Release)

Install the package from the official Python Package Index:

```bash
pip install howlongtobeat-scraper
```

After installation, download the browser binaries that Playwright needs:

```bash
playwright install
```

**Note**: The package is published on PyPI at https://pypi.org/project/howlongtobeat-scraper/

### From Source (for Development)

If you want to contribute or install the latest development version, you can clone the repository and install it in editable mode:

```bash
git clone https://github.com/Sermodi/HowLongToBeat_scraper.git
cd HowLongToBeat_scraper
pip install -e .
```

## Usage

### Command-Line Interface (CLI)

Once installed, you can run the package as a module:

```bash
python -m howlongtobeat_scraper "The Witcher 3: Wild Hunt"
```

**Note**: Use the module format above as it works consistently across all platforms.

**Example Output:**

```
Searching for "The Witcher 3: Wild Hunt"...
Title: The Witcher 3: Wild Hunt
- Main Story: 51.5 hours
- Main + Extras: 103 hours
- Completionist: 172 hours
```
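If you want to call the CLI from another Python script, one option is to launch the documented module invocation with `subprocess`. The sketch below is illustrative only: `build_command` and `run_lookup` are hypothetical helper names, not part of the package, and it assumes the package is installed in the same interpreter. Passing the title as a single list element avoids shell quoting problems with titles that contain spaces or colons.

```python
import subprocess
import sys
from typing import List


def build_command(title: str) -> List[str]:
    """Build the argv list for `python -m howlongtobeat_scraper "<title>"`."""
    # sys.executable keeps the call inside the current interpreter/virtualenv.
    return [sys.executable, "-m", "howlongtobeat_scraper", title]


def run_lookup(title: str) -> str:
    """Run the CLI (hypothetical wrapper) and return what it wrote to stdout."""
    completed = subprocess.run(
        build_command(title),
        capture_output=True,
        text=True,
        check=False,  # do not raise on a non-zero exit code; the caller decides
    )
    return completed.stdout


# Usage (requires the package and Playwright browsers to be installed):
# print(run_lookup("The Witcher 3: Wild Hunt"))
```
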

### Python API

The package provides two main functions for retrieving game data:

#### Recommended: `get_game_stats_smart` (with automatic fallback)

This is the **recommended** function. It handles browser visibility automatically:

```python
from __future__ import annotations
from howlongtobeat_scraper.api import get_game_stats_smart, GameData

def main():
    game_name = "Celeste"
    print(f"--- Fetching data for: {game_name} ---")

    try:
        # Smart function with automatic fallback
        # Tries headless first, falls back to visible mode if needed
        game_data: GameData | None = get_game_stats_smart(game_name)
        
        if game_data:
            print("API call successful. Data received:")
            print(f"  Title: {game_data.title}")
            print(f"  Main Story: {game_data.main_story} hours")
            print(f"  Main + Extras: {game_data.main_extra} hours")
            print(f"  Completionist: {game_data.completionist} hours")
        else:
            print("No data found for the game.")
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    main()
```
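The attributes read in the example (`title`, `main_story`, `main_extra`, `completionist`) suggest a result object shaped roughly like the stand-in below. This is a sketch for illustration only: `GameTimesSketch` and `format_times` are hypothetical names, the field types are assumed, and the real `GameData` class may differ. It shows one way to print a result while tolerating a missing time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GameTimesSketch:
    """Hypothetical stand-in mirroring the attributes used in the example above."""

    title: str
    # Assumed types: hours as a number, or None when no time is available.
    main_story: Optional[float] = None
    main_extra: Optional[float] = None
    completionist: Optional[float] = None


def format_times(game: GameTimesSketch) -> str:
    """Render one line per category, writing 'n/a' for missing values."""

    def hours(value: Optional[float]) -> str:
        # The 'g' format drops a trailing '.0' but keeps fractions such as 51.5.
        return "n/a" if value is None else f"{value:g} hours"

    return "\n".join(
        [
            f"Title: {game.title}",
            f"- Main Story: {hours(game.main_story)}",
            f"- Main + Extras: {hours(game.main_extra)}",
            f"- Completionist: {hours(game.completionist)}",
        ]
    )


# Made-up values, used only to exercise the formatter.
print(format_times(GameTimesSketch("Example Game", main_story=8.0, main_extra=12.5)))
```
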

#### Manual control: `get_game_stats`

For manual control over browser visibility, you can use the original function:

```python
from howlongtobeat_scraper.api import get_game_stats

# Always headless (invisible browser)
game_data = get_game_stats("Game Name")

# Always visible browser (for debugging or when headless fails)
game_data = get_game_stats("Game Name", headless=False)
```

### Browser Visibility and Fallback Strategy

#### Automatic Fallback (Recommended)

The `get_game_stats_smart` function implements an intelligent fallback strategy:

1. **First attempt**: Tries headless mode (invisible browser), which is faster.
2. **Automatic fallback**: If the headless attempt fails, for example because of bot detection, it retries with a visible browser.
3. **User-friendly**: The browser window appears only when the headless attempt did not succeed.

```python
from howlongtobeat_scraper.api import get_game_stats_smart

# Recommended: automatic fallback strategy
data = get_game_stats_smart("Game Name")
```
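The headless-first strategy can be sketched generically as follows. This is not the package's actual implementation: `fetch_with_fallback` and `fake_fetch` are hypothetical names, and the sketch assumes that both an exception and a `None` result count as a failed headless attempt, which may be broader than what the library checks.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_fallback(fetch: Callable[[bool], Optional[T]]) -> Optional[T]:
    """Call fetch(True) for a headless attempt; retry with fetch(False) on failure.

    Assumption: an exception or a None result means the headless attempt failed.
    """
    try:
        result = fetch(True)  # first attempt: headless browser
        if result is not None:
            return result
    except Exception:
        pass  # any error is treated as a reason to retry with a visible browser
    return fetch(False)  # second attempt: visible browser


def fake_fetch(headless: bool) -> Optional[str]:
    """Fake fetcher for the demo: only succeeds when the browser is visible."""
    if headless:
        raise RuntimeError("blocked by bot detection")
    return "Example Game"


print(fetch_with_fallback(fake_fetch))
```

In real use, `fetch` could be a small wrapper such as `lambda headless: get_game_stats("Game Name", headless=headless)`, which relies only on the `headless` parameter documented in this README.
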

#### Manual Control

For specific use cases, you can manually control browser visibility with `get_game_stats`:

- **`get_game_stats("Game Name")`**: Always uses headless mode (invisible)
- **`get_game_stats("Game Name", headless=False)`**: Always shows browser window

```python
from howlongtobeat_scraper.api import get_game_stats

# Always headless (faster, but may fail if the site's bot detection blocks it)
data = get_game_stats("Game Name")

# Always visible (more reliable, but shows a browser window)
data = get_game_stats("Game Name", headless=False)
```

**Recommendation**: Use `get_game_stats_smart()` for the best balance of performance and reliability.

## Spanish Documentation

A Spanish version of this README is available at [README.es.md](README.es.md).
