Metadata-Version: 2.4
Name: artip
Version: 0.1.2
Summary: Scalar temporal period identifiers for Power BI
Author-email: EDF DSO <dso@edf.fr>
License: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: polars>=0.20.0
Provides-Extra: dev
Requires-Dist: mypy>=1.5.0; extra == 'dev'
Requires-Dist: pylint>=2.17.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# Artip – Scalar Temporal Period Identifiers for Power BI

Artip is a Python library that generates unique scalar integer identifiers for temporal periods, optimized for storage and filtering in Power BI's VertiPaq engine.

## Features

- **Scalar Period IDs**: Encode temporal periods as 64-bit integers for efficient Power BI representation
- **Flexible Granularity**: Support for year, month, day, hour, minute, and second-level periods
- **Period Constraints**: Optional maximum period size enforcement
- **Calendar Generation**: Automatic generation of periodic calendar tables for Power BI filtering
- **Type-Safe**: Full Python type hints for IDE integration and static type checking
- **No Global State**: Thread-safe with all state encapsulated in Artip instances

## Installation

```bash
pip install artip
```

## Quick Start

```python
from datetime import date
from artip import Artip

# Create an Artip instance
artip = Artip(
    min_time=date(2024, 1, 1),
    max_time=date(2024, 12, 31),
    granularity='day'
)

# Generate a period identifier
period_id = artip.make_period_id(
    start=date(2024, 1, 1),
    end=date(2024, 1, 31)
)
print(f"Period ID: {period_id}")  # Prints the encoded identifier

# Retrieve the time range for a period
start, end = artip.get_period_range(period_id)
print(f"Period range: {start} to {end}")

# Generate a calendar table for Power BI
calendar = artip.make_calendar_table()
print(calendar)
```

## Use Cases

### ETL Pipeline Integration

Use Artip in Python ETL pipelines to generate period identifiers for fact tables:

```python
from datetime import date
import polars as pl
from artip import Artip

# Setup period encoding
artip = Artip(
    min_time=date(2020, 1, 1),
    max_time=date(2030, 12, 31),
    granularity='month'
)

# Process fact records
facts = pl.DataFrame({
    'transaction_date': [date(2024, 3, 15), date(2024, 5, 20)],
    'amount': [100.0, 250.0]
})

# Encode each transaction date as a single-point period (start == end)
facts = facts.with_columns(
    period_id=pl.col('transaction_date').map_elements(
        lambda d: artip.make_period_id(d, d).value,
        return_dtype=pl.Int64,
    )
)
```

### Calendar Table for Power BI

Generate a calendar table to enable period-based filtering:

```python
# Reuses the `artip` instance created in the previous example
calendar_table = artip.make_calendar_table()

# Save to CSV for Power BI import
calendar_table.write_csv('period_calendar.csv')
```

## API Reference

### `Artip` Class

#### Constructor

```python
Artip(
    min_time: Union[date, datetime],
    max_time: Union[date, datetime],
    granularity: str = 'day',
    period_max_size: Optional[int] = None
)
```

**Parameters:**
- `min_time`: Minimum time boundary (inclusive)
- `max_time`: Maximum time boundary (inclusive)
- `granularity`: Temporal granularity ('year', 'month', 'day', 'hour', 'minute', 'second')
- `period_max_size`: Maximum allowed period length in granularity units (None = unlimited)

**Raises:**
- `InvalidIntervalError`: If min_time >= max_time
- `InvalidGranularityError`: If granularity is not supported

#### `make_period_id(start, end) -> PeriodId`

Generate a unique identifier for a temporal period.

**Parameters:**
- `start`: Period start time (inclusive)
- `end`: Period end time (inclusive)

**Returns:** `PeriodId` object

**Raises:**
- `InvalidIntervalError`: If `start > end` or the period falls outside the `min_time`/`max_time` bounds
- `PeriodTooLargeError`: If period exceeds period_max_size

#### `get_period_range(period_id) -> Tuple[datetime, datetime]`

Retrieve the time range for a period identifier.

**Parameters:**
- `period_id`: A PeriodId generated by this instance

**Returns:** Tuple of (start_time, end_time)

**Raises:**
- `InvalidIntervalError`: If `period_id` was not generated by this instance

#### `make_calendar_table() -> pl.DataFrame`

Generate a periodic calendar table for Power BI filtering.

**Returns:** Polars DataFrame with columns:
- `date`: The timepoint at the specified granularity
- `period_id`: The PeriodId value (Int64)
- `label`: Relationship to period ('starts', 'ends', 'in')
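
To make the row layout concrete, here is a minimal standard-library sketch of the rows such a table could contain for one day-granularity period. The helper `calendar_rows` is hypothetical and not part of Artip, and the labelling rule (first day `starts`, last day `ends`, everything between `in`) is an assumption inferred from the column description above:

```python
from datetime import date, timedelta


def calendar_rows(start: date, end: date, period_id: int) -> list[tuple[date, int, str]]:
    """Hypothetical helper: one (date, period_id, label) row per day in the period."""
    rows = []
    day = start
    while day <= end:
        if day == start:
            label = "starts"  # assumption: a one-day period is labelled 'starts' only
        elif day == end:
            label = "ends"
        else:
            label = "in"
        rows.append((day, period_id, label))
        day += timedelta(days=1)
    return rows


# The period_id passed here is an arbitrary illustrative value.
rows = calendar_rows(date(2024, 1, 1), date(2024, 1, 3), period_id=2)
for row in rows:
    print(row)
```

In Power BI, a date slicer on the `date` column would then select every `period_id` whose period covers the chosen day.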

### `PeriodId` Class

Lightweight dataclass representing a period identifier.

**Attributes:**
- `value: int` – The encoded period identifier

**Methods:**
- `__str__()` – Returns string representation
- `__eq__(other)` – Equality comparison
- `__hash__()` – Hashable for use in sets/dicts
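
The behaviour listed above matches what a frozen dataclass provides out of the box. The class below is an illustrative stand-in named `PeriodIdSketch`, not the library's actual source, and its `__str__` format is an assumption:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodIdSketch:
    """Stand-in for illustration only; frozen=True generates __eq__ and __hash__."""

    value: int

    def __str__(self) -> str:
        # Assumed format: the bare integer as text.
        return str(self.value)


a = PeriodIdSketch(74030)
b = PeriodIdSketch(74030)
print(a == b)       # True
print(len({a, b}))  # 1
```

Because equal identifiers hash identically, they can be used as dictionary keys or deduplicated in sets.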

## Exceptions

- `ArtipError` – Base exception
- `InvalidIntervalError` – Invalid time interval
- `PeriodTooLargeError` – Period exceeds constraints
- `InvalidGranularityError` – Unsupported granularity

## Design Rationale

Artip encodes periods as composite integers:

```
PeriodId = (start_id) * 10^period_size + period_length
```

Where:
- `start_id` = offset from min_time in granularity units
- `period_length` = end_id - start_id
- `period_size` = number of digits allocated for period_length

This design ensures:
- Efficient storage in 64-bit integers
- Fast period range queries
- Compatible filtering in Power BI
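
The formula above can be sketched in plain Python. The function names `encode_period` and `decode_period` are hypothetical, and `period_size=3` is an arbitrary choice for the example; Artip's internal implementation may differ:

```python
from datetime import date

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit column can hold


def encode_period(start_id: int, period_length: int, period_size: int) -> int:
    """Pack (start_id, period_length) into one integer, following the formula."""
    if not 0 <= period_length < 10**period_size:
        raise ValueError("period_length does not fit in period_size digits")
    value = start_id * 10**period_size + period_length
    if value > INT64_MAX:
        raise OverflowError("encoded value exceeds a signed 64-bit integer")
    return value


def decode_period(value: int, period_size: int) -> tuple[int, int]:
    """Recover (start_id, end_id) from an encoded value."""
    start_id, period_length = divmod(value, 10**period_size)
    return start_id, start_id + period_length


# Day granularity: offsets are whole days since min_time.
min_time = date(2024, 1, 1)
start, end = date(2024, 3, 15), date(2024, 4, 14)
start_id = (start - min_time).days   # 74
period_length = (end - start).days   # 30

encoded = encode_period(start_id, period_length, period_size=3)
print(encoded)                                # 74030
print(decode_period(encoded, period_size=3))  # (74, 104)
```

Decoding needs only one integer division, which is why the start and end offsets can be recovered without a lookup table.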

## Requirements

- Python 3.10+
- Polars 0.20.0+

## License

MIT

## Contributing

Contributions are welcome! Please ensure:
- All tests pass: `pytest`
- Code is type-checked: `mypy`
- Code adheres to style: `ruff`
