Metadata-Version: 2.4
Name: statistics-canada
Version: 2025.6.12.201824
Summary: Python bindings for the Statistics Canada Web Data Service (WDS) API
Author-email: Paul Bouillon <pbouill@gmail.com>
License-Expression: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/pbouill/statistics-canada
Project-URL: Repository, https://github.com/pbouill/statistics-canada.git
Project-URL: Bug Tracker, https://github.com/pbouill/statistics-canada/issues
Keywords: python,statistics,census,canada
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.28.1
Requires-Dist: pandas>=2.3.0
Dynamic: license-file

# Statistics Canada Python Bindings

[![Python](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

Python bindings for the Statistics Canada Web Data Service (WDS) API, providing easy access to Canadian census data and geographic information.

## Overview

This package provides a Python interface to Statistics Canada's census data through their Web Data Service API. It includes utilities for downloading, processing, and working with Canadian census data, as well as geographic boundaries and administrative divisions.

**Key Features:**
- Access to Canadian census data from 1976 to 2021
- Geographic level enumerations (provinces, census divisions, etc.)
- Automated data download utilities
- DGUID (Dissemination Geography Unique Identifier) support
- Integration with pandas for data analysis

**Data Sources:**
- [Statistics Canada Web Data Service](https://www.statcan.gc.ca/en/developers/wds)
- [Census geographic attribute files](https://www12.statcan.gc.ca/census-recensement/2021/ref/dict/fig/index-eng.cfm?ID=f1_1)

## Installation

### From PyPI (when available)
```bash
pip install statistics-canada
```

### From source
```bash
git clone https://github.com/pbouill/statistics-canada.git
cd statistics-canada
pip install -e .
```

## Usage

### Basic Usage

```python
import asyncio

from statscan.census import CensusYear
from statscan.enums.auto.province_territory import ProvinceTerritory
from statscan.enums.schema import Schema
from statscan.enums.vintage import Vintage
from statscan.util.data import download_data, unpack_to_dataframe

# Access census data for different years
census_year = CensusYear.CENSUS_2021
vintage = Vintage.CENSUS_2021

# Work with geographic levels
province = ProvinceTerritory.ONTARIO
print(f"Province: {province.name}, Code: {province.value}")

# Work with geographic schemas
geo_level = Schema.PR  # Province level
print(f"Geographic level: {geo_level.value}")


# Download and work with data
async def get_census_data():
    # Download geographic attribute files
    data_path = await download_data(
        "https://www12.statcan.gc.ca/census-recensement/2021/geo/aip-pia/attribute-attribs/files-fichiers/2021_92-151_X.zip"
    )
    print(f"Data downloaded to: {data_path}")

    # Unpack to DataFrame
    df = unpack_to_dataframe(data_path)
    print(f"Data shape: {df.shape}")


# Run the async function
asyncio.run(get_census_data())
```
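
To make the unpack step above more concrete without touching the network, here is a minimal standard-library sketch of reading a CSV out of a zip archive. It is not the package's `unpack_to_dataframe()` implementation: the helper name `read_first_csv`, the sample column names, and the in-memory archive are all assumptions made for illustration.

```python
import csv
import io
import zipfile


def read_first_csv(zip_bytes: bytes, encoding: str = "utf-8") -> list[dict[str, str]]:
    """Return the rows of the first .csv member of a zip archive (hypothetical helper)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("archive contains no CSV file")
        with archive.open(csv_names[0]) as raw:
            # Wrap the binary member in a text layer so csv can parse it.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.DictReader(text))


# Build a tiny in-memory archive so the sketch runs offline.
# The column names below are made up for the example.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.csv", "PR_CODE,PR_NAME\n35,Ontario\n24,Quebec\n")

rows = read_first_csv(buffer.getvalue())
print(rows[0]["PR_NAME"])
```

In real use the rows would more likely go straight into a pandas DataFrame, which is what the package's own helper returns.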

### Available Census Years

The package supports the following census years:
- 2021 (the latest census and the only vintage currently supported)
- 2016, 2011, 2006, 2001, 1996, 1991, 1986, 1981, and 1976, available through the legacy `CensusYear` enum
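
The years listed above fall on a five-year cycle, which can be sketched with a small stand-in enumeration. `CensusYearSketch` is a hypothetical name, and its member values are an assumption; only the `CENSUS_2021` member name is taken from the usage example, and the real `CensusYear` enum in `statscan.census` may be defined differently.

```python
from enum import IntEnum

# Hypothetical stand-in, built with the functional Enum API.
# The real CensusYear enum may use different values.
CensusYearSketch = IntEnum(
    "CensusYearSketch",
    {f"CENSUS_{year}": year for year in range(1976, 2022, 5)},
)

# List every year covered, oldest first.
print([member.value for member in CensusYearSketch])

# IntEnum members compare as integers, so max() gives the latest census.
latest = max(CensusYearSketch)
print(latest.name)
```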

### Geographic Levels

The package includes enumerations for various Canadian geographic divisions:
- **Schema**: Geographic level codes (provinces, census divisions, etc.)
- **ProvinceTerritory**: Provinces and territories with official codes
- **CensusDivision**: Census division codes
- **CensusSubdivision**: Census subdivision codes
- **FederalElectoralDistrict**: Federal electoral district codes
- **CensusMetropolitanArea**: CMA codes
- **EconomicRegion**: Economic region codes
- And more auto-generated geographic enumerations...
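
As a rough illustration of how a code-based enumeration like those listed above behaves, here is a stand-in built on the two-digit province and territory codes from the Standard Geographical Classification. `ProvinceTerritorySketch` is a hypothetical name. Apart from `ONTARIO`, which appears in the usage example, the member names are assumptions and may not match the package's auto-generated enum.

```python
from enum import IntEnum


class ProvinceTerritorySketch(IntEnum):
    """Stand-in for illustration; not the package's auto-generated enum."""

    NEWFOUNDLAND_AND_LABRADOR = 10
    PRINCE_EDWARD_ISLAND = 11
    NOVA_SCOTIA = 12
    NEW_BRUNSWICK = 13
    QUEBEC = 24
    ONTARIO = 35
    MANITOBA = 46
    SASKATCHEWAN = 47
    ALBERTA = 48
    BRITISH_COLUMBIA = 59
    YUKON = 60
    NORTHWEST_TERRITORIES = 61
    NUNAVUT = 62


# Look up a member by its two-digit code, or by its name.
print(ProvinceTerritorySketch(35).name)
print(ProvinceTerritorySketch["QUEBEC"].value)

# The first digit of the code identifies the broad region (6 = Territories).
territories = [p.name for p in ProvinceTerritorySketch if p.value // 10 == 6]
print(territories)
```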

## Development

### Requirements
- Python 3.11 or later
- httpx >= 0.28.1
- pandas >= 2.3.0

### Setup development environment
```bash
# Clone the repository
git clone https://github.com/pbouill/statistics-canada.git
cd statistics-canada

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
pip install -r requirements.dev.txt

# Install package in editable mode
pip install -e .
```

### Running Tests
```bash
# Run all tests
python -m unittest discover unittests

# Run specific test module
python -m unittest unittests.test_core
```

### Project Structure
```
statscan/                 # Main package
├── __init__.py           # Package initialization
├── _version.py           # Version information
├── census.py             # Legacy census year enumerations
├── dguid.py              # DGUID utilities
├── url.py                # API URLs and endpoints
├── py.typed              # Type hint marker
├── enums/                # Geographic enumerations
│   ├── schema.py         # Geographic level schema definitions
│   ├── vintage.py        # Current census vintage/year
│   ├── frequency.py      # Data frequency enumerations
│   ├── auto/             # Auto-generated enums
│   │   ├── province_territory.py    # Province/territory codes
│   │   ├── census_division.py       # Census division codes
│   │   ├── census_subdivision.py    # Census subdivision codes
│   │   ├── federal_electoral_district.py  # FED codes
│   │   ├── census_metropolitan_area.py     # CMA codes
│   │   ├── economic_region.py              # ER codes
│   │   └── ...                             # Other geographic levels
│   └── geocode/          # Geocoding utilities
│       ├── geocode.py    # Base geocode classes
│       ├── pr_geocode.py # Province-specific geocoding
│       └── ...
└── util/                 # Utility modules
    ├── data.py           # Data download and processing utilities
    └── log.py            # Logging configuration
```

## API Reference

### Census Data Access
- **CensusYear**: Legacy enumeration of supported census years (1976-2021)
- **Vintage**: Current census vintage (currently 2021)
- **WDS_BASE_URL**: Base URL for Statistics Canada Web Data Service

### Geographic Enumerations
- **Schema**: Geographic level schema codes (CAN, PR, CD, CSD, etc.)
- **ProvinceTerritory**: Canadian provinces and territories with official codes
- **CensusDivision**: Census division codes by province
- **CensusSubdivision**: Census subdivision codes
- **FederalElectoralDistrict**: Federal electoral district codes
- **CensusMetropolitanArea**: Census metropolitan area codes
- **EconomicRegion**: Economic region codes

### Data Utilities
- **download_data()**: Async function to download data files from Statistics Canada URLs
- **unpack_to_dataframe()**: Function to unpack downloaded files into pandas DataFrames
- **DGUID utilities**: Functions for working with Dissemination Geography Unique Identifiers
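
For readers unfamiliar with DGUIDs, the sketch below splits one into its parts: a four-digit vintage, a one-character type, a four-digit schema, and a variable-length geographic identifier. It does not use the functions in `statscan/dguid.py`, whose names and signatures are not documented here; `ParsedDGUID` and `parse_dguid` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedDGUID:
    """Components of a DGUID (hypothetical container, not the package's type)."""

    vintage: int    # reference year, e.g. 2021
    type_code: str  # single letter, e.g. "A"
    schema: str     # four-digit geographic level code, e.g. "0002"
    geo_id: str     # identifier unique within the level, e.g. "35"


def parse_dguid(dguid: str) -> ParsedDGUID:
    """Split a DGUID string by fixed character positions."""
    # Minimum length: 4 (vintage) + 1 (type) + 4 (schema) + 1 (identifier).
    if len(dguid) < 10 or not dguid[:4].isdigit() or not dguid[5:9].isdigit():
        raise ValueError(f"not a valid DGUID: {dguid!r}")
    return ParsedDGUID(
        vintage=int(dguid[:4]),
        type_code=dguid[4],
        schema=dguid[5:9],
        geo_id=dguid[9:],
    )


# "2021A000235" is the 2021 DGUID for Ontario (province code 35).
ontario = parse_dguid("2021A000235")
print(ontario)
```

The parts are kept as strings, apart from the vintage, so that leading zeros in the schema and identifier survive a round trip.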

## Data Sources and References

This package provides access to official Statistics Canada data:

- [Web Data Service (WDS) API](https://www.statcan.gc.ca/en/developers/wds)
- [Census Geographic Attribute Files](https://www12.statcan.gc.ca/census-recensement/2021/geo/aip-pia/attribute-attribs/index-eng.cfm)
- [Standard Geographical Classification (SGC)](https://www.statcan.gc.ca/en/subjects/standard/sgc/2021/index)

## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License
This project is licensed under the GNU General Public License v3.0 or later (GPL-3.0-or-later); see the LICENSE file for details.
