Metadata-Version: 2.4
Name: buildingid
Version: 2.2.1
Summary: Unique Building Identifier (UBID)
Author: Mark Borkum, Nicholas Long, Katherine Fleming, Alex Swindler
Author-email: Mark Borkum <mark.borkum@pnnl.gov>, Nicholas Long <nicholas.long@nlr.gov>, Katherine Fleming <katherine.fleming@nlr.gov>, Alex Swindler <alex.swindler@nlr.gov>
License-Expression: BSD-2-Clause
License-File: LICENSE.txt
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: GIS
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: click>=8.3.0
Requires-Dist: click-log>=0.4.0
Requires-Dist: openlocationcode>=1.0.0
Requires-Dist: pandas>=2.3.3,<3
Requires-Dist: pyqtree>=1.0.0
Requires-Dist: shapely>=2.0.0
Requires-Dist: tldm==1.1.0
Requires-Python: >=3.11
Description-Content-Type: text/x-rst

=================================
Unique Building Identifier (UBID)
=================================

.. image:: https://github.com/SEED-platform/buildingid/actions/workflows/test.yml/badge.svg
   :target: https://github.com/SEED-platform/buildingid/actions/workflows/test.yml
   :alt: Build Status

**Website:** https://www.pnnl.gov/unique-building-identification

-------------
Documentation
-------------

Install
=======

To complete this guide, `Git <https://git-scm.com/>`_, `Python 3 <https://www.python.org/>`_ (3.11 or later), and `uv <https://docs.astral.sh/uv/>`_ are required.
Dependencies are installed automatically by ``uv sync``.

Clone the repository, and then install the ``buildingid`` package:

::

  git clone git@github.com:SEED-platform/buildingid.git
  cd buildingid
  uv sync

Test
====

Test the ``buildingid`` package using the `pytest <https://docs.pytest.org/>`_ package:

::

  pytest tests/

Coverage testing is enabled using the `pytest-cov <https://pytest-cov.readthedocs.io/>`_ plugin:

::

  pip3 install pytest-cov
  pytest --cov=buildingid --cov-report=html tests/

Usage
=====

The ``buildingid`` package supports two usages:

* Application programming interface (API)
* Command-line interface (CLI; the ``buildingid`` command)

The API
```````

* ``buildingid.code``

  - ``Code``

  - ``CodeArea``

    + ``area() -> float``

    + ``encode() -> Code``

    + ``intersection(CodeArea) -> typing.Optional[typing.Tuple[float, float, float, float]]``

    + ``jaccard(CodeArea) -> typing.Optional[float]``

    + ``resize() -> CodeArea``

  - ``decode(Code) -> CodeArea``

  - ``encode(float, float, float, float, float, float, **kwargs) -> Code``

  - ``isValid(Code) -> bool``

**Note:** The ``Optional`` and ``Tuple`` type hints are provided by the `typing <https://docs.python.org/3/library/typing.html>`_ module.
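
To make the return types of ``intersection`` and ``jaccard`` concrete, the following standalone sketch computes the same kind of quantities for two axis-aligned rectangles.
It does not use the ``buildingid`` package: the ``Box`` tuple layout, the function names, and the choice to return ``None`` for rectangles that do not overlap are assumptions made for illustration, not a description of the package's implementation.

::

  from typing import Optional, Tuple

  # Hypothetical layout for this sketch only: (south, west, north, east).
  Box = Tuple[float, float, float, float]


  def box_area(box: Box) -> float:
      """Area of a rectangle in squared coordinate units."""
      south, west, north, east = box
      return (north - south) * (east - west)


  def box_intersection(a: Box, b: Box) -> Optional[Box]:
      """Overlapping rectangle of a and b, or None if they do not overlap."""
      south = max(a[0], b[0])
      west = max(a[1], b[1])
      north = min(a[2], b[2])
      east = min(a[3], b[3])
      if south >= north or west >= east:
          return None
      return (south, west, north, east)


  def box_jaccard(a: Box, b: Box) -> Optional[float]:
      """Jaccard index: intersection area divided by union area."""
      overlap = box_intersection(a, b)
      if overlap is None:
          return None
      inter = box_area(overlap)
      union = box_area(a) + box_area(b) - inter
      return inter / union


  if __name__ == '__main__':
      a = (0.0, 0.0, 2.0, 2.0)
      b = (1.0, 1.0, 3.0, 3.0)
      print(box_intersection(a, b))
      print(box_jaccard(a, b))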

In the following example, a UBID code is decoded and then re-encoded:

::

  #!/usr/bin/env python3
  # -*- coding: utf-8 -*-

  import buildingid.code

  if __name__ == '__main__':
    # Initialize UBID code.
    orig_code = '849VQJH6+95J-51-58-42-50'
    print(orig_code)

    # Decode UBID code.
    orig_code_area = buildingid.code.decode(orig_code)
    print(orig_code_area)

    # Resize resulting UBID code area.
    #
    # The effect of this operation is that the length and width of the UBID code
    # area are reduced by half an OLC code area.
    new_code_area = orig_code_area.resize()
    print(new_code_area)

    # Encode new UBID code area.
    new_code = new_code_area.encode()
    print(new_code)

    # Test that new UBID code and UBID code area match the originals.
    assert orig_code == new_code
    assert orig_code_area == new_code_area
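
The code string used in the example above consists of an OLC part followed by four hyphen-separated integers.
The following sketch, which does not depend on the ``buildingid`` package, splits such a string and counts the digits in its OLC part.
The helper names ``split_ubid`` and ``olc_digit_count`` are hypothetical, and because this document does not state what the four integers mean, the sketch keeps them as an opaque list.

::

  from typing import List, NamedTuple


  class UbidParts(NamedTuple):
      olc: str             # OLC part, e.g. '849VQJH6+95J'
      extents: List[int]   # trailing integers, kept uninterpreted


  def split_ubid(code: str) -> UbidParts:
      """Split a UBID code string on hyphens into OLC part and integers."""
      olc, *rest = code.split('-')
      return UbidParts(olc=olc, extents=[int(part) for part in rest])


  def olc_digit_count(olc: str) -> int:
      """Count OLC digits; the '+' separator is not a digit."""
      return len(olc.replace('+', ''))


  if __name__ == '__main__':
      parts = split_ubid('849VQJH6+95J-51-58-42-50')
      print(parts.olc)                   # 849VQJH6+95J
      print(parts.extents)               # [51, 58, 42, 50]
      print(olc_digit_count(parts.olc))  # 11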

The CLI
```````

View the documentation for the ``buildingid`` command using the ``--help`` command-line option:

::

  buildingid --help
  #=> Usage: buildingid [OPTIONS] COMMAND [ARGS]...
  #=> <<more lines of output>>

View the documentation for a sub-command of the ``buildingid`` command using the ``--help`` command-line option.
For example, to view the documentation for the "append2csv" sub-command:

::

  buildingid append2csv --help
  #=> Usage: buildingid append2csv [OPTIONS] [latlng|wkb|wkt]
  #=> <<more lines of output>>

Commands
^^^^^^^^

+---------------------+--------------------------------------------------------+
| Command name        | Description                                            |
+=====================+========================================================+
| append2csv          | Read CSV file from stdin, append UBID field, and write |
|                     | CSV file to stdout.                                    |
+---------------------+--------------------------------------------------------+
| crossref            | Read two CSV files, cross-reference UBID fields, and   |
|                     | write CSV file.                                        |
+---------------------+--------------------------------------------------------+

---------
Tutorials
---------

Instructions in this section use `Bash <https://www.gnu.org/software/bash/>`_ syntax.

Append UBID field to CSV file
=============================

Prerequisites
`````````````

1. ``buildingid`` command is installed.

   * Verify installation:

     - ``buildingid --version``

       + Expected output: "buildingid, version 2.0.0" (or higher version)

Step-by-step instructions
`````````````````````````

1. Locate input CSV file, e.g., ``path/to/in.csv``.

2. Locate output CSV file (generated), e.g., ``path/to/out.csv``.

3. Locate errors CSV file (generated), e.g., ``path/to/err.csv``.

4. Identify number of digits in `Open Location Code (OLC) <https://plus.codes/>`_ part of UBID code string, e.g., 11.

5. Identify column of output CSV file that contains UBID code strings, e.g., "UBID".

6. If input CSV file contains latitude and longitude coordinates for a centroid only:

   1. Identify columns of input CSV file that contain latitude and longitude coordinates, e.g., "Latitude" and "Longitude".

   2. Assign UBIDs:

      * ``buildingid append2csv latlng --code-length=11 --fieldname-code="UBID" --fieldname-center-latitude="Latitude" --fieldname-center-longitude="Longitude" < path/to/in.csv > path/to/out.csv 2> path/to/err.csv``

7. If input CSV file contains latitude and longitude coordinates for (i) a centroid and (ii) the northeast and southwest corners of a bounding box:

   1. Identify columns of input CSV file that contain latitude and longitude coordinates, e.g., "Latitude_C", "Longitude_C", "Latitude_N", "Longitude_E", "Latitude_S", and "Longitude_W".

   2. Assign UBIDs:

      * ``buildingid append2csv latlng --code-length=11 --fieldname-code="UBID" --fieldname-center-latitude="Latitude_C" --fieldname-center-longitude="Longitude_C" --fieldname-north-latitude="Latitude_N" --fieldname-east-longitude="Longitude_E" --fieldname-south-latitude="Latitude_S" --fieldname-west-longitude="Longitude_W" < path/to/in.csv > path/to/out.csv 2> path/to/err.csv``

8. If input CSV file contains hex-encoded `well-known binary (WKB) <https://www.iso.org/standard/60343.html>`_ strings:

   1. Identify column of input CSV file that contains hex-encoded WKB strings, e.g., "WKB".

   2. Assign UBIDs:

      * ``buildingid append2csv wkb --code-length=11 --fieldname-code="UBID" --fieldname-wkbstr="WKB" < path/to/in.csv > path/to/out.csv 2> path/to/err.csv``

9. If input CSV file contains `well-known text (WKT) <https://www.iso.org/standard/60343.html>`_ strings:

   1. Identify column of input CSV file that contains WKT strings, e.g., "WKT".

   2. Assign UBIDs:

      * ``buildingid append2csv wkt --code-length=11 --fieldname-code="UBID" --fieldname-wktstr="WKT" < path/to/in.csv > path/to/out.csv 2> path/to/err.csv``
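
The shell redirections used in the steps above can also be reproduced from Python.
The following is a minimal sketch for the centroid-only case (step 6), using only the standard library and the command-line options shown in that step.
The function names ``append2csv_latlng_args`` and ``run_append2csv`` are hypothetical helpers, not part of the ``buildingid`` package.

::

  import os
  import shutil
  import subprocess
  from typing import List


  def append2csv_latlng_args(code_length: int, code_field: str,
                             lat_field: str, lng_field: str) -> List[str]:
      """Build the argument list for the centroid-only command."""
      return [
          'buildingid', 'append2csv', 'latlng',
          f'--code-length={code_length}',
          f'--fieldname-code={code_field}',
          f'--fieldname-center-latitude={lat_field}',
          f'--fieldname-center-longitude={lng_field}',
      ]


  def run_append2csv(args: List[str], in_path: str, out_path: str,
                     err_path: str) -> int:
      """Run args with stdin, stdout, and stderr bound to the three files."""
      with open(in_path, 'rb') as stdin, \
              open(out_path, 'wb') as stdout, \
              open(err_path, 'wb') as stderr:
          completed = subprocess.run(args, stdin=stdin, stdout=stdout,
                                     stderr=stderr, check=False)
      return completed.returncode


  if __name__ == '__main__':
      args = append2csv_latlng_args(11, 'UBID', 'Latitude', 'Longitude')
      print(' '.join(args))
      # Only attempt a real run when the command and the input file exist.
      if shutil.which('buildingid') and os.path.exists('path/to/in.csv'):
          run_append2csv(args, 'path/to/in.csv', 'path/to/out.csv',
                         'path/to/err.csv')

Passing the arguments as a list avoids shell quoting issues when column names contain spaces.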

Notes
`````

See ``buildingid append2csv --help`` for full help.

Cross-reference UBID fields in two CSV files
============================================

Prerequisites
`````````````

1. ``buildingid`` command is installed.

   * Verify installation:

     - ``buildingid --version``

       + Expected output: "buildingid, version 2.0.0" (or higher version)

Step-by-step instructions
`````````````````````````

1. Locate left input CSV file, e.g., ``path/to/left.csv``.

2. Locate right input CSV file, e.g., ``path/to/right.csv``.

3. Locate output CSV file (generated), e.g., ``path/to/out.csv``.

4. Identify column of left input CSV file that contains UBID code strings, e.g., "UBID".

5. Identify column of right input CSV file that contains UBID code strings, e.g., "UBID".

6. Cross-reference UBIDs:

   * ``buildingid crossref path/to/left.csv path/to/right.csv path/to/out.csv --left-fieldname-code="UBID" --right-fieldname-code="UBID"``
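
Before running the command above, it can be useful to confirm that each input file really contains the column identified in steps 4 and 5.
The following sketch reads only the header row of a CSV file with the standard library; ``missing_columns`` is a hypothetical helper, not part of the ``buildingid`` package.

::

  import csv
  from typing import List


  def missing_columns(path: str, required: List[str]) -> List[str]:
      """Return the names in required that are absent from the CSV header."""
      with open(path, newline='') as f:
          header = next(csv.reader(f), [])
      return [name for name in required if name not in header]


  if __name__ == '__main__':
      import os
      for path in ('path/to/left.csv', 'path/to/right.csv'):
          if os.path.exists(path):
              print(path, missing_columns(path, ['UBID']))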

Notes
`````

See ``buildingid crossref --help`` for full help.

Default behavior is for output CSV file to be many-to-many (i.e., many records in left input CSV file are cross-referenced with many records in right input CSV file).
Use ``--left-group-by-jaccard`` and ``--right-group-by-jaccard`` options for one-to-many and many-to-one, respectively.

Default behavior is for output CSV file to include only columns that contain UBID code strings.
Use ``--include-left-field`` and ``--include-right-field`` options to include other columns.

Convert from Esri shapefile to CSV file
=======================================

Prerequisites
`````````````

1. `Geospatial Data Abstraction Library (GDAL) <https://www.gdal.org/>`_ is installed.

   * Verify installation:

     - ``ogr2ogr --version``

       + Expected output: "GDAL 2.3.1, released 2018/06/22" (version and release date may vary)

Step-by-step instructions
`````````````````````````

1. Locate input Esri shapefile, e.g., ``path/to/in.shp``.

2. Locate input PRJ file, e.g., ``path/to/in.prj``.

3. Locate output CSV file (generated), e.g., ``path/to/out.csv``.

4. Convert input Esri shapefile into output CSV file:

   * ``ogr2ogr -s_srs "$(cat path/to/in.prj)" -t_srs "EPSG:4326" -f CSV path/to/out.csv path/to/in.shp -lco GEOMETRY=AS_WKT``

Notes
`````

See ``ogr2ogr --long-usage`` for full help.

Output CSV file has added "WKT" column whose elements are `well-known text (WKT) <https://www.iso.org/standard/60343.html>`_ strings; enabled by ``-lco GEOMETRY=AS_WKT`` option.

Coordinate reference system for geographic coordinates in output CSV file is `WGS84 <https://epsg.io/4326>`_; enabled by ``-t_srs "EPSG:4326"`` option.

Records in input Esri shapefile are converted into rows in output CSV file, where fields in input Esri shapefile are converted into columns in output CSV file.

Shapes in input Esri shapefile are converted into elements of "WKT" column of output CSV file.
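
To inspect the resulting "WKT" column, the following sketch reads a CSV document and reports the bounding box of each shape.
It is a simplified illustration that uses only the standard library: the sample row is made up, ``wkt_bounds`` is a hypothetical helper, and the regular expression handles only plain two-dimensional decimal coordinates.
For real data, a WKT parser such as ``shapely.wkt.loads`` is the more robust choice.

::

  import csv
  import io
  import re
  from typing import Tuple

  # Matches "x y" pairs of plain decimal numbers (no exponents, no z values).
  _NUMBER_PAIR = re.compile(r'(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)')


  def wkt_bounds(wkt: str) -> Tuple[float, float, float, float]:
      """Return (min_x, min_y, max_x, max_y) over all coordinate pairs."""
      pairs = [(float(x), float(y)) for x, y in _NUMBER_PAIR.findall(wkt)]
      if not pairs:
          raise ValueError('no coordinates found in WKT string')
      xs = [x for x, _ in pairs]
      ys = [y for _, y in pairs]
      return (min(xs), min(ys), max(xs), max(ys))


  if __name__ == '__main__':
      # Made-up sample; a real file would come from the ogr2ogr step.
      sample = 'WKT,name\n"POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))",example\n'
      for row in csv.DictReader(io.StringIO(sample)):
          print(row['name'], wkt_bounds(row['WKT']))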

------------
Case Studies
------------

Chicago, IL
===========

`The City of Chicago's open data portal <https://data.cityofchicago.org>`_ hosts the `"Building Footprints (current)" <https://data.cityofchicago.org/Buildings/Building-Footprints-current-/hz9b-7nh8>`_ dataset in CSV format; available at: https://data.cityofchicago.org/api/views/syp8-uezg/rows.csv?accessType=DOWNLOAD.

The "the_geom" column of the input CSV file contains WKT strings.

To assign UBIDs to the records in the input CSV file:

1. ``buildingid append2csv wkt --code-length=11 --fieldname-code="UBID" --fieldname-wktstr="the_geom" < rows.csv > rows.out.csv 2> rows.err.csv``

San Jose, CA
============

The `City of San Jose <http://www.sanjoseca.gov>`_ hosts `datasets <http://www.sanjoseca.gov/index.aspx?NID=3308>`_ that include building footprints and land parcels.

The `"Basemap_2" <http://www.sanjoseca.gov/DocumentCenter/View/44895>`_ zip archive includes a building footprints dataset in Esri shapefile format.
The coordinate system is `NAD 1983 StatePlane California III FIPS 0403 Feet <http://www.spatialreference.org/ref/esri/102643/>`_.

To convert the Esri shapefile into CSV format and then assign UBIDs to the resulting CSV file:

1. ``ogr2ogr -s_srs "$(cat Basemap2_201905021152225992/BuildingFootprint.prj)" -t_srs "EPSG:4326" -f CSV BuildingFootprint.csv Basemap2_201905021152225992/BuildingFootprint.shp -lco GEOMETRY=AS_WKT``

2. ``buildingid append2csv wkt --code-length=11 --fieldname-code="UBID" --fieldname-wktstr="WKT" < BuildingFootprint.csv > BuildingFootprint.out.csv 2> BuildingFootprint.err.csv``

-------
License
-------

`The 2-Clause BSD License <https://opensource.org/licenses/BSD-2-Clause>`_

-------------
Contributions
-------------

Contributions are accepted on `GitHub <https://github.com/>`_ via the fork and pull request workflow.
See the `GitHub pull request documentation <https://help.github.com/articles/using-pull-requests/>`_ for more information.
