Metadata-Version: 2.4
Name: earthforge
Version: 1.0.0
Summary: Cloud-native geospatial developer toolkit
Project-URL: Homepage, https://github.com/chrislyonsKY/earthForge
Project-URL: Repository, https://github.com/chrislyonsKY/earthForge
Project-URL: Documentation, https://chrislyonsky.github.io/earthForge/
Project-URL: Issues, https://github.com/chrislyonsKY/earthForge/issues
Author: EarthForge Contributors
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: cli,cloud-native,cog,geoparquet,geospatial,stac
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: GIS
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: earthforge-core>=0.1.0
Provides-Extra: all
Requires-Dist: earthforge-cli>=0.1.0; extra == 'all'
Requires-Dist: earthforge-cube>=0.1.0; extra == 'all'
Requires-Dist: earthforge-raster>=0.1.0; extra == 'all'
Requires-Dist: earthforge-stac>=0.1.0; extra == 'all'
Requires-Dist: earthforge-vector>=0.1.0; extra == 'all'
Provides-Extra: cli
Requires-Dist: earthforge-cli>=0.1.0; extra == 'cli'
Provides-Extra: cube
Requires-Dist: earthforge-cube>=0.1.0; extra == 'cube'
Provides-Extra: dev
Requires-Dist: coverage>=7.0; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: raster
Requires-Dist: earthforge-raster>=0.1.0; extra == 'raster'
Provides-Extra: stac
Requires-Dist: earthforge-stac>=0.1.0; extra == 'stac'
Provides-Extra: vector
Requires-Dist: earthforge-vector>=0.1.0; extra == 'vector'
Description-Content-Type: text/markdown

# EarthForge

![EarthForge Banner](branding/earthforge-banner.png)

[![License: GPL v3](https://img.shields.io/badge/license-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0.html)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![CI](https://img.shields.io/github/actions/workflow/status/chrislyonsKY/earthForge/ci.yml?branch=main&label=CI)](https://github.com/chrislyonsKY/earthForge/actions)
[![PyPI](https://img.shields.io/pypi/v/earthForge.svg)](https://pypi.org/project/earthForge/) 
[![Hatch](https://img.shields.io/badge/build-hatch-4051b5.svg)](https://hatch.pypa.io/)


Working with cloud-native geospatial data means juggling `gdalinfo` for COGs, `stac-client` for discovery, `geopandas` for GeoParquet, `xarray` for Zarr, and a collection of one-off scripts to glue them together. Each tool has its own CLI conventions, its own output format, and its own assumptions about how you authenticate to cloud storage.

EarthForge is a single composable toolkit that unifies these workflows. One CLI. One config system. One output contract. Every command works locally, against S3, GCS, or Azure — and every command produces both human-readable tables and machine-parseable JSON.

```bash
# Inspect any cloud-native geospatial file — format auto-detected
earthforge info s3://bucket/image.tif
earthforge info buildings.parquet
earthforge info climate.zarr

# Search STAC catalogs
earthforge stac search sentinel-2-l2a --bbox -85,37,-84,38 --datetime 2025-06/2025-09

# Generate a quicklook preview from a remote COG without downloading it
earthforge raster preview s3://bucket/scene.tif -o preview.png

# Convert legacy formats to cloud-native
earthforge vector convert buildings.shp --to geoparquet
earthforge raster convert image.tif --to cog

# Query GeoParquet with spatial predicate pushdown
earthforge vector query buildings.parquet --bbox -85,37,-84,38

# Inspect and slice Zarr datacubes
earthforge cube info s3://era5-pds/zarr/2025/01/data/air_temperature_at_2_metres.zarr
earthforge cube slice s3://era5-pds/zarr/ --var t2m --bbox -85,37,-84,38 --time 2025-06/2025-06 -o ky_june.zarr

# Pipe structured JSON into other tools
earthforge stac search sentinel-2-l2a -o json | jq '.items[].assets.B04.href'
```

## What EarthForge Is

EarthForge is a **library-first, CLI-first developer toolkit**. Install it as a Python library and call functions directly, or use the CLI from shell scripts and pipelines. Every CLI command is a thin wrapper around a library function, so anything you can do from the terminal you can also do from Python, a Jupyter notebook, or a pipeline runner.

```python
import asyncio

from earthforge.raster.info import inspect_raster
from earthforge.stac.search import search_catalog

async def main() -> None:
    # Library usage: same logic as the CLI, no subprocess needed
    items = await search_catalog("sentinel-2-l2a", bbox=(-85, 37, -84, 38))
    metadata = await inspect_raster("s3://bucket/scene.tif")

asyncio.run(main())  # in a notebook, await the calls directly instead
```

## Real-World Output

The samples below are actual outputs from EarthForge commands run against public geospatial data. Sample files live in [`data/samples/`](data/samples/).

### KyFromAbove 3-inch Orthoimagery — fetched thumbnail

```bash
earthforge stac fetch \
  https://spved5ihrl.execute-api.us-west-2.amazonaws.com/collections/orthos-phase3/items/N097E305_2024_Season1_3IN_cog \
  --assets thumbnail --output-dir data/kyfromabove_fetch
# → 78,026 bytes in 2.34s
```

![KyFromAbove 3-inch orthoimagery thumbnail — rural central Kentucky](data/samples/kyfromabove_preview.png)

*3-inch orthoimagery, KyFromAbove Phase 3 (2024). Public domain. Full COG available at `kyfromabove.s3.us-west-2.amazonaws.com`.*

---

### Sentinel-2 STAC Search — `--output json`

```bash
earthforge stac search sentinel-2-l2a \
  --bbox -85,37,-84,38 --datetime 2025-06/2025-09 --max-items 5 \
  --output json
```

```json
{
  "collection": "sentinel-2-l2a",
  "matched": 47,
  "returned": 5,
  "elapsed_seconds": 1.243,
  "items": [
    {
      "id": "S2A_18SYJ_20250914_0_L2A",
      "datetime": "2025-09-14T16:28:43Z",
      "properties": { "eo:cloud_cover": 4.2, "platform": "sentinel-2a" }
    }
  ]
}
```

Full sample: [`data/samples/stac_search.json`](data/samples/stac_search.json)
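
Because the JSON output is plain structured data, it can be consumed from Python as easily as from `jq`. The following is a minimal sketch, assuming only the abbreviated shape shown above (`items`, `properties`, `eo:cloud_cover`). The `clearest_items` helper and the 10 percent threshold are illustrative and not part of EarthForge:

```python
import json

# Abbreviated payload in the shape shown above.
payload = """
{
  "collection": "sentinel-2-l2a",
  "matched": 47,
  "returned": 5,
  "items": [
    {
      "id": "S2A_18SYJ_20250914_0_L2A",
      "datetime": "2025-09-14T16:28:43Z",
      "properties": {"eo:cloud_cover": 4.2, "platform": "sentinel-2a"}
    }
  ]
}
"""


def cloud_cover(item: dict) -> float:
    """Cloud cover percentage; items without the field are treated as fully cloudy."""
    return item["properties"].get("eo:cloud_cover", 100.0)


def clearest_items(result: dict, max_cloud: float = 10.0) -> list[str]:
    """Return IDs of items at or below max_cloud percent, clearest first."""
    kept = [item for item in result["items"] if cloud_cover(item) <= max_cloud]
    kept.sort(key=cloud_cover)
    return [item["id"] for item in kept]


result = json.loads(payload)
print(clearest_items(result))
```

The same function works on the output of `earthforge stac search ... --output json` read from a file or from stdin, provided the real payload keeps this shape.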

---

### COG Metadata — `earthforge raster info`

```bash
earthforge raster info \
  https://sentinel-cogs.s3.us-west-2.amazonaws.com/.../B04.tif \
  --output json
```

```json
{
  "format": "COG",
  "width": 10980, "height": 10980,
  "crs": "EPSG:32618",
  "is_tiled": true, "tile_width": 512, "tile_height": 512,
  "overview_count": 6,
  "compression": "deflate"
}
```

Full sample: [`data/samples/raster_info.json`](data/samples/raster_info.json)

---

### GeoParquet Metadata — `earthforge vector info`

```bash
earthforge vector info ky_wildlife_management_areas.parquet --output json
```

```json
{
  "format": "geoparquet",
  "row_count": 83,
  "geometry_types": ["MultiPolygon"],
  "crs": "EPSG:4326",
  "bbox": [-89.57, 36.49, -81.96, 39.15],
  "compression": "SNAPPY",
  "file_size_bytes": 142863
}
```

Full sample: [`data/samples/vector_info.json`](data/samples/vector_info.json)

---

## Output Gallery

All images below were generated from real-world data using EarthForge example scripts. No synthetic or simulated data. Each output includes a `.txt` sidecar with alt text, data provenance, and generation metadata. See [`examples/outputs/`](examples/outputs/) for full details.

### Grand Canyon — DEM with Hillshade + Cross-Section

SRTM 30m elevation data via OpenTopography API. Shows 1,844m of relief from river to rim with elevation cross-section profile.

![Elevation map of the Grand Canyon from SRTM 30m DEM with cividis palette and hillshade overlay. Top panel shows terrain from 728m at river level to 2572m at the rim. A high-contrast orange cross-section line with black outline is drawn at the midpoint. Bottom panel shows the east-west elevation profile revealing the canyon's V-shaped depth.](examples/outputs/opentopo_grand_canyon_dem.png)

### Swiss Alps — Matterhorn/Zermatt Elevation Analysis

Copernicus DEM 30m via OpenTopography. Elevations from 1,868m (valley) to 4,330m (peaks) with statistics sidebar.

![Elevation map of the Swiss Alps near the Matterhorn and Zermatt from Copernicus DEM 30m. Viridis palette shows valleys at 1868m in dark purple and alpine peaks at 4330m in bright yellow. Hillshade overlay reveals glacial valleys and ridgelines. Statistics sidebar lists min, max, mean, median, and standard deviation.](examples/outputs/opentopo_swiss_alps_dem.png)

### Colorado Front Range — Sentinel-2 NDVI

Vegetation gradient from plains to alpine tundra, showing elevation-driven ecology. BrBG colorblind-safe diverging palette.

![NDVI map of the Colorado Front Range from Boulder to Rocky Mountain National Park. Brown-white-teal BrBG diverging palette shows urban and bare areas in brown, transitional zones in white, and dense montane forest in teal. The elevation-driven vegetation gradient is clearly visible from east (plains) to west (alpine).](examples/outputs/ndvi_colorado_front_range.png)

### Netherlands — Urban/Water/Vegetation NDVI

Sentinel-2 scene over Rotterdam/Delft showing water (NDVI < 0), urban (low NDVI), and agricultural areas (high NDVI).

![NDVI map of Rotterdam and Delft in the Netherlands showing three distinct land cover classes: water bodies with negative NDVI in brown, urban areas with low NDVI in light brown, and agricultural fields with high NDVI in teal. BrBG diverging palette. Water covers 7.8 percent, urban 18.9 percent, vegetation 63.3 percent of the scene.](examples/outputs/ndvi_netherlands_rotterdam.png)

### Amazon Rainforest — Tropical NDVI

Sentinel-2 scene near Manaus, Brazil showing dense tropical forest canopy with uniformly high NDVI.

![NDVI map of the Amazon rainforest near Manaus, Brazil from Sentinel-2 imagery. The dense tropical canopy shows uniformly high NDVI values (mean 0.42) in teal and dark teal. River channels and cleared areas appear in brown. BrBG colorblind-safe diverging palette.](examples/outputs/ndvi_amazon_manaus.png)

### Copernicus DEM — Elevation Statistics + Histogram

Raster statistics computed from a Copernicus DEM 30m tile with elevation distribution histogram. Viridis colorblind-safe palette.

![Elevation histogram and summary statistics from a Copernicus DEM 30m tile. Left panel shows the distribution of elevation values from 94m to 377m with viridis-colored bars. Right panel lists summary statistics: min 94m, max 377m, mean 216m, median 221m, std 43m, computed from 12.96 million valid pixels.](examples/outputs/raster_stats_dem_histogram.png)

### Yellowstone — Landsat STAC Search Footprints

Landsat Collection 2 Level-2 scene footprints from Earth Search, color-coded by cloud cover percentage.

![Map of 40 Landsat Collection 2 Level-2 scene footprints over Yellowstone National Park from a STAC search. Footprints are rectangles colored by cloud cover percentage using a reversed viridis palette where bright colors indicate low cloud cover and dark colors indicate high cloud cover.](examples/outputs/stac_landsat_yellowstone.png)

### Yosemite — Multi-Collection STAC Query

Two-panel figure querying both Sentinel-2 scenes and Copernicus DEM tiles from a single STAC API.

![Two-panel figure showing a multi-collection STAC query over Yosemite National Park. Left panel displays Sentinel-2 scene footprints colored by cloud cover. Right panel shows a Copernicus DEM elevation map with hillshade, elevations from 543m to 3547m in viridis palette.](examples/outputs/stac_multi_collection_yosemite.png)

### STAC-to-NDVI Pipeline

Complete pipeline workflow: STAC search, range-read Sentinel-2 bands, NDVI computation via safe expression evaluator, rendered output with pipeline summary.

![NDVI map produced by an automated STAC-to-NDVI pipeline workflow. Left panel shows the computed NDVI using BrBG diverging palette with brown for bare areas and teal for vegetation. Right panel lists the 4-step pipeline: STAC search, range-read B04 and B08 bands, NDVI computation, and render output, with NDVI statistics.](examples/outputs/pipeline_ndvi_output.png)

### Format Detection Matrix

EarthForge's three-stage format detection chain (magic bytes, extension, content inspection) tested across 12 geospatial file formats.

![Table showing EarthForge format detection results for 12 geospatial file types. Each row shows the file type, extension, detected format, detection method, and pass or miss status. Seven of twelve formats are correctly detected via magic bytes. Results use ColorBrewer Set2 palette with teal for PASS and orange for MISS.](examples/outputs/format_detection_matrix.png)
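
The first two stages of a detection chain like this can be sketched in a few lines. The example below is an illustration only: it assumes well-known leading signatures (TIFF and BigTIFF byte-order marks, the Parquet `PAR1` magic, the FlatGeobuf `fgb` prefix, the LAS `LASF` signature) and a hypothetical extension map. It omits the content-inspection stage and is not EarthForge's implementation:

```python
from pathlib import PurePosixPath

# Well-known leading signatures, checked in order.
MAGIC_PREFIXES: list[tuple[bytes, str]] = [
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"II+\x00", "tiff"),  # little-endian BigTIFF
    (b"MM\x00+", "tiff"),  # big-endian BigTIFF
    (b"PAR1", "parquet"),
    (b"fgb", "flatgeobuf"),
    (b"LASF", "las"),
]

# Hypothetical fallback map, used when the header is missing or unrecognized.
EXTENSIONS: dict[str, str] = {
    ".tif": "tiff",
    ".tiff": "tiff",
    ".parquet": "parquet",
    ".fgb": "flatgeobuf",
    ".zarr": "zarr",
    ".laz": "las",
}


def detect_format(name: str, header: bytes = b"") -> tuple[str, str]:
    """Return (format, method), trying magic bytes first, then the extension."""
    for prefix, fmt in MAGIC_PREFIXES:
        if header.startswith(prefix):
            return fmt, "magic"
    suffix = PurePosixPath(name).suffix.lower()
    if suffix in EXTENSIONS:
        return EXTENSIONS[suffix], "extension"
    return "unknown", "none"


print(detect_format("scene.bin", b"II*\x00\x08\x00\x00\x00"))
print(detect_format("climate.zarr"))
```

Magic bytes identify only the container family. Telling a COG from a plain TIFF, or GeoParquet from plain Parquet, requires opening the file and checking its internal layout or metadata, which is the role of a content-inspection stage.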

---

## What EarthForge Is Not

EarthForge is not a platform. It does not include a web server, a tile cache, a database, an ML pipeline, or a Kubernetes deployment. It is not a replacement for QGIS, ArcGIS, or Google Earth Engine. It does not try to be everything — it is a focused set of tools that integrate with existing workflows via structured output, stdin/stdout piping, and Python imports.

If you need a tile server, use [TiTiler](https://developmentseed.org/titiler/). If you need a STAC API, use [stac-fastapi](https://github.com/stac-utils/stac-fastapi). If you need a geospatial database, use PostGIS. EarthForge is the CLI toolkit you reach for alongside those tools, not instead of them.

## Install

```bash
# Full toolkit (quotes stop shells such as zsh from globbing the brackets)
pip install "earthforge[all]"

# Just what you need
pip install "earthforge[stac]"      # STAC discovery only
pip install "earthforge[raster]"    # COG operations only
pip install "earthforge[vector]"    # GeoParquet operations only
pip install "earthforge[cube]"      # Zarr datacube operations only
pip install "earthforge[cli]"       # CLI framework only
```

## Cloud Storage

EarthForge uses named profiles for cloud storage authentication, similar to AWS CLI profiles:

```bash
# Initialize config
earthforge config init

# Search with a specific profile
earthforge stac search sentinel-2-l2a --profile planetary
```

Profiles are defined in `~/.earthforge/config.toml`:

```toml
[profiles.default]
stac_api = "https://earth-search.aws.element84.com/v1"
storage = "s3"

[profiles.planetary]
stac_api = "https://planetarycomputer.microsoft.com/api/stac/v1"
storage = "azure"
```

## Architecture

EarthForge is built as a monorepo with independently installable workspace packages. The architecture is documented in detail — not as an afterthought, but as the foundation the implementation is built on.

- **[ARCHITECTURE.md](ARCHITECTURE.md)** — System design, dependency graph, module interfaces
- **[ai-dev/decisions/](ai-dev/decisions/)** — Architectural decision records with alternatives considered and tradeoffs acknowledged
- **[ai-dev/spec.md](ai-dev/spec.md)** — Requirements and acceptance criteria per milestone

Key architectural decisions:

| Decision | Record | Summary |
|---|---|---|
| Monorepo structure | [DL-001](ai-dev/decisions/DL-001-monorepo.md) | Single repo with Hatch workspace packages, not 15 separate repos |
| Async-first I/O | [DL-002](ai-dev/decisions/DL-002-async-first-io.md) | All network I/O is async via httpx; sync wrappers for convenience |
| obstore for storage | [DL-003](ai-dev/decisions/DL-003-storage-abstraction.md) | Rust-backed S3/GCS/Azure abstraction over fsspec |
| Rust extension boundary | [DL-005](ai-dev/decisions/DL-005-rust-boundary.md) | Rust for format detection and range reads; Python for everything else |
| Engineering credibility | [DL-006](ai-dev/decisions/DL-006-engineering-credibility.md) | Nothing ships empty; decisions before code; scope boundaries enforced |
| promptfoo evaluation | [DL-007](ai-dev/decisions/DL-007-promptfoo-eval.md) | Agent prompts and guardrails regression-tested in CI via promptfoo |
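
The async-first pattern summarized in DL-002 can be illustrated with a small sketch: the core function is a coroutine, and a synchronous wrapper drives it with `asyncio.run` for callers that are not already inside an event loop. The names `fetch_metadata`, `fetch_metadata_sync`, and `fetch_many` are hypothetical, and `asyncio.sleep` stands in for a real httpx request:

```python
import asyncio


async def fetch_metadata(url: str) -> dict:
    """Hypothetical async core; asyncio.sleep stands in for network I/O."""
    await asyncio.sleep(0.01)
    return {"url": url, "status": "ok"}


def fetch_metadata_sync(url: str) -> dict:
    """Blocking convenience wrapper around the async core."""
    return asyncio.run(fetch_metadata(url))


async def fetch_many(urls: list[str]) -> list[dict]:
    """Async callers can overlap requests instead of running them in sequence."""
    return await asyncio.gather(*(fetch_metadata(u) for u in urls))


print(fetch_metadata_sync("s3://bucket/scene.tif"))
print(len(asyncio.run(fetch_many(["a.tif", "b.tif", "c.tif"]))))
```

`asyncio.run` raises `RuntimeError` when called from a running event loop, so code in a Jupyter notebook or another async context would `await` the coroutine directly rather than use the sync wrapper.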

## Formats

| Format | Support | Operations |
|---|---|---|
| COG (Cloud Optimized GeoTIFF) | Full | info, validate, convert, preview, band math, tile |
| GeoParquet | Full | info, validate, convert, query, clip, tile |
| Zarr | Full | info, validate, convert, slice, stats |
| FlatGeobuf | Read/Write | info, validate, convert |
| COPC (Cloud Optimized Point Cloud) | Info | info |
| STAC (SpatioTemporal Asset Catalog) | Full | search, info, validate, fetch, publish |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md). EarthForge has specific engineering standards — please read the contribution guide before opening a PR.

## Code of Conduct

See [CODE_OF_CONDUCT](CODE_OF_CONDUCT)

## Security

See [SECURITY](SECURITY)

## License

GNU General Public License v3.0 or later (GPL-3.0-or-later). See [LICENSE](LICENSE).
