Metadata-Version: 2.4
Name: xzarrguard
Version: 0.1.2
Summary: Integrity checks and creation helpers for Zarr v3 stores
Project-URL: Homepage, https://github.com/j-haacker/xzarrguard
Project-URL: Repository, https://github.com/j-haacker/xzarrguard
Project-URL: Issues, https://github.com/j-haacker/xzarrguard/issues
Project-URL: Documentation, https://j-haacker.github.io/xzarrguard/
Author-email: Jan Haacker <152862650+j-haacker@users.noreply.github.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.12
Requires-Dist: fsspec>=2024.9.0
Requires-Dist: xarray>=2024.9.0
Requires-Dist: zarr>=3.0.0
Provides-Extra: dev
Requires-Dist: build>=1.2.2; extra == 'dev'
Requires-Dist: pre-commit>=4.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.12.0; extra == 'dev'
Requires-Dist: tox>=4.0.0; extra == 'dev'
Requires-Dist: twine>=5.1.1; extra == 'dev'
Requires-Dist: zensical; extra == 'dev'
Provides-Extra: s3
Requires-Dist: s3fs>=2024.9.0; extra == 's3'
Description-Content-Type: text/markdown

# xzarrguard

`xzarrguard` resolves the ambiguity that arises when a missing chunk file is read as `NaN`: the reader cannot tell whether the chunk intentionally holds no data or was never written. It provides a concise API and a CLI to validate the completeness of Zarr v3 stores, create local stores with an explicit no-data policy, and convert between manifest and materialized no-data representations.
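
To make the ambiguity concrete, here is a minimal sketch in plain Python. It uses neither Zarr nor xzarrguard: the dict-based `store`, `read_chunk`, `classify`, and the `allowed_missing` set are hypothetical stand-ins for a chunk store, a reader that substitutes the fill value, and a no-data manifest.

```python
import math

FILL_VALUE = math.nan  # value a reader substitutes for an absent chunk


def read_chunk(store: dict[str, list[float]], key: str, chunk_len: int = 2) -> list[float]:
    """Return the stored chunk, or a chunk of fill values if the key is absent."""
    return store.get(key, [FILL_VALUE] * chunk_len)


def classify(store: dict[str, list[float]], key: str, allowed_missing: set[str]) -> str:
    """Tell intentional no-data apart from loss using an explicit manifest (hypothetical)."""
    if key in store:
        return "present"
    return "no-data" if key in allowed_missing else "lost"


# "c/1" intentionally holds no data; "c/2" went missing by accident.
store = {"c/0": [1.0, 2.0]}
allowed_missing = {"c/1"}

# Without a manifest, both absent chunks read back as all-NaN.
assert all(math.isnan(v) for v in read_chunk(store, "c/1"))
assert all(math.isnan(v) for v in read_chunk(store, "c/2"))

# With a manifest, the two cases can be told apart.
statuses = {key: classify(store, key, allowed_missing) for key in ("c/0", "c/1", "c/2")}
print(statuses)
```

The point of the sketch is only that the information needed to separate the two cases has to live outside the chunk data itself.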

## Install

**PyPI**: `pip install xzarrguard`  
**PyPI + S3 support**: `pip install "xzarrguard[s3]"`  
**conda**: `conda install -c conda-forge xzarrguard`  
**from source**: `pip install .`  

## Install-free CLI usage

**uv**: `uvx xzarrguard check /path/to/store.zarr`  
**pixi**: `pixi exec xzarrguard check /path/to/store.zarr`

`check` reads remote stores through fsspec backends. For S3-compatible stores:

```bash
xzarrguard check "s3://example-bucket/path/to/store.zarr" \
  --profile example-profile \
  --endpoint-url "https://object-store.example.com"
```

## API quickstart

```python
from xzarrguard import check_store, create_store

report = check_store("store.zarr")
if report:
    print("store is complete")
```

```python
remote_report = check_store(
    "s3://example-bucket/path/to/store.zarr",
    storage_options={
        "profile": "example-profile",
        "client_kwargs": {"endpoint_url": "https://object-store.example.com"},
    },
)
```

```python
# `dataset` is assumed to be an xarray.Dataset that is already in scope
create_store(
    dataset,
    "store.zarr",
    no_data_chunks={"temperature": [(0, 0)]},
    no_data_strategy="manifest",
)
```

Write and guard in one step (wrapper around `.to_zarr()`):

```python
from xzarrguard import guarded_to_zarr

guarded_to_zarr(dataset, "store.zarr")
```

Recommended distributed-write workflow:

1. Use upstream `xarray.Dataset.to_zarr(..., write_empty_chunks=True)` during the distributed write phase so workers materialize chunk keys deterministically.
2. Finalize with `xzarrguard` conversion to derive compact manifests from no-data chunks.

```python
from xzarrguard import convert_store

convert_store("store.zarr", direction="auto")
```
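
Conceptually, going from a materialized representation to a manifest means recording which chunks contain nothing but no-data. The sketch below illustrates that idea on plain Python data and is not xzarrguard's implementation: `materialized_to_manifest` is a hypothetical helper, and treating "all values are NaN" as the no-data test is an assumption.

```python
import math


def materialized_to_manifest(
    chunks: dict[tuple[int, ...], list[float]],
) -> tuple[dict[tuple[int, ...], list[float]], list[tuple[int, ...]]]:
    """Split chunks into (chunks that carry data, indices of all-NaN chunks)."""
    kept: dict[tuple[int, ...], list[float]] = {}
    manifest: list[tuple[int, ...]] = []
    for index, values in sorted(chunks.items()):
        if all(math.isnan(v) for v in values):
            manifest.append(index)  # nothing but no-data: list it in the manifest
        else:
            kept[index] = values  # at least one real value: keep the chunk
    return kept, manifest


chunks = {
    (0, 0): [math.nan, math.nan],  # materialized no-data chunk
    (0, 1): [1.0, math.nan],       # partly filled, still real data
    (1, 0): [3.0, 4.0],
}
kept, manifest = materialized_to_manifest(chunks)
print(manifest)  # [(0, 0)]
```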

In-place metadata-only guard update (no chunk rewrite):

```python
create_store(
    None,
    "store.zarr",
    no_data_chunks={"temperature": [(0, 0)]},
    in_place_metadata_only=True,
)
```

Treat the current store as the baseline and derive the allowed-missing chunks from
whatever is missing right now:

```python
create_store(
    None,
    "store.zarr",
    in_place_metadata_only=True,
    infer_no_data_from_store=True,
)
```
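
The idea behind inferring no-data from the store can be sketched as a set difference: enumerate every chunk index the array shape implies, then subtract the chunks that exist. This is an illustration only, not xzarrguard's code; `expected_chunks` and `infer_allowed_missing` are hypothetical names, and a regular chunk grid is assumed.

```python
from itertools import product
from math import ceil


def expected_chunks(shape: tuple[int, ...], chunk_shape: tuple[int, ...]) -> set[tuple[int, ...]]:
    """All chunk indices of a regular grid covering `shape`."""
    counts = [ceil(n / c) for n, c in zip(shape, chunk_shape)]
    return set(product(*(range(k) for k in counts)))


def infer_allowed_missing(
    shape: tuple[int, ...],
    chunk_shape: tuple[int, ...],
    present: set[tuple[int, ...]],
) -> list[tuple[int, ...]]:
    """Chunk indices that the grid expects but the store does not contain."""
    return sorted(expected_chunks(shape, chunk_shape) - set(present))


# A 4x6 array with 2x3 chunks has a 2x2 chunk grid; chunk (0, 0) is absent.
present = {(0, 1), (1, 0), (1, 1)}
print(infer_allowed_missing((4, 6), (2, 3), present))  # [(0, 0)]
```

Because this treats everything currently missing as intentional, it only makes sense on a store you already trust to be complete.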

## CLI quickstart

```bash
xzarrguard check store.zarr
xzarrguard check "s3://example-bucket/path/to/store.zarr" --profile example-profile --endpoint-url "https://object-store.example.com"
xzarrguard create source.zarr target.zarr --no-data no_data.json
xzarrguard create store.zarr --in-place-metadata-only --no-data no_data.json
xzarrguard create store.zarr --in-place-metadata-only --infer-no-data-from-store
xzarrguard convert store.zarr
xzarrguard convert store.zarr --direction manifest_to_materialized
```

Note: you may see `ZarrUserWarning: Object at .xzarrguard is not recognized as a component of a Zarr hierarchy.` when tooling walks the store hierarchy. This is expected: `.xzarrguard/` is xzarrguard sidecar metadata, not a Zarr array/group node.
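
If the warning is noisy in your own scripts, one option is a narrow filter from the standard `warnings` module. This is a sketch, not something xzarrguard ships: `silence_sidecar_warning` is a hypothetical helper, and the message pattern assumes the warning text matches the one quoted above.

```python
import warnings


def silence_sidecar_warning() -> None:
    """Ignore only the hierarchy warning that mentions the .xzarrguard sidecar."""
    warnings.filterwarnings(
        "ignore",
        # Matched against the start of the warning message; `.*` allows a path prefix.
        message=r"Object at .*\.xzarrguard is not recognized",
    )


silence_sidecar_warning()
```

Filtering on the message rather than on the warning category leaves other Zarr warnings visible.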

## Coverage

```bash
pytest
```

`pytest` prints a coverage summary to the terminal and writes `coverage.xml`.

## Documentation

https://j-haacker.github.io/xzarrguard/

To preview or build the docs locally:

```bash
zensical serve
zensical build --clean
```

## Channels

- PyPI: https://pypi.org/project/xzarrguard/
- Conda-forge: https://anaconda.org/conda-forge/xzarrguard
- GitHub: https://github.com/j-haacker/xzarrguard

## Release (maintainers)

```bash
# bump src/xzarrguard/_version.py first
python -m build
python -m twine check dist/*
python -m twine upload dist/*
```

Use a PyPI API token for upload (for example `TWINE_USERNAME=__token__`).
For conda-forge, update `recipe/recipe.yaml` after the PyPI release (fixed version + PyPI sdist URL + sha256), then submit a recipe/feedstock PR.

Acknowledgement: Initial scaffolding and implementation assistance by OpenAI Codex.
