Metadata-Version: 2.4
Name: netrias_client
Version: 0.0.1
Summary: Python client for the Netrias harmonization API
Project-URL: Homepage, https://github.com/netrias/netrias_client
Project-URL: Repository, https://github.com/netrias/netrias_client
Project-URL: Documentation, https://github.com/netrias/netrias_client#readme
Author-email: Chris Harman <charman@netrias.com>
License: MIT License
        
        Copyright (c) 2025 Netrias
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: api,cde,client,harmonization,netrias
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: boto3
Requires-Dist: httpx
Provides-Extra: dev
Requires-Dist: basedpyright; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Requires-Dist: python-dotenv>=1.0; extra == 'dev'
Requires-Dist: ruff>=0.5.0; extra == 'dev'
Requires-Dist: twine>=5.0; extra == 'dev'
Requires-Dist: ty; extra == 'dev'
Requires-Dist: typing-extensions; extra == 'dev'
Description-Content-Type: text/markdown

# Netrias Client

Python toolkit for working with the Netrias recommendation and harmonization services. The client wraps the HTTP APIs with strong typing, logging, and guard rails so analytics code can focus on describing data rather than orchestrating requests.

## Highlights
- **Stateful client facade** – instantiate `NetriasClient` and call `client.configure(...)` once.
- **Column discovery helpers** – derive column samples from CSV files, invoke the recommendation service, and normalize responses into `MappingDiscoveryResult` models.
- **Adapter utilities** – convert discovery output into harmonization-ready manifest payloads while applying confidence filters and CDE overrides.
- **Asynchronous harmonization loop** – submit jobs, poll for completion, download results, and version output files automatically to avoid accidental overwrites.
- **Extended timing logs** – discovery and harmonization emit duration metrics so you can spot slow calls quickly during live runs.

## Installation

The project requires Python 3.10 or newer.

```bash
pip install netrias_client

# optional AWS helpers (gateway bypass)
pip install netrias_client[aws]
```

We recommend managing environments with [uv](https://github.com/astral-sh/uv):

```bash
# create or update a project that depends on netrias_client
uv add netrias_client

# install optional AWS helpers (gateway bypass)
uv add netrias_client[aws]
```

For local development within this repository:

```bash
uv sync --group dev              # install development tooling
uv sync --group aws --group dev  # include optional AWS dependencies
```

## Configuration

All client entry points require explicit configuration. Create a `NetriasClient`, then provide the API key; discovery and harmonization endpoints remain fixed by the library.

```python
from pathlib import Path

from netrias_client import NetriasClient
from netrias_client._models import LogLevel

client = NetriasClient()
client.configure(
    api_key="<netrias api key>",
    # Optional overrides:
    timeout=21600.0,               # seconds (default: 6 hours)
    log_level=LogLevel.INFO,
    confidence_threshold=0.80,     # discovery adapter filter, 0.0–1.0
    discovery_use_gateway_bypass=True,  # toggle Lambda bypass (default: True)
    log_directory=Path("logs/netrias"),  # optional per-client log files
)
```

Configuration errors raise `ClientConfigurationError`. Calling `configure` again replaces the active settings snapshot and reinitializes the dedicated logger (refreshing file handlers when `log_directory` is supplied).
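
As a rough illustration of the kinds of checks that can surface as a configuration error, here is a minimal, self-contained sketch. It is not the library's actual validation logic: `validate_settings` is a hypothetical helper, and the `ClientConfigurationError` class below is a local stand-in defined only so the snippet runs on its own. The 0.0–1.0 range for `confidence_threshold` comes from the example above; the other checks are assumptions.

```python
class ClientConfigurationError(ValueError):
    """Local stand-in for the library's exception so this sketch is runnable alone."""


def validate_settings(
    api_key: str,
    timeout: float = 21600.0,
    confidence_threshold: float = 0.80,
) -> dict[str, object]:
    """Hypothetical validator: reject unusable settings before any request is made."""
    if not api_key or not api_key.strip():
        raise ClientConfigurationError("api_key must be a non-empty string")
    if timeout <= 0:
        raise ClientConfigurationError("timeout must be a positive number of seconds")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ClientConfigurationError("confidence_threshold must be between 0.0 and 1.0")
    # Return a normalized snapshot, mirroring the idea of a replaceable settings object.
    return {
        "api_key": api_key.strip(),
        "timeout": float(timeout),
        "confidence_threshold": float(confidence_threshold),
    }
```

Failing early like this keeps a bad threshold or an empty key from turning into a confusing error hours into a long-running job.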

## End-to-End Workflow

The typical harmonization flow contains two steps:

```python
from pathlib import Path

from netrias_client import NetriasClient

client = NetriasClient()
client.configure(api_key="<netrias api key>")

csv_path = Path("/path/to/source.csv")
schema = "ccdi"

# 1. Ask the recommendation service for potential targets.
manifest_payload = client.discover_mapping_from_csv(
    source_csv=csv_path,
    target_schema=schema,
)

# 2. Kick off harmonization directly with the manifest payload.
result = client.harmonize(source_path=csv_path, manifest=manifest_payload)
print(result.status)
print(result.description)
print(result.file_path)
```

- `client.discover_mapping_from_csv(...)` samples up to 25 values per column (configurable), calls the API, and returns a manifest-ready payload (including static metadata such as CDE routes/IDs where configured).
- `client.harmonize(...)` submits a job and polls `GET /v1/jobs/{jobId}` until the backend returns success or failure. Downloaded CSVs are written next to the source file (versioned as `data.harmonized.v1.csv`, etc.). Pass `manifest_output_path=` if you also want to persist the manifest JSON for inspection.
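
The two behaviors described above, per-column sampling and versioned output names, can be sketched with the standard library alone. This is an illustrative approximation, not the client's implementation: `sample_columns` and `next_versioned_path` are hypothetical names, and details such as skipping empty cells, keeping duplicates, and starting the version counter at `v1` are assumptions.

```python
import csv
from pathlib import Path


def sample_columns(csv_path: Path, max_values: int = 25) -> dict[str, list[str]]:
    """Collect up to `max_values` non-empty values per column, in file order."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        samples: dict[str, list[str]] = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name, bucket in samples.items():
                value = (row.get(name) or "").strip()
                if value and len(bucket) < max_values:
                    bucket.append(value)
            if all(len(bucket) >= max_values for bucket in samples.values()):
                break  # every column is full, so stop reading early
    return samples


def next_versioned_path(source: Path) -> Path:
    """Return the first `<stem>.harmonized.vN.csv` beside `source` that does not exist yet."""
    version = 1
    while True:
        candidate = source.with_name(f"{source.stem}.harmonized.v{version}.csv")
        if not candidate.exists():
            return candidate
        version += 1
```

Probing for the first unused version number is what prevents a second run from overwriting the output of the first.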

### Timing Logs

Both discovery and harmonization log the elapsed time in seconds when an operation completes and when it fails with a timeout or transport error. Sample output:

```
INFO netrias_client: discover mapping start: schema=ccdi columns=12
INFO netrias_client: discover mapping complete: schema=ccdi suggestions=0 duration=47.12s
INFO netrias_client: harmonize start: file=data.csv
INFO netrias_client: harmonize finished: file=data.csv status=succeeded duration=182.45s
```

Use these metrics to separate slow API responses from downstream processing overhead.
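
If you want comparable timings around your own downstream steps, a small context manager is enough. The sketch below is an assumption-laden illustration rather than the client's logging code: `log_duration` and the logger name `netrias_client.example` are hypothetical, and the message format only imitates the sample output above.

```python
import logging
import time
from collections.abc import Iterator
from contextlib import contextmanager

logger = logging.getLogger("netrias_client.example")  # hypothetical logger name


@contextmanager
def log_duration(label: str, **fields: object) -> Iterator[None]:
    """Log a start line, then a completion or failure line with elapsed seconds."""
    details = " ".join(f"{key}={value}" for key, value in fields.items())
    logger.info("%s start: %s", label, details)
    started = time.perf_counter()
    try:
        yield
    except Exception:
        # Failures still report how long the operation ran before giving up.
        logger.error("%s failed: %s duration=%.2fs", label, details, time.perf_counter() - started)
        raise
    else:
        logger.info("%s complete: %s duration=%.2fs", label, details, time.perf_counter() - started)
```

Wrapping a post-processing step in `with log_duration("postprocess", file="data.csv"):` gives a duration that can be compared directly with the client's own log lines.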

## Adapter Notes

Discovery results are normalized to manifest payloads automatically; unmatched columns are logged so you can expand coverage. Confidence thresholds come from `configure(confidence_threshold=...)` and default to 0.8.
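
To make the filtering idea concrete, here is a minimal sketch of threshold-based selection. It does not reproduce the adapter's real data model: the `Suggestion` dataclass and `select_mappings` function are hypothetical, and treating the threshold as inclusive and keeping only the highest-confidence target per column are both assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical shape for one discovery suggestion."""

    column: str
    target: str
    confidence: float


def select_mappings(
    suggestions: list[Suggestion], threshold: float = 0.8
) -> tuple[dict[str, str], list[str]]:
    """Keep the best suggestion per column at or above `threshold`; list the rest as unmatched."""
    best: dict[str, Suggestion] = {}
    columns: list[str] = []  # preserves first-seen column order
    for item in suggestions:
        if item.column not in columns:
            columns.append(item.column)
        current = best.get(item.column)
        if item.confidence >= threshold and (current is None or item.confidence > current.confidence):
            best[item.column] = item
    mapping = {column: best[column].target for column in columns if column in best}
    unmatched = [column for column in columns if column not in best]
    return mapping, unmatched
```

Returning the unmatched columns alongside the mapping mirrors the logging behavior described above: anything below the threshold stays visible instead of being dropped silently.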

## Gateway Bypass (Temporary)

The module `netrias_client._gateway_bypass` exposes `invoke_cde_recommendation_alias(...)`, a stopgap helper that calls the `cde-recommendation` Lambda alias directly. This avoids API Gateway’s short timeout window but requires AWS credentials with `lambda:InvokeFunction` permission and the `boto3` dependency.

```python
from netrias_client._gateway_bypass import invoke_cde_recommendation_alias

result = invoke_cde_recommendation_alias(
    target_schema="ccdi",
    columns={"study_name": ["foo", "bar"]},
    alias="prod",
    region_name="us-east-2",
)
```

Make sure `boto3` is installed (directly, or via `netrias_client[aws]` if that extra is provided) before importing the bypass module, and rotate IAM credentials frequently. Once API Gateway limits are raised, prefer the standard discovery flow again.

## Testing & Tooling

The repository ships with pytest-based integration tests plus lint/type tooling.

```bash
uv run pytest
uv run ruff check
uv run basedpyright
uv build                 # produce wheel + sdist
```

Live verification scripts are located under `live_test/` and require a populated `.env` file containing `NETRIAS_API_KEY` (and optionally harmonization overrides while services converge).
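
A live script can fail fast when the key is missing instead of erroring partway through a run. The helper below is a hypothetical sketch, not code from `live_test/`: `require_api_key` is an invented name, and it assumes the `.env` contents have already been loaded into the process environment (for example by `python-dotenv`, which is part of the dev tooling).

```python
import os
from collections.abc import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return NETRIAS_API_KEY from `env`, raising a clear error when it is missing or blank."""
    api_key = env.get("NETRIAS_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError(
            "NETRIAS_API_KEY is not set; add it to .env or export it before running live tests"
        )
    return api_key
```

Accepting the environment as a parameter keeps the helper easy to exercise in unit tests without touching real credentials.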

## Project Layout

```
src/netrias_client/
    __init__.py          # re-exported public surface
    _adapter.py          # discovery → manifest conversion
    _client.py           # NetriasClient facade and state management
    _config.py           # settings validation helpers
    _core.py             # harmonization workflow
    _discovery.py        # discovery wrappers and CSV sampling
    _errors.py           # exception taxonomy
    _http.py             # HTTP primitives (submit/poll/download)
    _io.py               # streaming helpers
    _logging.py          # standardized logger setup
    _models.py           # dataclasses for structured responses
    _validators.py       # filesystem and payload validation
```

Tests reside under `src/netrias_client/tests/` and are excluded from the published wheel to keep installs slim; run them locally via `uv run pytest`.

## Contributing

1. `uv sync --group dev` (add `--group aws` if needed) to create the virtual environment.
2. `uv run pytest` to ensure the suite passes prior to committing.
3. Follow the repo conventions: keep functions focused, prefer typed interfaces, and favor logging key transitions over verbose chatter.

Pull requests should include updated documentation or fixtures when they alter API behavior or the manifest contract.
