Metadata-Version: 2.4
Name: mldatafind
Version: 0.1.8
Summary: Luigi/Law Tasks for streamlining gravitational wave data discovery
Author-email: Ethan Marx <emarx@mit.edu>
License-Expression: MIT
License-File: LICENSE
Requires-Python: <3.13,>=3.10
Requires-Dist: boto3<2,>=1.34.4
Requires-Dist: cloudpathlib<0.19,>=0.18.1
Requires-Dist: gwpy>=3.0.12
Requires-Dist: htgettoken>=2.2
Requires-Dist: law<0.2,>=0.1.19
Requires-Dist: luigi<4,>=3.5.1
Description-Content-Type: text/markdown

# mldatafind
[`Law`](https://github.com/riga/law) workflows for streamlining gravitational wave data discovery for ML applications.

## Example
To run the [example configuration](./example.cfg), first build the container at your desired location:

```console
export CONTAINER_PATH=/path/to/mldatafind.sif
apptainer build $CONTAINER_PATH apptainer.def
```
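Before launching any tasks, it can help to confirm that the build actually produced an image at `$CONTAINER_PATH`. The following is a minimal sketch of such a check; `check_container` is a hypothetical helper and is not part of mldatafind or apptainer. It only tests that the path is set and points at a non-empty file, not that the image is valid.

```shell
# Hypothetical helper (not part of mldatafind or apptainer):
# succeed only if the given path is non-empty and names a non-empty file.
check_container() {
    image="${1:-}"
    if [ -z "$image" ]; then
        echo "CONTAINER_PATH is not set" >&2
        return 1
    fi
    if [ ! -s "$image" ]; then
        echo "no container image found at $image" >&2
        return 1
    fi
}
```

For example, `check_container "$CONTAINER_PATH" && echo "container ready"` prints the message only when the image file exists.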

Next, run the `Fetch` task, which queries science segments and strain data, using local resources:

```console
LAW_CONFIG_FILE=./example.cfg uv run law run mldatafind.law.tasks.Fetch --workflow local --local-scheduler --sandbox mldatafind::$CONTAINER_PATH
```

If you're on a machine with HTCondor access, such as the LDG, the `Fetch` task can also run on condor resources by setting `--workflow htcondor`:

```console
LAW_CONFIG_FILE=./example.cfg uv run law run mldatafind.law.tasks.Fetch --workflow htcondor --local-scheduler --sandbox mldatafind::$CONTAINER_PATH
```

Condor log files will be stored under the directory given by the `condor_directory` argument of the `Fetch` task.
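
The local and HTCondor invocations above differ only in the value passed to `--workflow`. As a rough sketch of how to make that explicit, the hypothetical helper below (`fetch_cmd`, not part of mldatafind) prints the full command for a given workflow so it can be inspected or pasted into a terminal. It uses only the flags shown above and assumes `CONTAINER_PATH` is already exported.

```shell
# Hypothetical helper (not part of mldatafind): print the Fetch command
# for the chosen workflow without executing it.
fetch_cmd() {
    # First argument selects the workflow: "local" (default) or "htcondor".
    workflow="${1:-local}"
    # Abort with a message if CONTAINER_PATH has not been set.
    image="${CONTAINER_PATH:?CONTAINER_PATH must be set}"
    printf '%s\n' "LAW_CONFIG_FILE=./example.cfg uv run law run mldatafind.law.tasks.Fetch --workflow $workflow --local-scheduler --sandbox mldatafind::$image"
}
```

Calling `fetch_cmd htcondor` prints the HTCondor variant, while `fetch_cmd` with no argument prints the local one.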
