Metadata-Version: 2.1
Name: sophios
Version: 0.1.0
Summary: DSL for inferring the edges of a CWL workflow DAG
Author-email: Jake Fennick <jake.fennick@axleinfo.com>
Project-URL: Homepage, https://github.com/PolusAI/workflow-inference-compiler
Project-URL: Bug Tracker, https://github.com/PolusAI/workflow-inference-compiler/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: graphviz
Requires-Dist: jsonschema<4.18
Requires-Dist: pyyaml
Requires-Dist: requests
Requires-Dist: mergedeep
Requires-Dist: networkx
Requires-Dist: cwl-utils>=0.32
Requires-Dist: typeguard
Requires-Dist: pydantic>=2.6
Requires-Dist: docker
Requires-Dist: podman
Requires-Dist: toil[cwl]
Provides-Extra: test
Requires-Dist: pre-commit; extra == "test"
Requires-Dist: py; extra == "test"
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: pytest-parallel; extra == "test"
Requires-Dist: coverage; extra == "test"
Requires-Dist: mypy; extra == "test"
Requires-Dist: numpy; extra == "test"
Requires-Dist: scipy; extra == "test"
Requires-Dist: pylint; extra == "test"
Requires-Dist: autopep8; extra == "test"
Requires-Dist: hypothesis; extra == "test"
Requires-Dist: hypothesis-jsonschema; extra == "test"
Provides-Extra: mypy-types
Requires-Dist: lxml-stubs; extra == "mypy-types"
Requires-Dist: types-Pillow; extra == "mypy-types"
Requires-Dist: types-PyYAML; extra == "mypy-types"
Requires-Dist: types-Pygments; extra == "mypy-types"
Requires-Dist: types-colorama; extra == "mypy-types"
Requires-Dist: types-decorator; extra == "mypy-types"
Requires-Dist: types-docutils; extra == "mypy-types"
Requires-Dist: types-html5lib; extra == "mypy-types"
Requires-Dist: types-jsonschema; extra == "mypy-types"
Requires-Dist: types-psutil; extra == "mypy-types"
Requires-Dist: types-python-jose; extra == "mypy-types"
Requires-Dist: types-pytz; extra == "mypy-types"
Requires-Dist: types-redis; extra == "mypy-types"
Requires-Dist: types-requests; extra == "mypy-types"
Requires-Dist: types-setuptools; extra == "mypy-types"
Requires-Dist: types-six; extra == "mypy-types"
Requires-Dist: types-urllib3; extra == "mypy-types"
Provides-Extra: runners
Requires-Dist: toil[cwl]; extra == "runners"
Requires-Dist: cwl-utils; extra == "runners"
Provides-Extra: doc
Requires-Dist: sphinx; extra == "doc"
Requires-Dist: myst-parser; extra == "doc"
Requires-Dist: sphinx-autodoc-typehints; extra == "doc"
Provides-Extra: plots
Requires-Dist: matplotlib; extra == "plots"
Provides-Extra: cyto
Requires-Dist: ipycytoscape; extra == "cyto"
Provides-Extra: all-except-runner-src
Requires-Dist: sophios[cyto,doc,mypy-types,plots,test]; extra == "all-except-runner-src"

# Workflow Inference Compiler

[![doc-build-status](https://readthedocs.org/projects/workflow-inference-compiler/badge/?version=latest)](https://workflow-inference-compiler.readthedocs.io/en/latest/)

Scientific computing can be difficult in practice because of complex software issues. In particular, chaining software packages together into a computational pipeline is very error prone. Using the [Common Workflow Language](https://www.commonwl.org) (CWL) helps greatly, but as with many other workflow languages, users still need to explicitly specify how to connect inputs and outputs. The Workflow Inference Compiler lets users specify computational protocols at a very high level of abstraction: it automatically infers almost all connections between inputs and outputs, and it compiles to CWL for execution.

## Documentation

The documentation is available on [readthedocs](https://workflow-inference-compiler.readthedocs.io/en/latest/).

## Example Workflows

The following repositories contain example workflows:

[Molecular Modeling Workflows](https://github.com/PolusAI/mm-workflows)

[Image Workflows](https://github.com/PolusAI/image-workflows)

Like CWL, the compiler is general purpose and is not limited to any specific domain.
You do not need to install these to use wic. They are completely optional.

(But obviously if you're just getting started and you don't have any workflows of your own, you probably want to install at least one of them.)

## Quick Start

See the [installation guide](docs/installguide.md) for more details, but:

For pip users:

`pip install wic` # Please read the next sentence

Unlike conda, **pip cannot install the binary system dependencies needed to actually run most workflows!**

If you want to actually run workflows, you (or your sysadmin) will have to manually install and configure additional software!

For conda users / developers:

See the [installation guide for developers](docs/dev/installguide.md)

```
wic --yaml ../workflow-inference-compiler/docs/tutorials/helloworld.wic --graphviz --run_local --quiet
```

The Workflow Inference Compiler is a [Domain Specific Language](https://en.wikipedia.org/wiki/Domain-specific_language) (DSL) based on the [Common Workflow Language](https://www.commonwl.org). CWL is fantastic, but explicitly constructing the Directed Acyclic Graph (DAG) associated with a non-trivial workflow is not so simple. Instead of writing raw CWL, you can write your workflows in a much simpler yml DSL. For technical reasons edge inference is far from unique, so ***`users should always check that edge inference actually produces the intended DAG`***.

## Edge Inference

The key feature is that in most cases, you do not need to specify any of the edges! They will be automatically inferred for you based on types, file formats, and naming conventions. For more information, see the [user guide](docs/userguide.md#edge-inference-algorithm). If for some reason edge inference fails, there is a syntax for creating [explicit edges](docs/userguide.md#explicit-edges).
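
As a rough illustration of matching on types and file formats, here is a minimal Python sketch. It is not the compiler's actual algorithm (the user guide describes that). The `Port`, `Step`, `compatible`, and `infer_edges` names, the "nearest earlier step wins" rule, and the `pdb`/`gro` format labels are all assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Port:
    """Hypothetical input or output of a step."""
    name: str
    type: str
    format: Optional[str] = None


@dataclass
class Step:
    """Hypothetical workflow step with typed inputs and outputs."""
    name: str
    inputs: List[Port] = field(default_factory=list)
    outputs: List[Port] = field(default_factory=list)


def compatible(out: Port, inp: Port) -> bool:
    """An output can feed an input if types match and formats do not conflict."""
    if out.type != inp.type:
        return False
    if out.format is not None and inp.format is not None:
        return out.format == inp.format
    return True


def infer_edges(steps: List[Step]) -> Dict[Tuple[str, str], Tuple[str, str]]:
    """Map each (step, input) to the nearest earlier compatible (step, output)."""
    edges: Dict[Tuple[str, str], Tuple[str, str]] = {}
    for i, step in enumerate(steps):
        for inp in step.inputs:
            # Search earlier steps, nearest first (an assumption of this sketch).
            for prev in reversed(steps[:i]):
                match = next((o for o in prev.outputs if compatible(o, inp)), None)
                if match is not None:
                    edges[(step.name, inp.name)] = (prev.name, match.name)
                    break
    return edges


# Made-up three-step pipeline; no edges are written by hand.
steps = [
    Step("download", outputs=[Port("pdb_file", "File", "pdb")]),
    Step("convert",
         inputs=[Port("input_path", "File", "pdb")],
         outputs=[Port("output_path", "File", "gro")]),
    Step("minimize", inputs=[Port("structure", "File", "gro")]),
]
edges = infer_edges(steps)
for (dst_step, dst_port), (src_step, src_port) in edges.items():
    print(f"{src_step}/{src_port} -> {dst_step}/{dst_port}")
```

When several earlier outputs are compatible with one input, this sketch simply picks the nearest one. Any such tie-breaking rule is a choice rather than a necessity, which is one way to see why the inferred DAG should always be checked.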

## Subworkflows

Subworkflows are very useful for creating reusable, composable building blocks. Recursive subworkflows are fully supported, and the edge inference algorithm has been very carefully constructed to work across subworkflow boundaries.

## Explicit CWL

Since the yml DSL files are automatically compiled to CWL, users should not have to know any CWL. However, the yml DSL is secretly CWL that is simply missing almost all of the tags! In other words, the compiler merely adds missing information to the files, and so if you know CWL you are free to explicitly add the information yourself. Thus, the yml DSL is intentionally a [leaky abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction).

## Python API

In addition to the underlying declarative YAML syntax, there is an API for writing WIC workflows in Python. The Python API is philosophically the exact opposite: users should not have to know any CWL, and in fact all CWL features are hidden unless explicitly exposed.
