Metadata-Version: 2.4
Name: confingy
Version: 0.1.2
Summary: An implicit configuration system
Author-email: RunwayML <dev-feedback@runwayml.com>
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: pydantic>=2
Requires-Dist: typer>=0.19.0
Requires-Dist: typing-extensions>=4.6.0
Provides-Extra: viz
Requires-Dist: fastapi>=0.100.0; extra == 'viz'
Requires-Dist: python-multipart>=0.0.6; extra == 'viz'
Requires-Dist: requests>=2.31.0; extra == 'viz'
Requires-Dist: uvicorn>=0.23.0; extra == 'viz'
Description-Content-Type: text/markdown

<p align="center">
  <img src="docs/assets/confingy.svg" alt="confingy logo">
</p>

<div align="center">
  <h1>confingy</h1>
  <p>
    <a href="https://pypi.org/project/confingy/"><img src="https://img.shields.io/pypi/v/confingy" alt="PyPI"></a>
    <a href="https://github.com/runwayml/confingy/actions/workflows/pytest.yml"><img src="https://github.com/runwayml/confingy/actions/workflows/pytest.yml/badge.svg" alt="Tests"></a>
    <a href="https://runwayml.github.io/confingy/"><img src="https://img.shields.io/badge/docs-mkdocs-blue" alt="Docs"></a>
    <a href="https://github.com/runwayml/confingy/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-blue" alt="License"></a>
  </p>
</div>


---


`confingy` is an _implicit configuration system_ for Python. It was built as an attempt to break out of configuration hell by bringing your configuration and code together, at last.

`confingy` is primarily geared towards iterative or experimental work, such as training machine learning (ML) / AI models, and offers three core features:

1. Track the constructor arguments for arbitrary Python classes. No need to define configuration objects or YAML files.
2. Lazily-instantiate any tracked class. Wait until your model is on the cluster before allocating all that memory.
3. Serialize tracked classes to JSON and deserialize back into Python. Reproducibility and lineage are byproducts of tracking.

The above features also apply to standard Python objects like dictionaries and dataclasses, as well as dependency-injected tracked classes, so you can package an entire deep learning job into a single class rather than a mess of YAML files that tell your job runner what to do.

## Installation

Install from PyPI with either `pip` or `uv`:
```bash
uv add confingy
# OR
pip install confingy
```

If you're using mypy, then add the mypy plugin `confingy.mypy_plugin` to whichever type of config file you're using:

pyproject.toml

```toml
[tool.mypy]
plugins = ["confingy.mypy_plugin"]
```

mypy.ini

```ini
[mypy]
plugins = confingy.mypy_plugin
```

## Quick Start

### Tracking

The arguments to any class's constructor may be tracked with `confingy.track`, as long as the arguments themselves are trackable (tracked classes and most stdlib Python objects are supported). The arguments are stored in a private `_tracked_info` attribute.

```python
import random

from confingy import track


@track
class MyDataset:
    def __init__(self, size: int):
        self.size = size
        self.data = [random.random() for _ in range(size)]

    def __getitem__(self, idx: int) -> float:
        return self.data[idx]

size_10 = MyDataset(10)
print(size_10._tracked_info)
# {'class': 'MyDataset', 'module': '__main__', 'init_args': {'size': 10}, 'class_hash': 'f8fa2463f8c366f0292f538903a24ed57968a05c9e36bfccbf7147daa77a65ae'}
size_20 = MyDataset(20)
print(size_20._tracked_info)
# {'class': 'MyDataset', 'module': '__main__', 'init_args': {'size': 20}, 'class_hash': 'f8fa2463f8c366f0292f538903a24ed57968a05c9e36bfccbf7147daa77a65ae'}
```
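To build intuition for what constructor tracking involves, here is a minimal stdlib-only sketch. It is not confingy's implementation: `toy_track`, `ToyDataset`, and the `_toy_tracked_info` attribute are hypothetical names chosen for illustration, and the real decorator also validates arguments and computes a class hash.

```python
import functools
import inspect


def toy_track(cls):
    """Toy stand-in for a tracking decorator (NOT confingy's implementation)."""
    original_init = cls.__init__
    signature = inspect.signature(original_init)
    # Name of the instance parameter (usually "self").
    self_name = next(iter(signature.parameters))

    @functools.wraps(original_init)
    def tracked_init(self, *args, **kwargs):
        # Map positional and keyword arguments onto parameter names,
        # filling in any defaults the caller did not override.
        bound = signature.bind(self, *args, **kwargs)
        bound.apply_defaults()
        init_args = dict(bound.arguments)
        init_args.pop(self_name)
        # Hypothetical attribute name, distinct from confingy's own.
        self._toy_tracked_info = {
            "class": cls.__name__,
            "module": cls.__module__,
            "init_args": init_args,
        }
        original_init(self, *args, **kwargs)

    cls.__init__ = tracked_init
    return cls


@toy_track
class ToyDataset:
    def __init__(self, size: int, shuffle: bool = False):
        self.size = size
        self.shuffle = shuffle


toy = ToyDataset(10)
print(toy._toy_tracked_info["init_args"])
# {'size': 10, 'shuffle': False}
```

Binding against the signature records arguments by name, so positional calls and defaulted parameters end up in the same shape as explicit keyword calls.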

Any object instantiated from a tracked class can be serialized to a standard confingy JSON "fingy":

```python
from confingy import serialize_fingy

print(serialize_fingy(size_10))
# {'_confingy_class': 'MyDataset', '_confingy_module': '__main__', '_confingy_init': {'size': 10}, '_confingy_class_hash': 'f8fa2463f8c366f0292f538903a24ed57968a05c9e36bfccbf7147daa77a65ae'}
```

You can then deserialize your fingy back to Python:

```python
from confingy import deserialize_fingy

print(deserialize_fingy(serialize_fingy(size_10)))
# <__main__.MyDataset object at 0x7f2830a5b8b0>
```
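The serialized keys shown above (`_confingy_class`, `_confingy_module`, `_confingy_init`) carry enough information to rebuild an object. The sketch below shows one way such a dictionary could be resolved back into Python using only `importlib`. It is an assumption-laden illustration, not how `deserialize_fingy` works internally: `toy_deserialize` is a hypothetical helper, the payload is hand-written around a stdlib class, and the class hash and dataclass fields are ignored.

```python
import importlib
from typing import Any


def toy_deserialize(node: Any) -> Any:
    """Rebuild objects from a fingy-shaped structure (illustrative sketch only)."""
    if isinstance(node, list):
        return [toy_deserialize(item) for item in node]
    if isinstance(node, dict):
        if "_confingy_class" in node:
            # Look the class up by module and name, then call it with the
            # recorded constructor arguments, rebuilding nested objects first.
            module = importlib.import_module(node["_confingy_module"])
            cls = getattr(module, node["_confingy_class"])
            kwargs = {k: toy_deserialize(v) for k, v in node["_confingy_init"].items()}
            return cls(**kwargs)
        return {k: toy_deserialize(v) for k, v in node.items()}
    return node


# Hand-written payload in the same shape as the serialized output above,
# pointing at a stdlib class so the example runs anywhere.
payload = {
    "_confingy_class": "timedelta",
    "_confingy_module": "datetime",
    "_confingy_init": {"days": 1, "hours": 2},
}

restored = toy_deserialize(payload)
print(restored)
# 1 day, 2:00:00
```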

To save and load fingys directly to/from JSON files, use `save_fingy` and `load_fingy`:

```python
from confingy import save_fingy, load_fingy

save_fingy(size_10, "my_dataset.json")
loaded = load_fingy("my_dataset.json")
```

### Composability

You can chain, nest, and dependency-inject tracked classes and Python types to gain lineage and reproducibility.

By packaging up your entire job into a dataclass that consists of classes that take other classes as arguments (a la dependency injection), you end up with a graph of sorts that can be fully serialized and deserialized.

```python
import json
from dataclasses import dataclass

from confingy import track, serialize_fingy


@track
class DataFetcher:
    def __init__(self, start: str, end: str):
        pass

@track
class DataLoader:
    def __init__(self, fetcher: DataFetcher, batch_size: int):
        pass

@track
class Model:
    def __init__(self, hyperparameter: float):
        pass

@track
class Ensemble:
    def __init__(self, models: list[Model]):
        pass

@dataclass
class Job:
    dataloader: DataLoader
    model: Ensemble | Model

job = Job(
    dataloader=DataLoader(DataFetcher("2026-01-01", "2026-01-31"), 32),
    model=Ensemble([Model(1.0), Model(2.0)])
)

serialized = serialize_fingy(job)
print(json.dumps(serialized, indent=2))
```

<details>
<summary>Serialized JSON output</summary>

```json
{
  "_confingy_class": "Job",
  "_confingy_module": "__main__",
  "_confingy_dataclass": true,
  "_confingy_fields": {
    "dataloader": {
      "_confingy_class": "DataLoader",
      "_confingy_module": "__main__",
      "_confingy_init": {
        "fetcher": {
          "_confingy_class": "DataFetcher",
          "_confingy_module": "__main__",
          "_confingy_init": {
            "start": "2026-01-01",
            "end": "2026-01-31"
          },
          "_confingy_class_hash": "cab..."
        },
        "batch_size": 32
      },
      "_confingy_class_hash": "3b3..."
    },
    "model": {
      "_confingy_class": "Ensemble",
      "_confingy_module": "__main__",
      "_confingy_init": {
        "models": [
          {
            "_confingy_class": "Model",
            "_confingy_module": "__main__",
            "_confingy_init": { "hyperparameter": 1.0 },
            "_confingy_class_hash": "24a..."
          },
          {
            "_confingy_class": "Model",
            "_confingy_module": "__main__",
            "_confingy_init": { "hyperparameter": 2.0 },
            "_confingy_class_hash": "24a..."
          }
        ]
      },
      "_confingy_class_hash": "7e8..."
    }
  }
}
```

</details>
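Because the serialized form is plain JSON, ordinary Python can inspect a job's lineage. The following sketch walks a dictionary shaped like the output above and lists each tracked class with its scalar constructor arguments. `iter_tracked` is a hypothetical helper, not part of confingy, and the sample dictionary is an abbreviated hand-copy of the output with the class hashes omitted.

```python
from typing import Any, Iterator


def iter_tracked(node: Any) -> Iterator[tuple[str, dict]]:
    """Yield (class name, scalar init args) for each tracked object found."""
    if isinstance(node, dict):
        if "_confingy_class" in node and "_confingy_init" in node:
            scalars = {
                key: value
                for key, value in node["_confingy_init"].items()
                if not isinstance(value, (dict, list))
            }
            yield node["_confingy_class"], scalars
        # Keep descending: nested tracked objects live inside init args and fields.
        for value in node.values():
            yield from iter_tracked(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_tracked(value)


def tracked(name: str, init: dict) -> dict:
    """Build a fingy-shaped dict for the sample below (hashes omitted)."""
    return {"_confingy_class": name, "_confingy_module": "__main__", "_confingy_init": init}


serialized = {
    "_confingy_class": "Job",
    "_confingy_module": "__main__",
    "_confingy_dataclass": True,
    "_confingy_fields": {
        "dataloader": tracked(
            "DataLoader",
            {
                "fetcher": tracked("DataFetcher", {"start": "2026-01-01", "end": "2026-01-31"}),
                "batch_size": 32,
            },
        ),
        "model": tracked(
            "Ensemble",
            {"models": [tracked("Model", {"hyperparameter": 1.0}), tracked("Model", {"hyperparameter": 2.0})]},
        ),
    },
}

found = list(iter_tracked(serialized))
for class_name, args in found:
    print(class_name, args)
# DataLoader {'batch_size': 32}
# DataFetcher {'start': '2026-01-01', 'end': '2026-01-31'}
# Ensemble {}
# Model {'hyperparameter': 1.0}
# Model {'hyperparameter': 2.0}
```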

### Lazy Instantiation

Sometimes you have large classes that you want to configure without instantiating them at the point where the constructor is called. For example, you may want to create a config for your distributed training job without instantiating the model until you're on the distributed node. For these classes, decorate them with `@track` and then call the `.lazy()` classmethod to construct a lazy object of type `Lazy[T]` for class `T`.

```python
from dataclasses import dataclass

from confingy import Lazy, track

@track
class MyModel:
    def __init__(self, num_layers: int):
        self.num_layers = num_layers
        self.expensive_initialization()

    def expensive_initialization(self):
        print("Creating a bunch of tensors...")


lazy_model = MyModel.lazy(1_000)
print(lazy_model)
# Lazy<MyModel>(lazy, config={'num_layers': 1000})

# Alternatively, you can do the following, even if the class is not wrapped with @track
# lazy_model = confingy.lazy(MyModel)(1_000)

# Inspect the configuration without instantiating
print(lazy_model.get_config())
# {'num_layers': 1000}


# MyDataset is the tracked class defined in the Tracking section above.
@dataclass
class LazyTrainingConfig:
    dataset: MyDataset
    model: Lazy[MyModel]


lazy_training_config = LazyTrainingConfig(
    dataset=MyDataset(100),
    model=lazy_model
)

# Instantiate the model
model = lazy_training_config.model.instantiate()
# Creating a bunch of tensors...

print(model)
# <__main__.MyModel object at 0x105c59b10>
```
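The core idea behind lazy instantiation is small: store the class and its arguments, and call the constructor only on request. The stdlib-only sketch below illustrates that idea. It is not confingy's `Lazy`: `ToyLazy` and `ExpensiveModel` are hypothetical names, and validation, serialization, and config inspection are left out.

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class ToyLazy(Generic[T]):
    """Toy deferred constructor (NOT confingy's Lazy)."""

    def __init__(self, cls: type[T], *args: Any, **kwargs: Any) -> None:
        # Only remember what to build; do not build it yet.
        self._cls = cls
        self._args = args
        self._kwargs = kwargs

    def instantiate(self) -> T:
        # The real constructor runs here, not when ToyLazy was created.
        return self._cls(*self._args, **self._kwargs)


class ExpensiveModel:
    instances_created = 0  # counts how often the constructor has run

    def __init__(self, num_layers: int) -> None:
        ExpensiveModel.instances_created += 1
        self.num_layers = num_layers


lazy_model = ToyLazy(ExpensiveModel, num_layers=1_000)
created_before = ExpensiveModel.instances_created
print(created_before)
# 0

model = lazy_model.instantiate()
print(ExpensiveModel.instances_created, model.num_layers)
# 1 1000
```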

### Validation

A benefit of using `@track` is that you get [Pydantic](https://docs.pydantic.dev/latest/) validation for free. This means that type errors in constructor arguments are caught at runtime, as soon as the object is constructed:

```python
from confingy import track


@track
class Foo:
    def __init__(self, a_string: str):
        self.a_string = a_string

# This raises an error
Foo(1.0)
# ValidationError: Validation failed for Foo:
#   • Field 'a_string': Input should be a valid string (got 1.0)
```

This also works when using `confingy.lazy()`:

```python
from confingy import lazy

# This raises an error
lazy(Foo)(1.0)
```

To validate configuration dataclasses, use [Pydantic dataclasses](https://docs.pydantic.dev/latest/concepts/dataclasses/). Note the use of `arbitrary_types_allowed=True` to support custom classes in the dataclass.

```python
from pydantic.dataclasses import dataclass
from pydantic import ConfigDict

from confingy import track


@track
class Foo:
    def __init__(self, a_string: str):
        self.a_string = a_string


@track
class Bar:
    def __init__(self, an_int: int):
        self.an_int = an_int


@dataclass(config=ConfigDict(arbitrary_types_allowed=True))
class MyConfig:
    foo: Foo
    string_list: list[str]


# This works
MyConfig(Foo("a string"), ["a", "b"])
# This raises a pydantic validation error
MyConfig(Foo("a string"), [1.0, 2.0])
# This also raises a validation error, because a Bar is passed where a Foo is expected
MyConfig(Bar(1), ["a", "b"])
```

## Why?

We always end up building giant configuration objects/files for machine learning projects. We then have to create interfaces for converting the configuration into Python.

For example, you have probably seen this pattern before:

```python
from dataclasses import dataclass
# Function to dynamically load a class based on its string path.
from my_lib import load_class

class Foo:
    def __init__(self, bar: int, baz: str):
        self.bar = bar
        self.baz = baz


@dataclass
class MyConfig:
    config_class: str
    config_kwargs: dict


# No type-hint validation.
config = MyConfig(config_class="Foo", config_kwargs={"bar": 1, "baz": "hello"})

# Both you and the IDE have no idea what type of object `foo` is.
foo = load_class(config.config_class)(**config.config_kwargs)

```

This pattern is painful since we invariably end up having to create interfaces that take in a config object and then figure out how to dynamically instantiate Python classes from that object. Additionally, this pattern often encourages inheritance over composition and pushes us away from dependency injection.

Why do we implement this pattern? Because it allows us to use custom classes from our library, track everything in a reproducible config object, and lazily instantiate classes that may be too costly to instantiate when we're defining our `config`.

Ideally, we could just use Python, avoid interfaces, and keep our IDE happy:

```python
@dataclass
class MyConfig:
    my_obj: Foo


config = MyConfig(my_obj=Foo(1, "baz"))
```

`confingy` aims to do just this, without losing any of the benefits of the prior approach.

## API Quick Reference

| Function / Class | Description |
|------------------|-------------|
| `@track` | Decorator to track constructor arguments |
| `lazy()` | Create a lazy instance of a class |
| `Lazy[T]` | Type hint for lazy objects |
| `lens()` | Convert tracked/lazy objects for deep modifications |
| `serialize_fingy()` | Serialize a fingy to a dict |
| `deserialize_fingy()` | Deserialize a dict back to Python |
| `save_fingy()` | Save a fingy to a JSON file |
| `load_fingy()` | Load a fingy from a JSON file |
| `transpile_fingy()` | Convert serialized fingy to Python code |
| `prettify_serialized_fingy()` | Make serialized fingy human-readable |
| `disable_validation()` | Context manager to skip Pydantic validation |

**Lazy methods:** `.instantiate()`, `.get_config()`, `.copy()`, `.unlens()`

**Exceptions:** `ValidationError`, `SerializationError`, `DeserializationError`

**CLI:** `confingy serialize`, `confingy transpile`, `confingy viz`

## Development

### Releasing a new version

1. Create a branch and bump the `version` in `pyproject.toml`
2. Run `uv sync --group dev --extra viz` to update the lockfile
3. Open a PR, get it reviewed, and merge to `main`
4. Create a GitHub release with a tag matching the version (e.g. `v0.1.1`):
   ```bash
   gh release create v0.1.1 --title "v0.1.1" --generate-notes
   ```
   The [publish workflow](.github/workflows/publish.yml) will run tests, linting, and type checking, then build and publish to PyPI.
