Metadata-Version: 2.1
Name: texrelenv
Version: 0.1.0
Summary: A Python package for generating data sets based on Hugh Perkins' TexRel (https://arxiv.org/abs/2105.12804)
License: MIT
Author: Nicholas Bailey
Author-email: nicholasbailey87@gmail.com
Requires-Python: >=3.12,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: numpy (>=2.2.0,<3.0.0)
Requires-Dist: toml (>=0.10.2,<0.11.0)
Description-Content-Type: text/markdown

# texrelenv: TexRel-like environments for emergent language experiments

A Python package for generating data sets based on [Hugh Perkins' TexRel](https://arxiv.org/abs/2105.12804).

![3 example images of the data generated by this library](example.png "Examples")

## Getting started

You can install this package from PyPI using:

```
pip install texrelenv
```

See the **Example usage** section below for how to use the package to generate data.

## Example usage

Here is an example of how to generate 100 images based on some config options specified in a `config.toml` file.

Let's say you have a `config.toml` file that looks like the one below. "grid" refers to the overall canvas of the image, while "things" refers to the coloured objects that appear in images. These options, among others, are explained in the [**Config options**](#config-options) section below.

```
[things]
distinct_shapes = 9
distinct_colours = 9

[environment]
grid_size = 16
thing_size = 4
things_per_image = 5
hard_boundary = true
objects_can_overlap = false

[split]
hold_out_things = 0.2
hold_out_images = 0.0
```

A dataset can be created from the above config as follows:

```
from texrelenv import DataSet

data = DataSet(config_file="config.toml")
```

You could also pass the config options directly as keyword arguments when you instantiate the `DataSet`; the argument names are the same as the option names in the config file.
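
For instance, if you already hold the options in a nested dictionary that mirrors the TOML sections, one way to turn them into keyword arguments is to flatten the sections. This is a sketch under the assumption that `DataSet` accepts the option names directly as keywords, as described above. `flatten_config` is a hypothetical helper, not part of texrelenv, and the final call is left commented out so the snippet runs without the package installed.

```python
def flatten_config(config: dict) -> dict:
    """Merge the per-section option dicts into one flat dict of keyword arguments."""
    kwargs = {}
    for section, options in config.items():
        for name, value in options.items():
            # Guard against the same option name appearing in two sections.
            if name in kwargs:
                raise ValueError(f"option {name!r} appears in more than one section")
            kwargs[name] = value
    return kwargs


# Same options as the example config.toml, written as a nested dict.
config = {
    "things": {"distinct_shapes": 9, "distinct_colours": 9},
    "environment": {
        "grid_size": 16,
        "thing_size": 4,
        "things_per_image": 5,
        "hard_boundary": True,
        "objects_can_overlap": False,
    },
    "split": {"hold_out_things": 0.2, "hold_out_images": 0.0},
}

kwargs = flatten_config(config)

# Assumption: DataSet takes these names as keyword arguments.
# from texrelenv import DataSet
# data = DataSet(**kwargs)
```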

Having instantiated the data set, you can generate some train images like this:

```
data.sample(100, 'train')
```

Or some test images like this:

```
data.sample(100, 'test')
```

## Config options

| Option   | Type | Meaning|
|----------|------|--------|
| **grid_size** | int  | The size of the square images generated, expressed as the length of the side of the square in pixels. Default is 16. For an easy drop-in replacement for MNIST-like data sets, use 28.   |
| **hard_boundary** | bool  | If true, shapes added to an image will always be fully visible. If false, they may partially be out of frame. |
| **objects_can_overlap** | bool  | Can objects go in front of or behind one another? Default is false.  |
| **thing_size** | int  | How big are the coloured objects added to images? Expressed as the length of one side of the square canvas used to create the templates for these objects, which can also be thought of as the maximum size of an object. |
| **distinct_shapes** | int  |  |
| **distinct_colours** | int  | blah   |
| **things_per_image** | int  | blah   |
| **rotate** | bool  | blah   |
| **flip** | bool  | blah   |
| **hold_out_things** | float  | blah   |
| **hold_out_images** | float  | blah   |
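
To make `hard_boundary` concrete, the sketch below computes which top-left positions along one axis would keep a thing fully inside the grid. It illustrates the idea only and is not the package's placement code. `fully_visible_offsets` is a hypothetical helper, and it assumes a thing occupies its whole `thing_size` by `thing_size` template.

```python
def fully_visible_offsets(grid_size: int, thing_size: int) -> range:
    """Top-left offsets along one axis at which a template fits entirely inside the grid."""
    if thing_size > grid_size:
        # The template is larger than the grid, so no offset keeps it fully visible.
        return range(0)
    return range(0, grid_size - thing_size + 1)


# With the example config (grid_size = 16, thing_size = 4) the template
# can start anywhere from pixel 0 to pixel 12 on each axis.
offsets = fully_visible_offsets(grid_size=16, thing_size=4)
print(offsets.start, offsets[-1], len(offsets))  # 0 12 13
```

With `hard_boundary = false`, positions outside this range would also be possible, which is how a thing can end up partially out of frame.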

## Contributing

To contribute to the project, please make sure you have Poetry installed. Before you start working on the code, set up a virtual environment, activate it, and then run

```
poetry install
```

to install all dependencies, including those only required for development. Then install and update the pre-commit hooks with

```
pre-commit install
pre-commit autoupdate
```

Pre-commit will help you keep your code style consistent with the rest of the project.

Please write tests for your code, store them in `test/`, and only commit code that passes the unit tests that already exist. You can run unit tests with

```
python -m unittest
```

(Pre-commit will not run the unit tests!)

## Acknowledgements

We are hugely thankful to Hugh Perkins for [coming up with the idea of TexRel](https://arxiv.org/abs/2105.12804) in the first place, for creating the first version of PyTorch, and for doing various other cool stuff.

