Metadata-Version: 2.1
Name: xpdt
Version: 0.1.0
Summary: eXPeditious Data Transfer
Home-page: https://github.com/giannitedesco/xpdt
Author: Gianni Tedesco
Author-email: gianni@scaramanga.co.uk
License: GPLv3
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: OS Independent
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: Implementation :: CPython
Description-Content-Type: text/markdown
License-File: LICENSE.txt

# xpdt: eXPeditious Data Transfer

<div align="center">
  <img src="https://img.shields.io/pypi/v/xpdt?label=pypi" alt="PyPI version">
</div>

## About
xpdt is (yet another) language for defining data-types and generating code for
serializing and deserializing them. It aims to produce code with little or no
overhead and is based on fixed-length representations, which allow for
zero-copy deserialization and (at-most-)one-copy writes (source to buffer).

The generated C code, in particular, is highly optimized and often permits the
elimination of data-copying for writes and enables optimizations such as
loop-unrolling for fixed-length objects. This can lead to read speeds in
excess of 500 million objects per second (~1.8 nsec per object).

## Examples
The xpdt source language looks similar to C struct definitions:

```
struct timestamp {
	u32	tv_sec;
	u32	tv_nsec;
};

struct point {
	i32	x;
	i32	y;
	i32	z;
};

struct line {
	timestamp	time;
	point		line_start;
	point		line_end;
	bytes		comment;
};
```

Fixed-width integer types from 8 to 128 bits are supported, along with the
`bytes` type, which is a variable-length sequence of bytes.

## Target Languages
The following target languages are currently supported:
- C
- Python

The C code is very highly optimized.

The Python code is about as well optimized for CPython as I can make it. It
uses typed `NamedTuple` for objects, which has some small overhead over regular
tuples, and it uses `struct.Struct` to do the packing/unpacking. I have also
code-golfed the generated bytecodes down to what I think is minimal given the
design constraints. As a result, performance of the pure Python code is
comparable to a JSON library implemented in C or Rust.

For better performance in Python, it may be desirable to develop a Cython
target. In some instances CFFI structs may be more performant since they can
avoid the creation/destruction of an object for each record.

Target languages are implemented purely as `jinja2` templates.

## Serialization format
The serialization format for fixed-length objects is simply a packed C struct.
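
Because every fixed-length record has the same size, record number `n` sits at byte offset `n * size` and can be decoded in place. The Python sketch below illustrates this for the `timestamp` type; the helper name `read_timestamp` and the little-endian byte order are assumptions made for this example and are not taken from xpdt itself.

```python
import struct

# Assumed layout for `timestamp`: two packed u32 fields, little-endian.
TIMESTAMP = struct.Struct('<II')


def read_timestamp(buf, index):
    """Decode record number `index` without slicing or copying `buf`."""
    return TIMESTAMP.unpack_from(buf, index * TIMESTAMP.size)


# Three records written back to back into one buffer.
data = b''.join(TIMESTAMP.pack(sec, nsec)
                for sec, nsec in [(1, 10), (2, 20), (3, 30)])
view = memoryview(data)

print(read_timestamp(view, 2))
```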

For any object which contains `bytes` type fields:
- a 32-bit unsigned record length is prepended to the struct
- all `bytes` type fields are converted to `u32` and contain the length of the bytes
- all bytes contents are appended after the struct in the order in which they appear

For example, a `line` record from the example above with the comment `Hello` would be serialized as:

```
u32 tot_len # = 41
u32 time.tv_sec
u32 time.tv_nsec
i32 line_start.x
i32 line_start.y
i32 line_start.z
i32 line_end.x
i32 line_end.y
i32 line_end.z
u32 comment # = 5
u8 'H'
u8 'e'
u8 'l'
u8 'l'
u8 'o'
```

## Features
The feature-set is, as of now, pretty slim.

There are no array / sequence / map types, and no keyed unions.

Support for such things may be added in future provided that suitable
implementations exist. An implementation is suitable if:
- It admits a zero (or close to zero) overhead implementation
- It causes no overhead when the feature isn't being used

## License
The compiler is released under the GPLv3.

The C support code/headers are released under the MIT license.

The generated code is yours.
