Metadata-Version: 2.1
Name: positional-encodings
Version: 2.0.0
Summary: 1D, 2D, and 3D Sinusoidal Positional Encodings in PyTorch
Home-page: https://github.com/tatp22/multidim-positional-encoding
Author: Peter Tatkowski
Author-email: tatp22@gmail.com
License: UNKNOWN
Description: # 1D, 2D, and 3D Sinusoidal Positional Encoding PyTorch
        
        This is an implementation of 1D, 2D, and 3D sinusoidal positional encoding. It
        can encode tensors of the form `(batchsize, x, ch)`, `(batchsize, x, y, ch)`,
        and `(batchsize, x, y, z, ch)`, where the positional encodings will be added to
        the `ch` dimension. The [Attention is All You
        Need](https://arxiv.org/pdf/1706.03762.pdf) paper allowed for positional
        encoding in only one dimension; this package extends it to 2 and 3 dimensions.
        
        New: This also works on tensors of the form `(batchsize, ch, x)`, etc. For
        inputs of this type, include the word `Permute` before the number in the class
        name; e.g. for a 1D input of size `(batchsize, ch, x)`, use
        `PositionalEncodingPermute1D` instead of `PositionalEncoding1D`.
        
        To install, simply run:
        
        ```
        pip install positional-encodings
        ```
        
        Specifically, the positional encodings are computed as follows:
        
        1D:
        ```
        PE(x,2i) = sin(x/10000^(2i/D))
        PE(x,2i+1) = cos(x/10000^(2i/D))
        
        Where:
        x is a point in 1d space
        i is in [0, D/2), where D is the size of the ch dimension
        ```
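        
        As a rough illustration of the 1D formula, the following plain-Python sketch
        computes the encoding vector for a single position. It uses only the standard
        library rather than PyTorch, and the function name `sinusoidal_1d` is made up
        for this example; it is not part of the package's API, and it assumes an even
        `D`.
        
        ```python
        import math
        
        
        def sinusoidal_1d(x, d):
            """Encode position x into d channels (hypothetical helper, d must be even)."""
            if d % 2 != 0:
                raise ValueError("this sketch assumes an even number of channels")
            pe = [0.0] * d
            for i in range(d // 2):
                # Each (sin, cos) pair shares one frequency: 1 / 10000^(2i/D)
                angle = x / (10000 ** (2 * i / d))
                pe[2 * i] = math.sin(angle)      # PE(x, 2i)
                pe[2 * i + 1] = math.cos(angle)  # PE(x, 2i+1)
            return pe
        ```
        
        For example, `sinusoidal_1d(0, 4)` gives `[0.0, 1.0, 0.0, 1.0]`, since
        sin(0) = 0 and cos(0) = 1 at every frequency.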
        
        2D:
        ```
        PE(x,y,2i) = sin(x/10000^(4i/D))
        PE(x,y,2i+1) = cos(x/10000^(4i/D))
        PE(x,y,2j+D/2) = sin(y/10000^(4j/D))
        PE(x,y,2j+1+D/2) = cos(y/10000^(4j/D))
        
        Where:
        (x,y) is a point in 2d space
        i,j are in [0, D/4), where D is the size of the ch dimension
        ```
        
        3D:
        ```
        PE(x,y,z,2i) = sin(x/10000^(6i/D))
        PE(x,y,z,2i+1) = cos(x/10000^(6i/D))
        PE(x,y,z,2j+D/3) = sin(y/10000^(6j/D))
        PE(x,y,z,2j+1+D/3) = cos(y/10000^(6j/D))
        PE(x,y,z,2k+2D/3) = sin(z/10000^(6k/D))
        PE(x,y,z,2k+1+2D/3) = cos(z/10000^(6k/D))
        
        Where:
        (x,y,z) is a point in 3d space
        i,j,k are in [0, D/6), where D is the size of the ch dimension
        ```
        
        This is just a natural extension of the 2D positional encoding used in [this](https://arxiv.org/pdf/1908.11415.pdf) paper.
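        
        One way to read the formulas above: each axis receives an equal share of the
        channels, and that share is filled with the 1D encoding of the coordinate along
        that axis. The sketch below illustrates this reading in plain Python;
        `encode_axis` and `encode_point` are hypothetical names, not functions from
        this package, and the sketch assumes `D` is divisible by twice the number of
        axes.
        
        ```python
        import math
        
        
        def encode_axis(pos, channels):
            """1D sinusoidal encoding of one coordinate (channels must be even)."""
            out = []
            for i in range(channels // 2):
                angle = pos / (10000 ** (2 * i / channels))
                out += [math.sin(angle), math.cos(angle)]
            return out
        
        
        def encode_point(point, d):
            """Concatenate per-axis encodings; each axis gets d / len(point) channels."""
            n = len(point)
            if d % (2 * n) != 0:
                raise ValueError("this sketch assumes d is divisible by 2 * number of axes")
            per_axis = d // n
            pe = []
            for pos in point:
                # With per_axis = D/n, the exponent 2i/per_axis equals 2ni/D,
                # i.e. 4i/D for two axes and 6i/D for three, as in the formulas.
                pe += encode_axis(pos, per_axis)
            return pe
        ```
        
        With this layout, `encode_point((x, y), D)` places the `y` terms at offset
        `D/2`, and `encode_point((x, y, z), D)` places `y` at `D/3` and `z` at `2D/3`.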
        
        Don't worry if the size of the `ch` dimension is not divisible by 2 (1D), 4 (2D), or 6 (3D); all the necessary padding will be taken care of.
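        
        Purely as an illustration of one way such padding could work (an assumption,
        not necessarily what this package does internally), the sketch below rounds the
        channel count up to the next valid size, computes the encoding at that size,
        and drops the surplus channels. Both function names are hypothetical.
        
        ```python
        import math
        
        
        def padded_channels(d, n_dims):
            """Round d up to the next multiple of 2 * n_dims (2, 4, or 6)."""
            step = 2 * n_dims
            return math.ceil(d / step) * step
        
        
        def encode_1d_any_width(x, d):
            """1D encoding for any channel count d: pad, encode, then truncate."""
            padded = padded_channels(d, 1)
            pe = []
            for i in range(padded // 2):
                angle = x / (10000 ** (2 * i / padded))
                pe += [math.sin(angle), math.cos(angle)]
            return pe[:d]  # drop the channels that were only added as padding
        ```
        
        Under this assumption, 11 channels in 3D would be computed as 12 and then cut
        back to 11, so the output shape still matches the input.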
        
        ## Usage:
        
        ```python3
        import torch
        from positional_encodings import PositionalEncoding1D, PositionalEncoding2D, PositionalEncoding3D
        
        p_enc_1d = PositionalEncoding1D(10)
        x = torch.zeros((1,6,10))
        print(p_enc_1d(x).shape) # (1, 6, 10)
        
        p_enc_2d = PositionalEncoding2D(8)
        y = torch.zeros((1,6,2,8))
        print(p_enc_2d(y).shape) # (1, 6, 2, 8)
        
        p_enc_3d = PositionalEncoding3D(11)
        z = torch.zeros((1,5,6,4,11))
        print(p_enc_3d(z).shape) # (1, 5, 6, 4, 11)
        ```
        
        And for tensors of the form `(batchsize, ch, x)`, etc:
        
        ```python3
        import torch
        from positional_encodings import PositionalEncodingPermute1D, PositionalEncodingPermute2D, PositionalEncodingPermute3D
        
        p_enc_1d = PositionalEncodingPermute1D(10)
        x = torch.zeros((1,10,6))
        print(p_enc_1d(x).shape) # (1, 10, 6)
        
        p_enc_2d = PositionalEncodingPermute2D(8)
        y = torch.zeros((1,8,6,2))
        print(p_enc_2d(y).shape) # (1, 8, 6, 2)
        
        p_enc_3d = PositionalEncodingPermute3D(11)
        z = torch.zeros((1,11,5,6,4))
        print(p_enc_3d(z).shape) # (1, 11, 5, 6, 4)
        ```
        
        ## Thank you
        
        Thanks to [this](https://github.com/wzlxjtu/PositionalEncoding2D) repo for the inspiration for this method.
        
        ## Citations
        1D:
        ```bibtex
        @inproceedings{vaswani2017attention,
          title={Attention is all you need},
          author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
          booktitle={Advances in neural information processing systems},
          pages={5998--6008},
          year={2017}
        }
        ```
        
        2D:
        ```bibtex
        @misc{wang2019translating,
            title={Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training},
            author={Zelun Wang and Jyh-Charn Liu},
            year={2019},
            eprint={1908.11415},
            archivePrefix={arXiv},
            primaryClass={cs.LG}
        }
        ```
        
        3D:
        Coming soon!
        
Keywords: transformers,attention
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
