Metadata-Version: 2.4
Name: tfs-mt
Version: 0.1.2
Summary: Transformer from scratch for Machine Translation
Project-URL: Homepage, https://giovo17.github.io/tfs-mt/
Project-URL: Repository, https://github.com/Giovo17/tfs-mt
Project-URL: Documentation, https://giovo17.github.io/tfs-mt/
Author-email: Giovanni Spadaro <giovannispada17.gs@gmail.com>
License-File: LICENSE
Keywords: deep-learning,machine-translation,nlp,nmt,pytorch,transformer
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: <4.0,>=3.10
Requires-Dist: datasets>=4.0.0
Requires-Dist: jaxtyping>=0.3.3
Requires-Dist: mkdocstrings[python]>=1.0.0
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: pytorch-ignite>=0.5.3
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: requests>=2.32.4
Requires-Dist: tdqm>=0.0.1
Requires-Dist: torch>=2.9.0
Requires-Dist: transformers>=4.57.1
Requires-Dist: types-pyyaml>=6.0.12.20250822
Requires-Dist: types-requests>=2.32.4.20250809
Requires-Dist: zensical>=0.0.11
Description-Content-Type: text/markdown


<div align="center">

<h1>tfs-mt<br>
Transformer from scratch for Machine Translation</h1>

[![Release](https://img.shields.io/github/v/release/Giovo17/tfs-mt)](https://github.com/Giovo17/tfs-mt/releases)
[![Build status](https://img.shields.io/github/actions/workflow/status/Giovo17/tfs-mt/main.yml?branch=main)](https://github.com/Giovo17/tfs-mt/actions/workflows/main.yml?query=branch%3Amain)
[![License: MIT](https://img.shields.io/github/license/Giovo17/tfs-mt)](https://github.com/Giovo17/tfs-mt/blob/main/LICENSE)
[![pypi monthly downloads](https://img.shields.io/pypi/dm/tfs-mt)](https://pypi.org/project/tfs-mt/)


[▶️ Getting started](#getting-started) • [📖 Documentation](https://giovo17.github.io/tfs-mt) • [🤗 Hugging Face](https://huggingface.co/giovo17/tfs-mt) • [🎬 Demo](https://huggingface.co/spaces/giovo17/tfs-mt-demo)

</div>

<hr>

This project implements the Transformer architecture from scratch, with machine translation as the use case. It is intended mainly as an educational resource and as a functional implementation of the architecture and its training and inference logic.

## Getting Started

### From pip

```bash
pip install tfs-mt
```
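
To confirm that the package is visible to the current interpreter after installation, you can query its distribution metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of `tfs-mt`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing.

    Hypothetical helper for illustration; not part of the tfs-mt API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tfs-mt")
    if found is None:
        print("tfs-mt is not installed in this environment")
    else:
        print(f"tfs-mt {found} is installed")
```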

### From source

#### Prerequisites

- `uv` [[install](https://docs.astral.sh/uv/#installation)]

#### Steps
```bash
git clone https://github.com/Giovo17/tfs-mt.git
cd tfs-mt

uv sync

cp .env.example .env
# Edit .env file with your configuration
```

## Usage

### Training

To start training the model with the default configuration:

```bash
uv run src/train.py
```

### Inference

To run inference using the trained model from the [Hugging Face repo](https://huggingface.co/giovo17/tfs-mt):

```bash
uv run src/inference.py
```

### Configuration

All project parameters can be configured in `src/tfs_mt/configs/config.yml`. Key configurations include:

- **Model Architecture**: Config, dropout, GloVe embedding init, ...
- **Training**: Optimizer, Learning rate scheduler, number of epochs, ...
- **Data**: Dataset, Dataloader, Tokenizer, ...

## Architecture

For a detailed explanation of the architecture and design choices, please refer to the [Architecture Documentation](https://giovo17.github.io/tfs-mt/architecture_explain/).

### Model Sizes

The project supports various model configurations to suit different computational resources:

| Parameter                | Nano     | Small    | Base     | Original |
| :----------------------- | :------- | :------- | :------- | :------- |
| **Encoder Layers** | 4        | 6        | 8        | 6        |
| **Decoder Layers** | 4        | 6        | 8        | 6        |
| **d_model**        | 50       | 100      | 300      | 512      |
| **Num Heads**      | 4        | 6        | 8        | 8        |
| **d_ff**           | 200      | 400      | 800      | 2048     |
| **Norm Type**      | PostNorm | PostNorm | PostNorm | PostNorm |
| **Dropout**        | 0.1      | 0.1      | 0.1      | 0.1      |
| **GloVe Dim**      | 50d      | 100d     | 300d     | -        |
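
To get a rough sense of how these sizes compare, the sketch below estimates the number of parameters in the encoder and decoder stacks from the values in the table. It assumes the standard Transformer layer layout (four `d_model × d_model` attention projections with biases, a two-layer feed-forward block, and LayerNorm with scale and shift) and ignores embeddings and the output projection. The `ModelSize` class and the helper functions are hypothetical names for illustration, not part of the `tfs-mt` API, and the real counts may differ if the implementation deviates from these assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSize:
    """Hypothetical container for one column of the table above."""
    encoder_layers: int
    decoder_layers: int
    d_model: int
    d_ff: int


# Values copied from the Model Sizes table.
SIZES = {
    "nano": ModelSize(4, 4, 50, 200),
    "small": ModelSize(6, 6, 100, 400),
    "base": ModelSize(8, 8, 300, 800),
    "original": ModelSize(6, 6, 512, 2048),
}


def attention_params(d_model: int) -> int:
    # Q, K, V and output projections: each d_model x d_model weights plus a bias.
    return 4 * (d_model * d_model + d_model)


def feed_forward_params(d_model: int, d_ff: int) -> int:
    # Two linear layers with biases: d_model -> d_ff -> d_model.
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)


def layer_norm_params(d_model: int) -> int:
    # One scale and one shift vector (assumed affine LayerNorm).
    return 2 * d_model


def approx_stack_params(size: ModelSize) -> int:
    """Estimate encoder + decoder parameters, excluding embeddings."""
    attn = attention_params(size.d_model)
    ffn = feed_forward_params(size.d_model, size.d_ff)
    norm = layer_norm_params(size.d_model)
    # Encoder layer: self-attention + feed-forward, each followed by a norm.
    encoder_layer = attn + ffn + 2 * norm
    # Decoder layer: self-attention + cross-attention + feed-forward, three norms.
    decoder_layer = 2 * attn + ffn + 3 * norm
    return size.encoder_layers * encoder_layer + size.decoder_layers * decoder_layer


if __name__ == "__main__":
    for name, size in SIZES.items():
        print(f"{name:>8}: ~{approx_stack_params(size):,} parameters (no embeddings)")
```

Under these assumptions the embedding tables, which depend on the vocabulary size, are left out on purpose; for the smaller sizes they can account for a large share of the total.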

## Documentation

Full documentation is available at [https://giovo17.github.io/tfs-mt/](https://giovo17.github.io/tfs-mt/).

## Citation

If you use `tfs-mt` in your research or project, please cite:

```bibtex
@software{Spadaro_tfs-mt,
author = {Spadaro, Giovanni},
license = {MIT},
title = {{tfs-mt}},
url = {https://github.com/Giovo17/tfs-mt}
}
```
