Metadata-Version: 2.1
Name: lightning-transformers
Version: 0.1.0
Summary: PyTorch Lightning Transformers.
Home-page: https://github.com/PyTorchLightning/lightning-transformers
Author: PyTorchLightning et al.
Author-email: name@pytorchlightning.ai
License: Apache-2.0
Download-URL: https://github.com/PyTorchLightning/lightning-transformers
Project-URL: Bug Tracker, https://github.com/PyTorchLightning/lightning-transformers/issues
Project-URL: Documentation, https://lightning-transformers.readthedocs.io/en/stable/
Project-URL: Source Code, https://github.com/PyTorchLightning/lightning-transformers
Keywords: deep learning,pytorch,AI
Platform: UNKNOWN
Classifier: Environment :: Console
Classifier: Natural Language :: English
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.6
Description-Content-Type: text/markdown
Requires-Dist: pytorch-lightning (>=1.2.4)
Requires-Dist: torch (>=1.6)
Requires-Dist: numpy
Requires-Dist: torchmetrics (>=0.2.0)
Requires-Dist: fairscale
Requires-Dist: transformers
Requires-Dist: datasets
Requires-Dist: hydra-core (>=1.1.0.dev4)
Requires-Dist: rouge-score
Requires-Dist: sentencepiece

<div align="center">

<img src="https://github.com/PyTorchLightning/lightning-transformers/raw/0.1.0/docs/source/_static/images/logo.png" width="500px">

**Flexible interface for high-performance research using SOTA Transformers, leveraging [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning), [Transformers](https://github.com/huggingface/transformers), and [Hydra](https://github.com/facebookresearch/hydra).**

<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/lit-tfmrs.gif">

---

<p align="center">
  <a href="#what-is-lightning-transformers">What is Lightning Transformers</a> •
  <a href="#using-lightning-transformers">Using Lightning Transformers</a> •
  <a href="https://lightning-transformers.readthedocs.io/">Docs</a> •
  <a href="#community">Community</a> •
  <a href="#license">License</a>
</p>

---

</div>

## Installation

#### Option 1: from PyPI

```bash
pip install lightning-transformers
# instead of: `python train.py ...`, run with:
pl-transformers-train ...
```

#### Option 2: from source
```bash
git clone https://github.com/PyTorchLightning/lightning-transformers.git
cd lightning-transformers
python train.py ...
# the `pl-transformers-train` endpoint is also available!
```


---

## What is Lightning-Transformers

Lightning Transformers offers a flexible interface for training and fine-tuning SOTA Transformer models using the [PyTorch Lightning Trainer](https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html).

* **Train using [HuggingFace Transformers](https://github.com/huggingface/transformers) models and datasets** with Lightning custom Callbacks, Loggers, Accelerators and high performance scaling.
* **Seamless Memory and Speed Optimizations** such as [DeepSpeed ZeRO](https://pytorch-lightning.readthedocs.io/en/latest/multi_gpu.html#deepspeed) or [FairScale Sharded Training](https://pytorch-lightning.readthedocs.io/en/latest/multi_gpu.html#sharded-training) with no code changes.
* **Powerful config composition backed by [Hydra](https://hydra.cc/)** - Easily swap out models, optimizers, schedulers and many more configurations without touching the code.
* **Transformer Task Abstraction for Rapid Research & Experimentation** - Built from the ground up to be task agnostic, the library supports creating transformer tasks across all modalities with little friction.

Lightning Transformers tasks let you train HuggingFace Transformer models on HuggingFace datasets, use Hydra to hot-swap models, optimizers or schedulers, and leverage the advanced features that Lightning has to offer, including custom Callbacks, Loggers, Accelerators and high performance scaling, with minimal changes.
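
To make the hot-swapping idea concrete, here is a minimal sketch of config-group selection written in plain Python. It is not Hydra's implementation: the in-memory `CONFIG_GROUPS` table, the `compose` helper and the option names `adamw` and `bert` are assumptions made for illustration (only `rmsprop` appears in the commands in this README). The optimizer values are taken from the composed configs shown in the recipes.

```python
from copy import deepcopy

# Hypothetical in-memory stand-in for Hydra config groups. In the real
# project each option would live in its own YAML file.
CONFIG_GROUPS = {
    "optimizer": {
        "adamw": {
            "_target_": "torch.optim.AdamW",
            "lr": "${training.lr}",
            "weight_decay": 0.001,
        },
        "rmsprop": {"_target_": "torch.optim.RMSprop", "lr": "${training.lr}"},
    },
    "backbone": {
        "bert": {"pretrained_model_name_or_path": "bert-base-cased"},
    },
}

# Assumed defaults: one option per group.
DEFAULTS = {"optimizer": "adamw", "backbone": "bert"}


def compose(selections=None):
    """Build one config by picking a single option from every group."""
    chosen = {**DEFAULTS, **(selections or {})}
    config = {}
    for group, option in chosen.items():
        if option not in CONFIG_GROUPS.get(group, {}):
            raise KeyError(f"unknown option {option!r} for group {group!r}")
        # Copy so callers can edit the result without touching the groups.
        config[group] = deepcopy(CONFIG_GROUPS[group][option])
    return config


default_cfg = compose()
swapped_cfg = compose({"optimizer": "rmsprop"})
print(default_cfg["optimizer"]["_target_"])  # torch.optim.AdamW
print(swapped_cfg["optimizer"]["_target_"])  # torch.optim.RMSprop
```

Selecting a different option replaces the whole group node, which is why no training code has to change when the optimizer is swapped.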

## Using Lightning-Transformers

**Grid** is our platform for training models at scale on the cloud! Sign up [here](https://www.grid.ai/).


| Task                 | Quick Commands                                                                                           | Run |
|----------------------|-----------------------------------------------------------------------------------------------------------|-----|
| [Language Modeling](https://lightning-transformers.readthedocs.io/tasks/nlp/language_modeling.html)    | ```python train.py task=nlp/language_modeling dataset=nlp/language_modeling/wikitext trainer.gpus=1 training.batch_size=8```    |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Flanguage_modeling%20dataset%3Dnlp%2Flanguage_modeling%2Fwikitext%20trainer.gpus%3D1%20training.batch_size%3D8)     |
| [Multiple Choice](https://lightning-transformers.readthedocs.io/tasks/nlp/multiple_choice.html)      | ```python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race trainer.gpus=1```            |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Fmultiple_choice%20dataset%3Dnlp%2Fmultiple_choice%2Frace%20trainer.gpus%3D1)     |
| [Question Answering](https://lightning-transformers.readthedocs.io/tasks/nlp/question_answering.html)   | ```python train.py task=nlp/question_answering dataset=nlp/question_answering/squad trainer.gpus=1```     |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Fquestion_answering%20dataset%3Dnlp%2Fquestion_answering%2Fsquad%20trainer.gpus%3D1)     |
| [Summarization](https://lightning-transformers.readthedocs.io/tasks/nlp/summarization.html)        | ```python train.py task=nlp/summarization dataset=nlp/summarization/xsum trainer.gpus=1```                |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Fsummarization%20dataset%3Dnlp%2Fsummarization%2Fxsum%20trainer.gpus%3D1)     |
| [Text Classification](https://lightning-transformers.readthedocs.io/tasks/nlp/text_classification.html)  | ```python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion trainer.gpus=1``` |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Ftext_classification%20dataset%3Dnlp%2Ftext_classification%2Femotion%20trainer.gpus%3D1)     |
| [Token Classification](https://lightning-transformers.readthedocs.io/tasks/nlp/token_classification.html) | ```python train.py task=nlp/token_classification dataset=nlp/token_classification/conll trainer.gpus=1``` |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Ftoken_classification%20dataset%3Dnlp%2Ftoken_classification%2Fconll%20trainer.gpus%3D1)     |
| [Translation](https://lightning-transformers.readthedocs.io/tasks/nlp/translation.html)          | ```python train.py task=nlp/translation dataset=nlp/translation/wmt16 trainer.gpus=1```                   |[![Grid](https://img.shields.io/badge/rid_AI-run-78FF96.svg?labelColor=black&logo=data:image/svg%2bxml;base64,PHN2ZyB3aWR0aD0iNDgiIGhlaWdodD0iNDgiIGZpbGw9Im5vbmUiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyI+PHBhdGggZD0iTTEgMTR2MjBhMTQgMTQgMCAwMDE0IDE0aDlWMzYuOEgxMi42VjExaDIyLjV2N2gxMS4yVjE0QTE0IDE0IDAgMDAzMi40IDBIMTVBMTQgMTQgMCAwMDEgMTR6IiBmaWxsPSIjZmZmIi8+PHBhdGggZD0iTTM1LjIgNDhoMTEuMlYyNS41SDIzLjl2MTEuM2gxMS4zVjQ4eiIgZmlsbD0iI2ZmZiIvPjwvc3ZnPg==)](https://platform.grid.ai/#/runs?script=https://github.com/PyTorchLightning/lightning-transformers/blob/009d016f/train.py&cloud=grid&instance=p3.2xlarge&accelerators=1&disk_size=200&framework=lightning&script_args=--grid_name%20transformers-run%20%5C%0A--grid_strategy%20grid_search%20%5C%0A--grid_disk_size%20200%20%5C%0A--grid_max_nodes%2010%20%5C%0A--grid_datastore_mount_dir%20%2Fopt%2Fdatastore%20%5C%0A--grid_instance_type%20p3.2xlarge%20%5C%0A--grid_credential%20cc-b87v8%20%5C%0A--grid_framework%20lightning%20%5C%0A--grid_gpus%201%20%5C%0Atrain.py%20task%3Dnlp%2Ftranslation%20dataset%3Dnlp%2Ftranslation%2Fwmt16%20trainer.gpus%3D1)     |
|||<img width=126/>| <!-- This empty row is here to force a minimum badge width -->


### Quick recipes

#### Train [bert-base-cased](https://huggingface.co/bert-base-cased) on the [CARER](https://huggingface.co/datasets/emotion) emotion dataset using the Text Classification task.
```bash
python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion
```

<details>
  <summary>See the composed Hydra config used under-the-hood</summary>

```yaml
optimizer:
  _target_: torch.optim.AdamW
  lr: ${training.lr}
  weight_decay: 0.001
scheduler:
  _target_: transformers.get_linear_schedule_with_warmup
  num_training_steps: -1
  num_warmup_steps: 0.1
training:
  run_test_after_fit: true
  lr: 5.0e-05
  output_dir: .
  batch_size: 16
  num_workers: 16
trainer:
  _target_: pytorch_lightning.Trainer
  logger: true
  checkpoint_callback: true
  callbacks: null
  default_root_dir: null
  gradient_clip_val: 0.0
  process_position: 0
  num_nodes: 1
  num_processes: 1
  gpus: null
  auto_select_gpus: false
  tpu_cores: null
  log_gpu_memory: null
  progress_bar_refresh_rate: 1
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: 1
  max_steps: null
  min_steps: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  val_check_interval: 1.0
  flush_logs_every_n_steps: 100
  log_every_n_steps: 50
  accelerator: null
  sync_batchnorm: false
  precision: 32
  weights_summary: top
  weights_save_path: null
  num_sanity_val_steps: 2
  truncated_bptt_steps: null
  resume_from_checkpoint: null
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_epoch: false
  auto_lr_find: false
  replace_sampler_ddp: true
  terminate_on_nan: false
  auto_scale_batch_size: false
  prepare_data_per_node: true
  plugins: null
  amp_backend: native
  amp_level: O2
  move_metrics_to_cpu: false
task:
  _recursive_: false
  backbone: ${backbone}
  optimizer: ${optimizer}
  scheduler: ${scheduler}
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationTransformer
  downstream_model_type: transformers.AutoModelForSequenceClassification
dataset:
  cfg:
    batch_size: ${training.batch_size}
    num_workers: ${training.num_workers}
    dataset_name: emotion
    dataset_config_name: null
    train_file: null
    validation_file: null
    test_file: null
    train_val_split: null
    max_samples: null
    cache_dir: null
    padding: max_length
    truncation: only_first
    preprocessing_num_workers: 1
    load_from_cache_file: true
    max_length: 128
    limit_train_samples: null
    limit_val_samples: null
    limit_test_samples: null
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationDataModule
experiment_name: ${now:%Y-%m-%d}_${now:%H-%M-%S}
log: false
ignore_warnings: true
tokenizer:
  _target_: transformers.AutoTokenizer.from_pretrained
  pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
  use_fast: true
backbone:
  pretrained_model_name_or_path: bert-base-cased
```
</details>
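
Each `_target_` key in the composed config names the Python object that is built from that node. The sketch below shows the general idea in plain Python and is only a simplified stand-in for what Hydra does: `locate` and `instantiate` here are hypothetical helpers, they ignore features such as `_recursive_` and nested targets, and the example targets come from the standard library so the snippet runs without `torch` or `transformers` installed.

```python
import importlib


def locate(dotted_path):
    """Import the object named by a dotted path, e.g. 'fractions.Fraction'."""
    parts = dotted_path.split(".")
    # Try the longest importable module prefix, then walk the attributes.
    for cut in range(len(parts) - 1, 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:cut]))
        except ImportError:
            continue
        for name in parts[cut:]:
            obj = getattr(obj, name)
        return obj
    raise ImportError(f"cannot locate {dotted_path!r}")


def instantiate(node):
    """Call the `_target_` object with the node's remaining keys as kwargs."""
    kwargs = {
        key: value
        for key, value in node.items()
        if not (key.startswith("_") and key.endswith("_"))  # skip _target_ etc.
    }
    return locate(node["_target_"])(**kwargs)


# Standard-library target used purely for illustration.
node = {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 20}
value = instantiate(node)
print(value)  # 1/20

# Targets may also point at a method on a class, in the same way the
# tokenizer node above points at `AutoTokenizer.from_pretrained`.
from_keys = locate("collections.OrderedDict.fromkeys")
print(list(from_keys("ab")))  # ['a', 'b']
```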

#### Swap the backbone to [RoBERTa](https://huggingface.co/transformers/model_doc/roberta.html) and the optimizer to RMSprop:

```bash
python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    backbone.pretrained_model_name_or_path=roberta-base \
    optimizer=rmsprop
```

<details>
  <summary>See the changed Hydra config under-the-hood</summary>

```diff
 optimizer:
-  _target_: torch.optim.AdamW
+  _target_: torch.optim.RMSprop
   lr: ${training.lr}
-  weight_decay: 0.001
 scheduler:
   _target_: transformers.get_linear_schedule_with_warmup
   num_training_steps: -1
 ...
 tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
-  pretrained_model_name_or_path: bert-base-cased
+  pretrained_model_name_or_path: roberta-base
```
</details>
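
The tokenizer follows the backbone automatically because its path is an interpolation, `${backbone.pretrained_model_name_or_path}`. The following plain-Python sketch mimics that behaviour under simplifying assumptions: `set_by_path`, `get_by_path` and `resolve` are hypothetical helpers, not Hydra or OmegaConf functions, and resolver syntax such as `${now:...}` and reference cycles are not handled. It also covers only dotted value overrides; a group selection like `optimizer=rmsprop` works differently.

```python
import re

# Matches `${some.dotted.key}` references.
_REF = re.compile(r"\$\{([^}]+)\}")


def set_by_path(config, dotted_key, value):
    """Apply one `a.b.c=value` style override to a nested dict in place."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def get_by_path(config, dotted_key):
    """Look up a nested value by its dotted key."""
    node = config
    for key in dotted_key.split("."):
        node = node[key]
    return node


def resolve(root):
    """Return a copy of `root` with every `${a.b}` reference replaced."""

    def walk(node):
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, str):
            whole = _REF.fullmatch(node)
            if whole:  # whole-string reference: keep the value's type
                return walk(get_by_path(root, whole.group(1)))
            return _REF.sub(
                lambda m: str(walk(get_by_path(root, m.group(1)))), node
            )
        return node

    return walk(root)


config = {
    "training": {"lr": 5e-05},
    "optimizer": {"_target_": "torch.optim.AdamW", "lr": "${training.lr}"},
    "backbone": {"pretrained_model_name_or_path": "bert-base-cased"},
    "tokenizer": {
        "pretrained_model_name_or_path": "${backbone.pretrained_model_name_or_path}"
    },
}

set_by_path(config, "backbone.pretrained_model_name_or_path", "roberta-base")
resolved = resolve(config)
print(resolved["tokenizer"]["pretrained_model_name_or_path"])  # roberta-base
print(resolved["optimizer"]["lr"])  # 5e-05
```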

#### Enable [Sharded](https://pytorch-lightning.readthedocs.io/en/latest/advanced/multi_gpu.html#sharded-training) Training.
```bash
python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=ddp \
    trainer/plugins=sharded
```

<details>
  <summary>See the changed Hydra config under-the-hood</summary>
Without modifying any code, the config is updated automatically for sharded training:

```diff
 optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
 trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   val_check_interval: 1.0
   flush_logs_every_n_steps: 100
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   weights_summary: top
   weights_save_path: null
   num_sanity_val_steps: 2
   ...
   terminate_on_nan: false
   auto_scale_batch_size: false
   prepare_data_per_node: true
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DDPShardedPlugin
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
 tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
   pretrained_model_name_or_path: bert-base-cased
```
</details>
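
If you want to check which keys a set of overrides changed, one option is to flatten two composed configs and compare them key by key. The helper below is a small illustrative sketch rather than part of the library: `flatten` and `config_diff` are hypothetical names, and the two dictionaries only reproduce the trainer keys that change in the sharded-training diff above (Python prints `None` where the YAML shows `null`).

```python
def flatten(config, prefix=""):
    """Turn a nested dict into {'a.b.c': value} pairs."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def config_diff(before, after):
    """List removed (-) and added (+) entries between two configs."""
    old, new = flatten(before), flatten(after)
    lines = []
    for path in sorted(old.keys() | new.keys()):
        if path not in new:
            lines.append(f"- {path}: {old[path]}")
        elif path not in old:
            lines.append(f"+ {path}: {new[path]}")
        elif old[path] != new[path]:
            lines.append(f"- {path}: {old[path]}")
            lines.append(f"+ {path}: {new[path]}")
    return lines


before = {
    "trainer": {"gpus": None, "accelerator": None, "precision": 32, "plugins": None}
}
after = {
    "trainer": {
        "gpus": 1,
        "accelerator": "ddp",
        "precision": 16,
        "plugins": {"_target_": "pytorch_lightning.plugins.DDPShardedPlugin"},
    }
}

for line in config_diff(before, after):
    print(line)
```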

#### Enable [DeepSpeed ZeRO](https://pytorch-lightning.readthedocs.io/en/latest/advanced/multi_gpu.html#deepspeed-zero-stage-2) Training.
```bash
python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=ddp \
    trainer/plugins=deepspeed
```

<details>
  <summary>See the changed Hydra config under-the-hood</summary>
Without modifying any code, the config is updated automatically for DeepSpeed:

```diff
 optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
 trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   val_check_interval: 1.0
   flush_logs_every_n_steps: 100
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   ...
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DeepSpeedPlugin
+    stage: 2
+    cpu_offload: true
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
...
```
</details>

#### Train with a pre-trained [t5-base](https://huggingface.co/t5-base) backbone, on the [XSUM](https://huggingface.co/datasets/xsum) dataset using the Summarization task.
```bash
python train.py \
    task=nlp/summarization \
    dataset=nlp/summarization/xsum \
    backbone.pretrained_model_name_or_path=t5-base
```

#### Train with a pre-trained [mt5-base](https://huggingface.co/google/mt5-base) backbone, on the [WMT16](https://huggingface.co/datasets/wmt16) dataset using the Translation task with 2 GPUs.
```bash
python train.py \
    task=nlp/translation \
    dataset=nlp/translation/wmt16 \
    backbone.pretrained_model_name_or_path=google/mt5-base \
    trainer.gpus=2
```
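
The recipes above all share one shape: `task=nlp/<task>`, `dataset=nlp/<task>/<dataset>`, followed by optional `key=value` overrides. If you launch many runs from a script, a small helper can assemble the command line for you. This is a convenience sketch, not part of the library; `build_command` and its parameters are hypothetical names.

```python
import shlex


def build_command(task, dataset, overrides=None, domain="nlp", script="train.py"):
    """Assemble a `python train.py ...` command string for one run."""
    args = [
        "python",
        script,
        f"task={domain}/{task}",
        f"dataset={domain}/{task}/{dataset}",
    ]
    # Extra overrides are appended in insertion order as key=value pairs.
    args += [f"{key}={value}" for key, value in (overrides or {}).items()]
    # Quote each argument so the string is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_command(
    "translation",
    "wmt16",
    overrides={
        "backbone.pretrained_model_name_or_path": "google/mt5-base",
        "trainer.gpus": 2,
    },
)
print(command)
```

The printed string matches the WMT16 translation recipe above, written on a single line.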

### Custom Files & Datasets

You can train, validate and test Lightning Transformers tasks on your own data files, and you can extend the datasets to add custom processing or to support your own tasks.

#### [How to train, validate and test on custom files](https://lightning-transformers.readthedocs.io/tasks/advanced/custom_data.html)
#### [How to extend datasets](https://lightning-transformers.readthedocs.io/tasks/advanced/custom_datasets.html)

### Custom Tasks

#### [Extending the Language Modeling Task](https://lightning-transformers.readthedocs.io/tasks/advanced/custom_task.html)

## Contribute

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

## Community
For help or questions, join our huge community on [Slack](https://join.slack.com/t/pytorch-lightning/shared_invite/zt-f6bl2l0l-JYMK3tbAgAmGRrlNr00f1A)!

## License
Please observe the Apache 2.0 license that is listed in this repository. In addition, the Lightning framework is Patent Pending.


