7 tokenizers. 5 languages. 3,500 data points.
Yorùbá costs up to 3.5× as much to process as English.
Same meaning. Same API. Different price.
Cost basis: GPT-4o · $2.50 per 1M input tokens · ₦1,350 per $1
1.0× = same as English. Higher = more expensive to process.
| Language | GPT-4 | GPT-4o | Llama 3 | Gemma 2 | Mistral | BERT-ML | XLM-R |
|---|---|---|---|---|---|---|---|
| Yorùbá | 3.4× | 2.4× | 3.1× | 2.9× | 3.5× | 2.4× | 2.8× |
| Igbo | 2.4× | 1.5× | 2.3× | 2.3× | 2.5× | 2.2× | 2.3× |
| Hausa | 1.8× | 1.5× | 1.8× | 1.6× | 1.9× | 1.5× | 1.3× |
| Pidgin | 1.1× | 1.1× | 1.1× | 1.1× | 1.1× | 1.1× | 1.0× |
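
How a multiplier like these turns into a price can be sketched roughly as follows. This is a minimal sketch, not the method behind the table: it assumes each multiplier is total tokens in the target language divided by total tokens for the same sentences in English, and it reuses the GPT-4o price and exchange rate quoted above. The names `token_ratio` and `input_cost_ngn` are hypothetical, and the whitespace tokenizer in the demo is only a stand-in for a real one.

```python
from typing import Callable, Sequence

# Figures quoted above; treat them as assumptions that will drift over time.
USD_PER_MILLION_INPUT_TOKENS = 2.50  # GPT-4o input price
NGN_PER_USD = 1350.0                 # exchange rate


def token_ratio(
    count_tokens: Callable[[str], int],
    target_texts: Sequence[str],
    english_texts: Sequence[str],
) -> float:
    """Assumed definition: total target tokens / total English tokens
    over parallel (same-meaning) sentences."""
    target_total = sum(count_tokens(t) for t in target_texts)
    english_total = sum(count_tokens(t) for t in english_texts)
    if english_total == 0:
        raise ValueError("English texts produced zero tokens")
    return target_total / english_total


def input_cost_ngn(english_tokens: int, ratio: float) -> float:
    """Naira cost of a prompt that would take `english_tokens` tokens in
    English, scaled by the language's multiplier."""
    usd = english_tokens * ratio / 1_000_000 * USD_PER_MILLION_INPUT_TOKENS
    return usd * NGN_PER_USD


if __name__ == "__main__":
    # Stand-in tokenizer: counts whitespace-separated words, NOT real tokens.
    def count_words(text: str) -> int:
        return len(text.split())

    print(token_ratio(count_words, ["a b c d"], ["a b"]))  # 2.0

    # One million English-equivalent tokens at 1.0x and at the 2.4x
    # GPT-4o figure for Yorùbá from the table.
    print(f"₦{input_cost_ngn(1_000_000, 1.0):,.0f}")  # ₦3,375
    print(f"₦{input_cost_ngn(1_000_000, 2.4):,.0f}")  # ₦8,100
```

Any callable that returns a token count can be passed as `count_tokens`, for example a wrapper around a real tokenizer's encode method, so the same function works for each of the seven tokenizers.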