
Alex Flückiger

Research

February 5, 2025

A comparison of translation performance between DeepL and Supertext



As strong machine translation (MT) systems are increasingly based on large language models (LLMs), reliable quality benchmarking requires methods that capture their ability to leverage extended context. This study compares two commercial MT systems, DeepL and Supertext, by assessing their performance on unsegmented texts. We evaluate translation quality across four language directions, with professional translators assessing each segment in full document-level context. While segment-level assessments indicate no strong preference between the systems in most cases, document-level analysis reveals a preference for Supertext in three out of four language directions, suggesting superior consistency across longer texts. We advocate for more context-sensitive evaluation methodologies to ensure that MT quality assessments reflect real-world usability.



Read the entire research paper on arXiv.
