A comparison of translation performance between DeepL and Supertext

As strong machine translation (MT) systems are increasingly based on large language models (LLMs), reliable quality benchmarking requires methods that capture their ability to leverage extended context. This study compares two commercial MT systems, DeepL and Supertext, by assessing their performance on unsegmented texts. We evaluate translation quality across four language directions, with professional translators assessing each segment in full document-level context. While segment-level assessments indicate no strong preference between the systems in most cases, document-level analysis reveals a preference for Supertext in three out of four language directions, suggesting superior consistency across longer texts. We advocate for more context-sensitive evaluation methodologies to ensure that MT quality assessments reflect real-world usability.

Read the entire research paper on arXiv.
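
To see how a segment-level view and a document-level view of the same ratings can diverge, consider the toy sketch below. It is not the procedure used in the study: the ratings are invented, the labels "A" and "B" are hypothetical placeholders rather than the two systems compared, and averaging scores per document is only one assumed way to aggregate.

```python
from collections import Counter
from statistics import mean

# Hypothetical ratings, invented for illustration only:
# (document_id, score_for_system_A, score_for_system_B), one tuple per segment.
RATINGS = [
    ("doc1", 4, 5), ("doc1", 4, 5), ("doc1", 5, 4),
    ("doc2", 3, 4), ("doc2", 4, 3), ("doc2", 4, 5),
    ("doc3", 5, 2), ("doc3", 5, 3), ("doc3", 4, 4),
]


def segment_level_preference(ratings):
    """Count which system wins each individual segment."""
    counts = Counter()
    for _, score_a, score_b in ratings:
        if score_a > score_b:
            counts["A"] += 1
        elif score_b > score_a:
            counts["B"] += 1
        else:
            counts["tie"] += 1
    return counts


def document_level_preference(ratings):
    """Average each system's scores per document, then count document wins."""
    per_doc = {}
    for doc_id, score_a, score_b in ratings:
        per_doc.setdefault(doc_id, []).append((score_a, score_b))

    counts = Counter()
    for pairs in per_doc.values():
        mean_a = mean(a for a, _ in pairs)
        mean_b = mean(b for _, b in pairs)
        if mean_a > mean_b:
            counts["A"] += 1
        elif mean_b > mean_a:
            counts["B"] += 1
        else:
            counts["tie"] += 1
    return counts


if __name__ == "__main__":
    print("segment level: ", dict(segment_level_preference(RATINGS)))
    print("document level:", dict(document_level_preference(RATINGS)))
```

With these made-up numbers the two systems win the same number of segments, yet one of them is preferred in two of the three documents. The example shows only that the unit of analysis can change the picture. It says nothing about the actual results reported in the paper.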

