From a Million Words to Fifty Thousand

April 3, 2019 by Ian A. Henderson

How translation memory cuts costs and elevates Global Content

As digital information expands, translation memory (TM) evolves with it, and today TM systems are the most widely used translation applications in the world. Building and maintaining a TM system is a complex undertaking that requires a particular skill set.

What is translation memory? In short, translation memory is a comprehensive database that stores previous translations so they can be reused in new text. The system suggests past translations automatically, and the translator assesses whether each suggestion is appropriate for the text they’re adapting.

Uwe Reinke of Cologne University of Applied Sciences explains it as follows:

“The idea behind its core element, the actual “memory” or translation archive, is to store the originals and their human translations of e-content in a computer system, broken down into manageable units, generally one sentence long. Over time, enormous collections of sentences and their corresponding translations are built up in the systems.”

This process not only saves time and effort, but also maintains a high level of quality and consistency across Global Content projects.

The key benefits of translation memory

  • Cumulative savings

    A TM database “learns” from previous projects. When you begin a new one, the new text is segmented and analyzed against past translations to produce matches in your database. Over time, the accumulation of translation memory “knowledge” decreases costs on future translations, while expanding the depth of your text database.

  • Quick turnaround

    Rubric was tasked with delivering a new level of weather personalization and global localization with AccuWeather’s Universal Forecast Database. From English to Korean, Rubric was able to reduce 1,000,000 words for translation to just 50,000. And though it took a year, we consider the completion of a project this vast to be a quick turnaround. For more on the AccuWeather project, keep reading.

  • Superior translations

    TM also aids in a translator’s accuracy and output. By aligning your business’s vocabulary, tone, and style, you give a translator the foundation they need to produce high-quality translations.
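
The segment-and-match step described under “Cumulative savings” can be sketched in a few lines of Python. This is a minimal illustration, not Rubric’s tooling: the `tm` dictionary, the `segment` and `analyze` helpers, and the sample sentences are all hypothetical, and real TM systems use far more careful segmentation rules than the naive sentence split assumed here.

```python
import re

# Hypothetical translation memory: source sentence -> stored translation.
tm = {
    "Press the power button.": "Drücken Sie die Ein/Aus-Taste.",
    "Close the lid.": "Schließen Sie den Deckel.",
}

def segment(text):
    """Split text into sentence-sized segments after ., ! or ? (naive rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def analyze(text, memory):
    """Return (segments already in the TM, segments still to translate)."""
    matched, new = [], []
    for seg in segment(text):
        (matched if seg in memory else new).append(seg)
    return matched, new

new_text = "Press the power button. Wait ten seconds. Close the lid."
matched, new = analyze(new_text, tm)
print(len(matched), "segments reused;", len(new), "to translate:", new)
```

Once the new segment is translated and added to `tm`, the next project that contains it would count it as reused, which is where the cumulative saving comes from.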

The role of machine translation in translation memory

Simply put, machine translation (MT) is the automation of the translation process by computer. Where translation memory relies on a human translator to accept or adapt each suggestion, machine translation generates the translation itself. It is used in combination with TM to hasten project delivery by reducing the amount of human input needed.

There are a number of MT engines available:

  • Generic

    Google Translate, Bing, and similar engines are grouped here. These platforms provide quick translations to millions of people around the world and can be purchased by companies for API integration into their systems.

  • Customizable

    A customizable MT engine can be tuned to handle a business’s vocabulary more accurately within a specific field, be it medical, legal, or financial. It can factor in a company’s own style and lexicon too.

  • Adaptive

    Introduced by Lilt in 2016, followed by SDL a year later, adaptive MT has greatly improved a translator’s output and is expected to challenge TM in the coming years.

In all cases, MT will attempt to create translated sentences from what it’s learned. For example, it may parse two or three TM matches and automatically combine them to complete a sentence. The result is often the kind of garbled, ungrammatical translation Google Translate produces at times. Because of this risk, a human translator should be available to audit and edit the results for project success.

Gaining efficiencies from large, repetitive texts such as product catalogues is an art that Rubric excels at. We analyze and filter texts to break them down into their component phrases and reduce the unique text for translation. Here’s how we introduce the human element into the act of translation.

How does Rubric use translation memory?

We briefly mentioned our involvement in AccuWeather’s Universal Forecast Database. Through content analysis and manipulation, we were able to translate an exhaustive database of weather phrases that combine to form forecasts such as “sunny, mostly clear, with changing clouds in the afternoon”. Because the component phrase “sunny” was repeated in the file thousands of times, we wanted to ensure we leveraged one translation for all of the repetitions to save costs. We achieved this by translating the above example phrase and “sunny” separately.

Translators were then able to focus on the unique component phrases, while checking them against full weather forecast phrases for grammatical accuracy. With this approach we were able to reduce the scope of the database project from 1,000,000 words to around 50,000. The resultant savings in both cost and time were staggering.
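
To see why translating component phrases once shrinks the workload, here is a small Python sketch. It is only an illustration under stated assumptions: the `forecasts` list is invented, and the idea that components are separated by ", " is a simplification, since the article does not describe how the real database was structured or filtered.

```python
# Hypothetical forecast strings built from a small set of repeated phrases.
forecasts = [
    "sunny, mostly clear, with changing clouds in the afternoon",
    "sunny, windy",
    "mostly clear, windy",
    "sunny, with changing clouds in the afternoon",
]

def word_count(phrases):
    """Total number of whitespace-separated words across all phrases."""
    return sum(len(p.split()) for p in phrases)

# Assumption: component phrases are separated by ", ".
components = [part for f in forecasts for part in f.split(", ")]
unique_components = sorted(set(components))

total_words = word_count(components)
unique_words = word_count(unique_components)
print(f"{total_words} words in the full text, "
      f"{unique_words} words once each component is translated only once")
```

The gap between the two counts grows with the amount of repetition, which is consistent with the scale of reduction reported for the forecast database. As the article notes, the translated components still need to be checked in the context of full forecasts for grammatical accuracy.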

Translation memory can also return previous translations whose source text is identical to the new text (an exact match) or only partially matches it (a fuzzy match). In either case, the TM will propose any matching database entries for the translator to use as they see fit.
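
Identical and partial matches can be illustrated with a short Python sketch. It assumes a character-based similarity score from the standard library’s `difflib.SequenceMatcher` and an arbitrary threshold of 0.75; commercial TM tools use their own scoring methods, and the `tm` entries and `propose` helper are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segment -> stored translation.
tm = {
    "Sunny in the morning": "Ensoleillé le matin",
    "Cloudy in the afternoon": "Nuageux l'après-midi",
}

def propose(segment, memory, threshold=0.75):
    """Return (score, source, translation) tuples at or above threshold, best first."""
    hits = []
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            hits.append((score, source, target))
    return sorted(hits, reverse=True)

exact = propose("Sunny in the morning", tm)   # identical source text
fuzzy = propose("Sunny in the evening", tm)   # partially matching source text

for score, source, target in exact + fuzzy:
    print(f"{score:.0%}  {source!r} -> {target!r}")
```

A score of 100% means the stored translation can usually be reused as is, while a lower score signals that the translator should adapt the proposal before accepting it.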

TM can also be programmed to store translations by product. This is vital when you have a new product and need to prioritize multiple product TMs, so that the most appropriate of several candidate translations is proposed first. For example, you may need to choose between Windows XP and Windows 8 terminology, or between Android and iOS terminology.
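
One simple way to picture prioritized product TMs is an ordered lookup, sketched below in Python. The product names, the entries, and the `lookup` helper are all hypothetical, and the translations are illustrative rather than verified platform terminology.

```python
# Hypothetical per-product translation memories.
product_tms = {
    "product_a": {"Home screen": "Startbildschirm", "Share": "Teilen"},
    "product_b": {"Home screen": "Home-Bildschirm", "Notifications": "Mitteilungen"},
}

def lookup(segment, tms, priority):
    """Return (product, translation) from the first TM in priority order
    that contains the segment, or None if no TM has it."""
    for product in priority:
        translation = tms[product].get(segment)
        if translation is not None:
            return product, translation
    return None

# For a new product closest to product_b, consult its TM first.
priority = ["product_b", "product_a"]
for segment in ["Home screen", "Share", "Battery"]:
    print(segment, "->", lookup(segment, product_tms, priority))
```

Reordering `priority` changes which terminology wins when both TMs contain the same source segment, while segments found in only one TM are still reused.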

 

 

Rubric is a customer-centric Global Content Partner. We partner with multinational companies to help them achieve their global strategy goals. Need help expanding globally? A trusted Global Content Partner will guide, expand, and strengthen the quality and impact of your translation. Sign up for a two-day workshop where we’ll analyze actual content examples from your business to show you how we can house, maintain, and manipulate your TMs in a structured, consistent way across markets.


Ian A. Henderson

Ian is co-founder of Rubric. During the last 25 years, Ian has partnered with Rubric customers to deliver relevant Global Content to their end users, enabling them to reap the rewards of globalization, benefit from agile workflows, and guarantee the integrity of their content. Prior to founding Rubric, Ian worked as a software engineer for Siemens in Germany.
