
CAT Optimization


MS Office
CAT: Trados Studio, MemoQ, Idiom, Star Transit
OS: Windows, MacOS

Business Hours

Monday to Friday
9:00 a.m. to 12:00 noon and 1:00 p.m. to 5:30 p.m. Indochina Time (2:00 a.m. to 5:00 a.m. and 6:00 a.m. to 10:30 a.m. UTC)


Bank transfer to Vietnam, PayPal, credit card

When converting, we pay special attention to optimizing the document for CAT use as well as applying a logical structure to reduce the post-formatting workload after translation.

CAT Optimization

  • All formatting is applied manually (cf. note on automatic conversion below).
  • We make sure the text flow is correct and avoid excess line breaks, which would result in wrong segmentation by CAT tools. This is particularly important since some applications such as Trados Studio 2009-2011 do not allow merging segments that are separated by line breaks.
  • Minimalist approach – we avoid unnecessarily complicated formatting such as nested structures (tables within tables, tables within frames, etc.), which in the worst case can cause errors in CAT tools.
  • Consistent styles-based formatting – differences in formatting can prevent CAT applications from recognizing repetitions, which leads to a higher no-match word count. This effect is often underestimated and can inflate costs considerably: this sample Word file, for example, contains 5 identical sentences that differ only in minor, hard-to-notice formatting details. An analysis with an empty translation memory nevertheless returns nothing but "no matches" (Trados 7).
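
The effect of inconsistent formatting on repetition counts can be illustrated with a minimal Python sketch. It is not how Trados or any other CAT tool works internally; the run model, the style names and the `tagged` serialization are assumptions made purely for illustration. The point is only that a comparison that includes inline formatting sees three different segments where a reader sees one sentence three times.

```python
from collections import Counter

# Hypothetical model: a segment is a list of (text, style) runs.
# The style names are invented for this illustration.
segments = [
    [("Press the Start button.", "normal")],
    [("Press the ", "normal"), ("Start", "font-10.5pt"), (" button.", "normal")],
    [("Press the Start", "normal"), (" button.", "kerning-0.1pt")],
]

def tagged(segment):
    """Serialize runs with inline tags wherever the style deviates from 'normal'."""
    parts = []
    for text, style in segment:
        parts.append(text if style == "normal" else f"<{style}>{text}</{style}>")
    return "".join(parts)

def plain(segment):
    """Serialize runs as bare text, ignoring all formatting."""
    return "".join(text for text, _ in segment)

def count_repetitions(keys):
    """Count every occurrence after the first of each distinct key."""
    counts = Counter(keys)
    return sum(n - 1 for n in counts.values())

# Compared with tags, no segment repeats; compared as plain text, two do.
print(count_repetitions(tagged(s) for s in segments))
print(count_repetitions(plain(s) for s in segments))
```

With consistent styles, all three segments serialize identically, so the second and third occurrence would be counted as repetitions rather than as new words.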

Measures to reduce the post-formatting workload

  • In order to minimize the post-formatting workload, we format the document logically. For example, a table is used only if the text is logically divided into cells, while columns are employed if the text at the end of one column continues at the top of the next column. Images that are used within a sentence ("click button [image] to load a file") are inserted as inline, not floating pictures. These and other measures make it likely that the target language text is automatically placed in the correct position. Illogical formatting, on the other hand, can lead to nasty surprises after the target files are generated. A classic "sin" is the use of tab characters where a table would have been appropriate. Because the target text differs in length from the source text, such structures almost always fall apart, and fixing them requires cumbersome and time-consuming manual work.
  • We use MS Word Styles to structure documents. This allows generating an automatic table of contents and makes adjustments easier.
  • To make image annotations editable, we usually employ grouped text boxes, meaning the image can still be moved together with all its annotations. To take into account that the target language text might be longer than the source text, we use text boxes that are larger than required by the source text, as long as they don't interfere with the graphical elements of the picture. This decreases the probability that the text boxes need to be ungrouped and resized after translation.
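
Why tab-aligned layouts fall apart after translation can be shown with a small Python sketch. It simplifies heavily: it assumes a monospaced font and fixed tab stops every 8 characters, and the German labels are an invented example. In a tab layout, the start position of the second column depends on the length of each label, whereas in a table the column is sized once for all rows.

```python
TAB = 8  # assumed tab stop interval, in characters

def tab_positions(rows, tab=TAB):
    """Position at which the value starts when label and value are separated by a tab."""
    return [len((label + "\t").expandtabs(tab)) for label, _ in rows]

def table_positions(rows, padding=2):
    """Position at which the value starts when the first column is sized for the longest label."""
    width = max(len(label) for label, _ in rows) + padding
    return [width for _ in rows]

source = [("Name:", "Pump"), ("Weight:", "12 kg")]
target = [("Name:", "Pumpe"), ("Gewicht:", "12 kg")]  # illustrative translation

print(tab_positions(source))    # both values start at the same tab stop
print(tab_positions(target))    # the longer label pushes its value to the next tab stop
print(table_positions(target))  # the table column stays aligned
```

In the source, both labels are shorter than one tab stop, so the values line up. In the target, "Gewicht:" fills the first tab stop exactly, and its value jumps to the next one, which is the kind of damage that has to be repaired by hand.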

Automatic Conversion

Logitope uses automatic conversion tools only as a preliminary step in order to read scanned text (OCR). All formatting is then erased and re-applied manually. Automatic conversion tools sometimes achieve a visually very accurate reproduction of the original, but only at the price of excessive use of formatting attributes and an extremely complicated and usually illogical structure, which makes adjustments frustrating and time-consuming. In many cases, the original text flow is destroyed, for example because words that syntactically belong together are placed in different text boxes. An additional problem with regard to tag-based CAT tools (such as Trados TagEditor, Trados Studio, MemoQ, Idiom, Star Transit, etc.) is that the excessive use of formatting attributes by automatic conversion software can result in a very large number of tags that make translation all but impossible (see picture below).

Broken text flow and tags in Trados Studio 2009
resulting from automatic conversion
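
The effect of a broken text flow on segmentation can be sketched in a few lines of Python. The segmentation rule used here (each paragraph is split at sentence-final punctuation) is a deliberately naive assumption and not the rule set of any particular CAT tool, and the sample text is invented. It shows how text that a converter has cut at the end of every visual line produces fragments instead of sentences.

```python
import re

def segment(paragraphs):
    """Naive segmentation: every paragraph ends a segment, and within a
    paragraph a new segment starts after '.', '!' or '?' plus whitespace."""
    segments = []
    for para in paragraphs:
        segments.extend(s for s in re.split(r"(?<=[.!?])\s+", para.strip()) if s)
    return segments

# Assumed converter output: one paragraph (or text box) per visual line.
converted = ["Open the cover and", "remove the filter. Clean", "it with water."]

# The same text after the text flow has been restored manually.
cleaned = [" ".join(converted)]

print(segment(converted))  # four fragments, none of them a usable sentence pair
print(segment(cleaned))    # two complete sentences
```

Fragments such as "Clean" or "it with water." are hard to translate in isolation and rarely match anything in a translation memory.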

Automatic conversion software is valuable for certain purposes, in particular optical character recognition. Logitope uses ABBYY FineReader and Nuance OmniPage.