Mastering Text Alignment
Text alignment of multilingual documents is one of the most widely used procedures in our industry. Alignment is a process that makes it possible to create translation memories from documents that have already been translated. This can be very useful for leveraging previous work and improving consistency and productivity. However, as we will see in this article, alignment has other uses.
This procedure is usually a double-edged sword: it helps us recover old translations that never went through a CAT tool, and thus reuse their content, but done incorrectly it can take so much time to obtain a reusable, good-quality alignment that it becomes completely counterproductive.
When I worked at Montero Language Services, besides my translation tasks, I also had other duties such as proofreading and editing. Over time, the system we worked with was computerized and updated, and almost everything went through a CAT tool. However, DOCX files still sometimes arrived to be reviewed against their original in PDF, as did DOCX documents that had already been translated but whose content the client had since modified, so that we had to introduce those changes into the translated document using a new original in PDF as reference. Obviously, working in Word was never an option for several reasons, but especially because I was not willing to give up automated linguistic QA, strain my eyes or, above all, lose the chance to have that work available for future projects. That's how my love-hate story with alignment began.
Tips to keep in mind before performing an alignment
How to align bilingual texts using OmegaT
Now that we have the previous points clear, let’s get to work.
1. The first thing is to get the documents into the same format and, in addition, into a format that is CAT-tool-friendly. If we have a PDF, we will use optical character recognition (OCR); if we have a CSV, we will convert it to XLSX; and so on for other file formats.
2. Next, we must format the documents so that they are as similar as possible. This means that, if we use OCR to convert two documents to DOCX, we have to edit and format both texts in Word so that those texts have exactly the same appearance.
3. The next step is optional, but I highly recommend it because it gives us full control over segmentation and tag handling. Use your CAT tool to create two projects, one for Text A and another for Text B, and obtain the corresponding XLIFF files from them. Thanks to this step, we can detect any discrepancy in the segmentation and mimic our CAT tool's tag handling.
4. Now, we open OmegaT and go to the Tools > Align Files... menu.
The OmegaT aligner will appear and we will only have to select the languages and the XLIFF files that we want to align.
When you click OK, the Autoaligner runs. There you have several options that you can modify until you get a more than decent automated result. Keep in mind that our XLIFF files were already segmented, so we uncheck the resegmentation option to stop OmegaT from resegmenting them and thus keep the original segmentation of the XLIFF files.
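Because the alignment relies on the segmentation already present in the XLIFF files, it can be worth checking up front that both files contain a similar number of segments. The following is only a minimal sketch in Python, not part of OmegaT: it assumes XLIFF 1.2 files in which each segment is a `trans-unit` with a `source` child, and the function names are hypothetical. Files exported by some CAT tools store their segments differently, so the logic may need adapting.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{urn:...}source' -> 'source'."""
    return tag.rsplit("}", 1)[-1]


def source_segments(xliff_text):
    """Return the text of every <source> that sits directly in a <trans-unit>.

    Assumption: one segment per trans-unit (plain XLIFF 1.2 layout).
    """
    root = ET.fromstring(xliff_text)
    segments = []
    for elem in root.iter():
        if local_name(elem.tag) != "trans-unit":
            continue
        for child in elem:
            if local_name(child.tag) == "source":
                # itertext() also collects text nested inside inline tags
                segments.append("".join(child.itertext()).strip())
    return segments


def compare_segmentation(xliff_a, xliff_b):
    """Compare how many segments the two XLIFF documents contain."""
    count_a = len(source_segments(xliff_a))
    count_b = len(source_segments(xliff_b))
    return {
        "count_a": count_a,
        "count_b": count_b,
        "difference": count_a - count_b,
    }
```

To use it, read each XLIFF file as bytes (for example with `pathlib.Path(...).read_bytes()`) and pass both to `compare_segmentation`. A non-zero `difference` does not say where the mismatch is, only that one of the documents was probably split or merged differently and deserves a look before aligning.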
In the second step of the alignment, OmegaT allows us to manually edit misalignment errors, and finally create our TMX ready to be used in our CAT tool and projects.
It is very important to point out that the alignment may still contain some errors, so it is advisable to review the TMX in a translation memory editor such as Heartsome TMX Editor, or to apply a penalty to the translation memory when using it, to avoid auto-population of erroneous translation units.
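As a complement to a manual review, a short script can point out the translation units that are most likely to be misaligned. The sketch below is one possible approach and not a feature of OmegaT or Heartsome: it assumes a standard TMX file with `tu`, `tuv` and `seg` elements, and it flags units whose target is empty or whose length is very different from the source. The function names and the 0.5 and 2.0 thresholds are illustrative assumptions that should be tuned for each language pair.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_units(tmx_text):
    """Return one {language: segment text} dict per <tu> in the TMX."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions use a plain 'lang' attribute
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            text = "".join(seg.itertext()).strip() if seg is not None else ""
            segs[lang] = text
        units.append(segs)
    return units


def flag_suspicious(units, src, tgt, low=0.5, high=2.0):
    """Flag units with an empty side or an unusual target/source length ratio.

    The thresholds are arbitrary starting points, not recommended values.
    """
    flagged = []
    for index, segs in enumerate(units):
        source = segs.get(src, "")
        target = segs.get(tgt, "")
        if not source or not target:
            flagged.append((index, "empty segment"))
            continue
        ratio = len(target) / len(source)
        if ratio < low or ratio > high:
            flagged.append((index, f"length ratio {ratio:.2f}"))
    return flagged
```

The language codes passed as `src` and `tgt` must match the ones written in the TMX exactly (for example `en-US` rather than `en`). A flagged unit is not necessarily wrong, and an unflagged one is not necessarily right; the list only helps decide where to look first in the TMX editor.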