PGLS Pulse: August 2024
Hello everyone,
Welcome to the PGLS Pulse: your source of timely and thoughtful news about the language services industry.
Navigating the Benefits and Limitations of Machine Translation
In the current AI landscape, Machine Translation (MT) has become a valuable tool for quickly translating numerous language pairs, leveraging advanced algorithms and sophisticated linguistic databases.
While MT offers significant benefits in speed and cost-effectiveness, public-facing MT engines have several limitations. Text submitted to these engines may be stored in ways that leave it open to unauthorized access. In addition, once data is uploaded to these services, users may lose control over how it is retained and used.
To address these concerns, private, enterprise-grade MT solutions provided by language services providers are recommended. These solutions can be customized with client-specific glossaries and translation memory (TM) to meet the unique needs of each client, offering better control over the translation process. This approach ensures higher data security and greater accuracy in translations, aligning with the specific requirements of the organization.
Industry News
ChatGPT, Translation, and Confidentiality — ‘We May Use the Data’
OpenAI does not provide specific terms of use for its consumer services, including ChatGPT, which creates ambiguity about how content submitted for translation is handled. Instead, OpenAI addresses this in its Data Control FAQs and linked documents, which reveal that data from non-API services like ChatGPT and DALL-E may be used to improve models.
Such services have already led to notable incidents. In 2023, Samsung employees leaked sensitive information via ChatGPT, and the company banned the tool later that year over security concerns. Similarly, Statoil found confidential texts it had submitted to Translate.com visible through Google searches. These incidents highlight the importance of understanding data usage policies for public-facing machine translation services.
Read more here.
For Linguists
How Effective Are Large Language Models in Low-Resource Language Translation?
A research paper published in September 2023 explores the translation capabilities of large language models (LLMs) like ChatGPT across 204 languages, including high- and low-resource languages (LRLs). The study highlights the neglect of LRLs in current machine translation (MT) systems, noting that many commercial systems either exclude them or perform poorly.
The findings reveal that while ChatGPT matches or exceeds traditional MT models for some high-resource languages, it generally underperforms for LRLs, particularly African languages, where it lagged in 84.1% of cases studied. The study also examined language features such as resources, family, and script to help users select appropriate MT systems.
The study underscores the importance of considering language-specific features and resources when selecting machine translation systems to ensure broader and more accurate linguistic coverage.
Read more here.
Researchers Improve AI Translation by Having Translators Give ‘Light-Weight’ Feedback
In April 2024, researchers from the University of Maryland found that providing fine-grained external feedback helps large language models (LLMs) improve their machine translation post-editing (MTPE) capabilities.
Then, in June 2024, another group of researchers demonstrated that "light-weight" feedback can effectively guide LLMs to self-correct translations, even in technical domains. They suggested a two-step method. First, translators mark mistakes in the machine-generated translations using <bad></bad> tags. Second, the tags help the model focus on the errors and find similar examples in a database of corrected translations, known as a post-editing translation memory (PE-TM), so that it can base its corrections on past examples.
Using in-line error tags helped models fix mistakes more accurately than translating from scratch or correcting without them. With error markings, 68% of corrections were accurate, compared to 32% without them.
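For readers who want a feel for how the two steps fit together, here is a minimal Python sketch. It is not the researchers' implementation: the function names, the prompt wording, the tiny example PE-TM, and the use of simple string similarity to find related examples are all assumptions made for illustration, and the step of sending the prompt to an LLM is left out.

```python
import re
from difflib import SequenceMatcher

# Matches every span a translator wrapped in <bad></bad>.
BAD_TAG = re.compile(r"<bad>(.*?)</bad>", re.DOTALL)

# Hypothetical post-editing translation memory (PE-TM): each entry pairs a
# source sentence with a marked-up machine translation and its correction.
# These sentences are invented for illustration only.
EXAMPLE_PE_TM = [
    {"source": "Tighten the bolt.",
     "marked_mt": "Apriete el <bad>cerrojo</bad>.",
     "post_edit": "Apriete el perno."},
    {"source": "Replace the filter.",
     "marked_mt": "Reemplace el <bad>filtrado</bad>.",
     "post_edit": "Reemplace el filtro."},
]


def extract_bad_spans(marked_mt):
    """Return the text of every span marked as an error."""
    return BAD_TAG.findall(marked_mt)


def strip_tags(marked_mt):
    """Remove the <bad></bad> tags but keep the text inside them."""
    return BAD_TAG.sub(r"\1", marked_mt)


def retrieve_examples(source, pe_tm, k=2):
    """Return the k PE-TM entries whose source is most similar to `source`.

    Assumption: plain character-level similarity stands in for whatever
    retrieval method a real system would use.
    """
    def score(entry):
        return SequenceMatcher(None, source, entry["source"]).ratio()

    return sorted(pe_tm, key=score, reverse=True)[:k]


def build_prompt(source, marked_mt, examples):
    """Assemble a few-shot correction prompt from retrieved examples."""
    lines = ["Correct only the spans marked with <bad></bad>.", ""]
    for ex in examples:
        lines += [
            f"Source: {ex['source']}",
            f"Marked MT: {ex['marked_mt']}",
            f"Corrected: {ex['post_edit']}",
            "",
        ]
    lines += [f"Source: {source}", f"Marked MT: {marked_mt}", "Corrected:"]
    return "\n".join(lines)
```

In this sketch, a translator's marked-up sentence would be passed to retrieve_examples along with the PE-TM, and the resulting prompt from build_prompt would then be sent to whichever LLM is in use.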
Read more here.
Job Opportunities
We are always looking for talented language professionals. If you are a translator, interpreter, or language instructor, PGLS wants to work with you! Explore our open roles here.
PGLS News
Make this Newsletter Better!
Help us make every newsletter worthwhile by sharing ideas and feedback in the comments. Your input shapes the newsletter!