Understanding the concepts behind machine translation
NEUROTECH AFRICA
We are a tech company that focuses on helping businesses thrive through AI and data-driven solutions
This article was initially published on the Neurotech Africa blog.
In this article, you will learn about the concept of Machine Translation, including its background, its types, the technologies behind it, and its current state.
"Machine Translation is wonderful technology but it is not wonderful to trust", by Mohd Mustafa.
Machine Translation started around the 1950s and involved a lot of manual processing. Limited computing power, data availability, and storage capacity made the work really challenging.
Around the 2000s, developers began using statistical databases to teach computers to translate text, but the work still required a lot of manual labor.
Around 2016, developers at Google came up with the exciting idea of using neural learning models and Artificial Intelligence to train translation engines. The resulting machine translation engine showed a significant improvement over earlier engines: text translation became both more effective and faster across many languages.
Neural Machine Translation proved so effective that Google changed course and adopted it as its primary development model, followed by Microsoft and Amazon.
What is Machine Translation (MT)?
Machine Translation is the process of using Artificial Intelligence to automatically translate content from one language to another without human input.
Language software that learns over time and can be customized to include static business nomenclature is an asset. Machine translation can save significant time, as it is capable of translating entire text documents in seconds. However, please bear in mind that human translators should always post-edit the translations produced by MT.
Employees can communicate and collaborate across time zones with MT software. With a shared knowledge of corporate terminology, the likelihood of judgment errors is diminished.
Most MT software also provides consistent translations. Human translations often reflect feelings and opinions, and the sentiment can change depending on who is doing the translation.
Types of Machine Translations
The honest truth about the best area to apply MT: more structured content, such as technical documentation and intellectual property, works better with MT.
For colloquial content, such as marketing, branding, or other customer-facing content, MT is optional, simply because the results will need more human editing.
Machine Translation tools vary according to their use cases. Selecting the right MT tool for your business depends on the use case, budget, and computing power. Some MT tools are too expensive, and you can incur a cost that will not add profit to your business.
Understanding the types of MT will help you select the right one for your use case. Let's look at what those types are:
Language barriers affect various business activities. As the world becomes smaller with technology, businesses encounter difficulty accommodating the needs of an increasingly international consumer base. Hiring a translation company and translators can be expensive, but utilizing technology to perform document translation services is a cost-effective option for increasing understanding and promoting inclusivity.
How does Machine Translation Work?
It is very interesting to understand how Machine Translation engines such as Masakhane translate, Google Translate, Amazon, and Microsoft Translator work.
We will look at Neural Machine Translation, currently the most widely used MT technology in the world.
Neural Machine Translation is a single system that can be trained directly on source and target text, without the specialized subsystems that Statistical Machine Translation (SMT) needs.
In simple words, to teach a machine how to translate, you need data: a collection of millions of sentences in the languages you want to work with, each paired with its correct translation. You feed those sentence pairs into a Neural Network, and it learns how to translate from the examples. So for a translator to become smart, it has to be trained on millions of sentences.
Sounds easy?
Oops! No, it is very technical. Let's see what is behind this.
Every language has 2 important components: tokens and grammar.
How about "It is Sunny"? The sentence has only 3 tokens, which are "It", "is", and "Sunny". If languages depended only on tokens and grammar could be ignored, language translation would be much easier.
Grammar is a sensitive case in language translation. It involves syntax analysis and semantic analysis, and this is where the complexity of translation begins, simply because languages differ in their syntax and semantics.
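As a rough illustration of what "tokens" means here, the sketch below splits a sentence into word tokens with a regular expression. The function name tokenize is made up for this example, and real translation engines typically use more sophisticated (often subword) tokenizers, so treat it as a simplified assumption rather than how any particular engine works.

```python
import re

def tokenize(sentence):
    """Split a sentence into word tokens and standalone punctuation marks."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation character.
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("It is Sunny")
print(tokens)       # ['It', 'is', 'Sunny']
print(len(tokens))  # 3
```

Counting tokens is the easy part; nothing in this step knows anything about grammar.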
But do computers understand human language grammar the same way as we humans do?
The answer is no, simply because computers only understand numbers.
Instead of you defining the grammar for the computer, Neural Networks do it for you. A Neural Network is able to learn the patterns in data and to translate from a source language (for example, Swahili) to a target language (for example, English).
Inputs and outputs are both sentences, but the computer handles them as numeric values. First of all, the input sentence is converted into numeric form (vectors and matrices).
{Sentence (Swahili) — to — Vector form}
The resulting vector should then be converted into the second language (English).
{Vector form — to — Sentence(English)}
This process is called the encoder-decoder architecture.
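To make the "sentence to numbers and back" idea a bit more concrete, here is a minimal sketch of only the numeric conversion at both ends. It is not a translation model: the trained encoder and decoder networks that sit in the middle are left out, and the target ids are hard-coded as a stand-in for what a trained decoder might produce. The helper names (build_vocab, encode, one_hot, decode) and the tiny two-sentence vocabularies are assumptions made up for this example.

```python
def build_vocab(sentences):
    """Assign each distinct lowercase token an integer id, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Turn a sentence into a list of token ids."""
    return [vocab[token] for token in sentence.lower().split()]

def one_hot(index, size):
    """Represent one token id as a vector with a single 1."""
    return [1 if i == index else 0 for i in range(size)]

def decode(ids, vocab):
    """Turn a list of token ids back into a sentence."""
    id_to_token = {i: token for token, i in vocab.items()}
    return " ".join(id_to_token[i] for i in ids)

# Toy vocabularies (assumed for illustration only).
swahili_vocab = build_vocab(["ninapenda chai"])
english_vocab = build_vocab(["i like tea"])

# Sentence (Swahili) to vector form.
source_ids = encode("ninapenda chai", swahili_vocab)
source_vectors = [one_hot(i, len(swahili_vocab)) for i in source_ids]
print(source_ids)      # [0, 1]
print(source_vectors)  # [[1, 0], [0, 1]]

# Vector form to sentence (English). These ids are hard-coded here;
# in a real system they would come from a trained decoder network.
target_ids = [0, 1, 2]
print(decode(target_ids, english_vocab))  # i like tea
```

Everything the network learns happens between these two steps, on the numbers rather than on the words themselves.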
This architecture was first built with Recurrent Neural Networks (RNN) and can be improved with other methods. A plain RNN reads the sentence in one direction, so it cannot check what comes after a token, and it tends to lose earlier context, which makes translations imperfect. An improvement on the plain RNN is Long Short-Term Memory (LSTM), which is able to remember earlier tokens for longer but still gets confused with long sentences.
A further improvement is the Bidirectional Recurrent Neural Network. Instead of running an RNN only in the forward mode starting from the first token, we start another one from the last token, running from back to front. Bidirectional RNNs add a hidden layer that passes information in the backward direction, so each token can be processed with context from both sides.
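The forward and backward passes can be sketched with a deliberately tiny RNN that has a single hidden unit and fixed weights. The weights w_in and w_rec, the function names, and the use of plain numbers as inputs are all assumptions chosen to keep the example short; a real model learns its weights and works on vectors.

```python
import math

def rnn_states(inputs, w_in=0.5, w_rec=0.8):
    """Run a one-unit tanh RNN over the inputs and return the hidden state after each step."""
    h = 0.0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def bidirectional_states(inputs):
    """Pair each position's forward state with its backward state."""
    forward = rnn_states(inputs)
    # Run a second pass from the last input to the first, then restore the original order.
    backward = rnn_states(inputs[::-1])[::-1]
    return list(zip(forward, backward))

a = bidirectional_states([1.0, 0.0, 0.0])
b = bidirectional_states([1.0, 0.0, 2.0])  # same start, different last input

# The forward state at position 0 cannot see the later change...
print(a[0][0] == b[0][0])  # True
# ...but the backward state at position 0 can.
print(a[0][1] == b[0][1])  # False
```

The two sequences differ only in their last element, and only the backward pass carries that difference back to the first position.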
Then, let's finalize with the attention mechanism. The attention mechanism is a part of a neural architecture that dynamically highlights relevant features of the input data, which, in NLP, is typically a sequence of textual elements. It can be applied directly to the raw input or to its higher-level representation.
A neural network is considered to be an effort to mimic human brain actions in a simplified manner. The attention mechanism is an attempt to implement, in deep neural networks, the same action of selectively concentrating on a few relevant things while ignoring others.
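One common way to express this "concentrate on a few relevant things" idea is dot-product attention: score every input position against a query, turn the scores into weights with a softmax, and take the weighted average of the inputs. The sketch below is a minimal pure-Python version under that assumption; the function names and the toy vectors are invented for illustration, and production models use learned projections and batched matrix operations instead.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Return (weights, context): how much each position matters, and the weighted average of values."""
    # Dot product between the query and each key measures relevance.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Three input positions; the query matches the first key best.
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [0.0, 10.0]]
weights, context = attend([1.0, 0.0], keys, values)

print([round(w, 3) for w in weights])  # [0.576, 0.212, 0.212]
```

The first position receives the largest weight because its key lines up with the query, while the other two are down-weighted rather than discarded entirely.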
Final Thoughts
As translation tools become more reliable, there will be more competition for translation agencies to deliver better quality and faster turnaround translations.
This means marketing and sales will be needed to stay competitive in the market. Being able to sell your services will be crucial to convincing clients to choose your agency instead of your competitors.
Translation agencies will always be needed to provide accurate services and proofreading in order to eliminate errors, simply because most existing translation tools are limited by the amount of data they have been trained on.
Also, when it comes to confidential information such as contracts and medical documents, companies are not ready to expose their data online.
"Words travel worlds. Translators do the driving", by Anna Rusconi.