Machine translation is the automated translation of a text, based on computer algorithms and without human involvement.
There are three main types of machine translation systems: rules-based, statistical, and neural.
Rules-based systems draw on linguistic information from dictionaries, together with grammar rules covering the main semantic, morphological, and syntactic structures of each language. This type of machine translation can be useful for technical texts, because specialist dictionaries can be built for specific industries or disciplines that use a lot of technical language or jargon.
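To make the idea concrete, here is a minimal sketch of a rules-based translator in Python. Everything in it is an illustrative assumption rather than a description of any real system: the tiny Spanish-to-English dictionary, the part-of-speech tags, and the single reordering rule (Spanish noun + adjective becomes English adjective + noun) are all invented for the example.

```python
# Hypothetical toy dictionary: Spanish word -> (English word, part of speech).
LEXICON = {
    "el": ("the", "DET"),
    "la": ("the", "DET"),
    "gato": ("cat", "NOUN"),
    "casa": ("house", "NOUN"),
    "negro": ("black", "ADJ"),
    "blanca": ("white", "ADJ"),
    "duerme": ("sleeps", "VERB"),
}


def translate_rule_based(sentence):
    """Translate word by word, then apply one hand-written reordering rule."""
    tokens = sentence.lower().split()
    # Dictionary lookup; unknown words pass through unchanged, tagged "UNK".
    tagged = [LEXICON.get(token, (token, "UNK")) for token in tokens]

    output = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        next_tag = tagged[i + 1][1] if i + 1 < len(tagged) else None
        if tag == "NOUN" and next_tag == "ADJ":
            # Syntactic rule: Spanish "NOUN ADJ" becomes English "ADJ NOUN".
            output.extend([tagged[i + 1][0], word])
            i += 2
        else:
            output.append(word)
            i += 1
    return " ".join(output)


print(translate_rule_based("El gato negro duerme"))
```

A specialist dictionary for a particular industry would, in this sketch, simply mean adding or overriding entries in `LEXICON`; a real system would need far more rules and a proper morphological analysis.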
Unlike rules-based systems, statistical systems have no knowledge of language rules. Instead, they "learn" to translate by analysing large amounts of data for each language pair. Typically, statistical systems deliver translations that sound more fluent than rules-based output, albeit less consistent and accurate.
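As a rough illustration of "learning" from data, the sketch below estimates word translation probabilities from a handful of sentence pairs using expectation-maximisation, in the style of IBM Model 1. This particular model is an assumption chosen for the example (and simplified, with no NULL token); the three-sentence German-English corpus and the function names are hypothetical, and a real system would train on millions of sentence pairs.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (source tokens, target tokens).
TOY_CORPUS = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]


def train_ibm1(pairs, iterations=10):
    """Estimate prob[(target, source)] = P(target word | source word) with EM."""
    source_vocab = {s for src, _ in pairs for s in src}
    target_vocab = {t for _, tgt in pairs for t in tgt}
    # Start with no linguistic knowledge: every pairing is equally likely.
    prob = {(t, s): 1.0 / len(target_vocab)
            for s in source_vocab for t in target_vocab}

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word's count among the source words
        # in the same sentence pair, weighted by the current probabilities.
        for src, tgt in pairs:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    fraction = prob[(t, s)] / norm
                    count[(t, s)] += fraction
                    total[s] += fraction
        # M-step: re-estimate the probabilities from the expected counts.
        for (t, s) in prob:
            prob[(t, s)] = count[(t, s)] / total[s]
    return prob


def most_likely(prob, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


model = train_ibm1(TOY_CORPUS)
print({word: most_likely(model, word) for word in ["das", "haus", "buch", "ein"]})
```

No grammar or dictionary is supplied here: the pairing of "das" with "the" emerges only because the two words keep appearing in the same sentence pairs.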
Neural Machine Translation (NMT) is a more recent approach in which machines “learn” to translate using an artificial neural network. It has become increasingly popular amongst MT researchers and developers, as trained NMT systems have begun to outperform other approaches in many language pairs.
The first set of proposals for computer-based machine translation was presented in 1949 by Warren Weaver and sparked a wave of research at universities across the United States. On 7 January 1954, the first public demonstration of a machine translation system took place. The demonstration attracted widespread press and public attention. Although the system was not very advanced, it established the idea that machine translation was a real and imminent possibility and encouraged funding for research, not only in the US but worldwide.
At the end of the 1950s, Yehoshua Bar-Hillel was asked by the US government to assess the feasibility of fully automatic high-quality machine translation. In his assessment, Bar-Hillel drew attention to the problem of semantic ambiguity, i.e., words with more than one meaning. To illustrate his concern, he pointed to the word “pen” in the sentences “The pen is in the box” and “The box is in the pen”. In the first, “pen” is a writing instrument; in the second, it is an enclosure, and only knowledge of the relative sizes of the objects tells the reader which is meant.
Since then, the technology behind machine translation has advanced massively, making automated translation widely available to translators, students, tourists, and others around the globe.
Nevertheless, despite this progress, the semantic ambiguity that Bar-Hillel identified remains an issue in machine translation today, and it is a major reason why machine translation is still not widely accepted as an adequate replacement for human translators.
Other factors in the human versus machine translation debate include computers’ inability to make the creative decisions needed to preserve the tone and nuance of an original text. That said, such technology may arrive in the foreseeable future. As Andrew Ochoa, chief executive of Waverly Labs, says: “When it comes to expressing emotion and intonation, we need sentiment analysis, which is not there yet but may well be in ten years’ time.”