What's your opinion on machine translation and quality?
Thread poster: Daniela Zambrini
Phil Hand  Identity Verified
China
Local time: 01:58
Chinese to English
How MT can get things right May 26, 2014

Tom in London wrote:

I suppose MT may be useful - but not for anything important, because the mistakes it makes are very serious.

Tom's absolutely right at the moment. MT makes ridiculous mistakes all the time. I think there's a very straightforward reason for this: a computer is just messing with text; it has no concept of meaning or consequences.

This could change. It'll happen gradually, incrementally, but it could. For example, those Google cars are now learning to drive. When a car has to drive a route, it will start to understand that there is a difference between "left" and "right". Substituting one for the other isn't an error, it's a catastrophe.

This is how computers will learn: by integrating themselves into parts of our lives where there are consequences to getting natural language wrong. It'll take time, but I have faith! (At the moment, faith is all it is...)


 
Michael Wetzel  Identity Verified
Germany
Local time: 18:58
German to English
pre-edited MT May 26, 2014

Like Christine, I think that pre-edited, tailor-made MT systems based on controlled language and serious investments have a substantial future. These would consist of more or less CAT-type TMs and glossaries, combined with rules and training for copywriters to ensure that their texts correspond to these materials, and translators/terminologists on staff to expand and revise the system as needed.

I consider MT à la Google Translate and post-editing a fairy tale, just like the automated translation that they were sure that they would have in place by the end of the 1950s or 1960s based on a different flawed concept. Never say never, but I certainly don't think that this kind of MT is on anything even vaguely resembling the right path at the present time and I have no idea what the right path would be.


 
DLyons  Identity Verified
Ireland
Local time: 17:58
Spanish to English
+ ...
MT may be moving faster than people think. May 27, 2014

Yes, it can give exactly the opposite to what is meant.

OTOH, for example, take the following well-known sentence and put it through random languages in GT and it comes out unchanged. No "Chinese Whispers" - I don't see even professional human translators matching that in a properly controlled experiment.

Time flies like an arrow.
El tiempo vuela como una flecha.
الوقت الذباب مثل السهم.
ਟਾਈਮ ਇਕ ਤੀਰ ਵਰਗਾ ਤੇ ਉੱਡਦੀ ਹੈ.
Le temps file comme une flèche.
Время летит, как стрела.
Isikhathi sihamba like umcibisholo.
時間過得真快似箭。
Amser yn hedfan fel saeth.
Time flies like an arrow.


 
John Fossey  Identity Verified
Canada
Local time: 13:58
Member (2008)
French to English
+ ...
Hitting a plateau? May 27, 2014

There's another self-defeating problem cropping up in MT. The whole basis of MT is a corpus of previously translated works that provide a statistical basis for the translation. The corpus shows that when a certain phrase or sequence of words appears in the source text, there is a statistical probability of another certain phrase or sequence of words appearing in the target.

But what's happening over time is that the corpus of translated texts is including texts which were themselves machine translated! This means that the mediocrity of MT is being self-reinforced - preventing the progress the MT developers are looking for. I suspect that as this happens whatever value MT had is plateauing, and in fact it may have already peaked. I think we have already seen GT in particular hit a plateau - I don't think there has been any meaningful improvement in its renderings for some time now.
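
As a rough illustration of the "statistical basis" described above, here is a minimal Python sketch that estimates how likely a target phrase is, given a source phrase, by simple relative frequency over aligned phrase pairs. The function name `phrase_probabilities` and the four-pair toy corpus are invented for this example; real systems use far larger corpora and more elaborate models, so treat this as a sketch of the idea rather than how any particular engine works.

```python
from collections import Counter, defaultdict


def phrase_probabilities(aligned_pairs):
    """Estimate P(target | source) by relative frequency.

    aligned_pairs is a list of (source_phrase, target_phrase) tuples,
    assumed here to have been extracted from a translated corpus already.
    """
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(src for src, _ in aligned_pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return dict(table)


# Hypothetical toy corpus: three renderings of one phrase, one of another.
corpus = [
    ("time flies", "le temps file"),
    ("time flies", "le temps file"),
    ("time flies", "le temps passe vite"),
    ("like an arrow", "comme une flèche"),
]

table = phrase_probabilities(corpus)
# Pick the most frequent rendering seen for "time flies".
best = max(table["time flies"], key=table["time flies"].get)
```

If machine-translated pairs were fed back into `corpus`, whatever rendering the engine already preferred would gain counts, which is the self-reinforcement effect the post is worried about.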


 
Russell Jones  Identity Verified
United Kingdom
Local time: 17:58
Italian to English
Quick Polls May 27, 2014

There have been a few Quick Polls on the site, asking about the use of MT.

These suggest that nearly 40% of professional translators are making some use of MT.

I suspect that, while they are happy to contribute to an anonymous poll, they don't "come out" in the forums because they know they will be pelted with abuse from people with prejudices on the subject.

My view: It's lethal in the hands of amateurs but can be a very valuable tool for expert linguists who are well versed in revising skills.

My prize-winning ProZ.com translation contest entry a few years ago started with MT (now I'm sheltering from the storm!). Needless to say, the end result was unrecognisable from the MT version, but - like Neilmac - I find it extremely useful as part of the overall process.


 
Michael Beijer  Identity Verified
United Kingdom
Local time: 17:58
Member (2009)
Dutch to English
+ ...
I love MT! May 27, 2014

Daniela Zambrini wrote:

Thank you all for your interesting points of view. This seems to be a hot topic!

There don't seem to be any posts openly in favour of MT yet.

I look forward to further debate

D.


Hi Daniela,

Well, let me go against the trend then and say that I love Google Translate (GT) and Microsoft Translator (MT). I do all kinds of different subjects, but mainly business, legal and certain kinds of technical. My pair/direction is exclusively Dutch into English and in my subject areas at least, GT and MT are often amazingly accurate. Of course, they do produce a lot of garbage, but at the end of the day they also produce a lot of very usable material. And I can assure you, Jeff, that I am not ‘back translating’, or ‘missing any important details’.

Perhaps it’s just me, but I think that the majority of the most vocal I-hate-MT crowd are not very computer savvy. Only one person even mentioned the word ‘assembly’ in this thread.

I access GT and MT via plugins in CafeTran (my CAT tool), which recently finally got something to compete with DVX2/3's special feature called ‘Deep Miner’. Deep Miner and this new feature in CafeTran (it doesn’t have a name yet but can be accessed via a switch in CT's settings called ‘Improve Auto-Assembling with machine translation’ in the latest version of CT) use machine translated output (from online sources such as GT and MT) to improve the user’s own TM data. Sometimes it works, sometimes it doesn’t, but various people are working on improving it.

Let me just say this to the naysayers: you'd be surprised at how useful MT can be. You just need to get over your reactionary and fearful attitude and come and experiment a little...

Michael


 
Tom in London
United Kingdom
Local time: 17:58
Member (2008)
Italian to English
Reputations May 27, 2014

Russell Jones wrote:

......
I suspect that, while they are happy to contribute to an anonymous poll, they don't "come out" in the forums .....



Or it may be that since we know that outsourcers visit these forums, translators who use MT don't want to be "outed".


 
Jeff Whittaker  Identity Verified
United States
Local time: 13:58
Member (2002)
Spanish to English
+ ...
Misleading May 27, 2014

Your example is misleading because that phrase has been entered in each language verbatim as a set phrase.

The same thing will happen if you search for sentences that are part of classic novels that are available in translation on-line. The computer is just finding the matching human translation.

Your sentence does not work quite as well when you throw in something that Google doesn't have an exact human translation match for.

Time flies like an arrow. Fruit flies like a banana.
Spanish: El tiempo vuela como una flecha. Moscas de la fruta como un plátano. (flies of the fruit as a plantain)


DLyons wrote:

Yes, it can give exactly the opposite to what is meant.

OTOH, for example, take the following well-known sentence and put it through random languages in GT and it comes out unchanged. No "Chinese Whispers" - I don't see even professional human translators matching that in a properly controlled experiment.

Time flies like an arrow.
El tiempo vuela como una flecha.
الوقت الذباب مثل السهم.
ਟਾਈਮ ਇਕ ਤੀਰ ਵਰਗਾ ਤੇ ਉੱਡਦੀ ਹੈ.
Le temps file comme une flèche.
Время летит, как стрела.
Isikhathi sihamba like umcibisholo.
時間過得真快似箭。
Amser yn hedfan fel saeth.
Time flies like an arrow.


[Edited at 2014-05-27 17:35 GMT]


 
Russell Jones  Identity Verified
United Kingdom
Local time: 17:58
Italian to English
Understandable May 27, 2014

Tom in London wrote:

.. it may be that since we know that outsourcers visit these forums, translators who use MT don't want to be "outed".


Understandable - but that just reinforces the myth that it is something to be ashamed of.

We should be talking to our clients as authorities on the subject, backed up by real expertise; otherwise they will find suppliers who are.


 
DLyons  Identity Verified
Ireland
Local time: 17:58
Spanish to English
+ ...
Not entirely May 27, 2014

Jeff Whittaker wrote:

Your example is misleading because that phrase has been entered in each language verbatim as a set phrase.

The same thing will happen if you search for sentences that are part of classic novels that are available in translation on-line. The computer is just finding the matching human translation.

Your sentence does not work quite as well, when you throw something in that Google doesn't have an exact human translation match for.


I haven't checked, but I'd be surprised if it were the case for all those languages. GT has some sort of internal representation that is very good at avoiding the "multiple photocopy effect".

I actually did a more complex exercise with a verse from Merriman's "Cúirt an Meán-Oíche" (right-hand column is a human translation). The internals occasionally peep out: "Indifference", "Riverwalk", "Heavy", "lightweight", "ignore". For me, the end result is amazingly good.

Oddly enough, the biggest deficiency is one I often see just in EnFr - it gets the sense, not mildly wrong, but diametrically opposed. Here this is the "not" in "I do not have the Riverwalk" (first picked up in the ChFi step). And even in the much simpler EnFr, it is unable to handle "pas" = "not" adequately.


Ba ghnáth mé ag siúl le ciumhais na h-abhann, | It was my custom walking beside the river,
Ar bháinseach úr 'san drúcht go trom, | On the fresh green grass in heavy dew,
In aice na gcoillte i gcoim an tsléibhe, | Beside the woods in a recess of the mountain,
Gan mhairg, gan mhoill le soilse an lae. | Without worry, without delay at the [first] light of day.

Normally I would walk to the edge of the river,
Fresh sward 'in heavy dew,
Near the woods in the middle of the mountain,
Indifferently, without delay with the day lighting.

通常我會步行到河邊,
新鮮的草皮'在重露,
在附近的山中間的樹林,
漠然,立即與採光。

ਮੈਨੂੰ ਆਮ ਤੌਰ 'ਤੇ ਨਦੀ ਨੂੰ ਤੁਰ,
ਭਾਰੀ ਤ੍ਰੇਲ ਵਿੱਚ ਤਾਜ਼ਾ ਮੈਦਾਨ ',
ਪਹਾੜ ਦੇ ਨੇੜੇ ਜੰਗਲ ਦੀ ਮੱਧ ਵਿੱਚ,
Indifference, ਤੁਰੰਤ ਅਤੇ ਰੋਸ਼ਨੀ.

En yleensä River Walk,
Heavy kaste tuore maahan,
Lähellä vuori keskellä metsää,
Välinpitämättömyys, nopea ja kevyt.

Emi ko maa n Riverwalk,
Heavy ìri titun ilẹ,
Sunmọ òke ni arin awọn igbo,
Ainaani, sare ati lightweight.

Mən yoxdur adətən Riverwalk,
New Ağır şeh,
Meşə ortasında bir təpə yaxınlığında,
Sürətli və yüngül, ignore.

Agam nach bhfuil sé de Riverwalk,
Drúcht Trom Nua,
Tá cnoc in aice le lár na coillte,
Fast agus lightweight, neamhshuim a dhéanamh de.

I do not have the Riverwalk,
New Heavy dew,
There is a hill near the middle of the woods,
Fast and lightweight, to ignore.


 
Christine Andersen  Identity Verified
Denmark
Local time: 18:58
Member (2003)
Danish to English
+ ...
I know enough not to trust it May 27, 2014

I was recently asked (pro bono) to translate a set of about 50 artists' statements for an exhibition.

A few had to be translated from Danish into English, and the rest were written in English of sorts - everything from excellent native through fluent to pure MT. My husband and I set to work to translate these into Danish.

They were short texts describing art work and philosophy, so even with some of the fluent descriptions, we had to search the Net for pictures of the works and more background on the artists. So far so good.

The really challenging ones were obviously output from GoogleTranslate - and one of the most difficult was a Dane who had tried to set up the whole of her website in English, so we had no Danish source to help us! Finally we tried to back-translate from English to Danish with GoogleTranslate, but there had been enough human intervention to muddle that up too. We think we guessed right in the end, but were not entirely sure!

My husband struggled a whole day with another story peppered with terms where GT had given up... But luckily most of the artists had managed to write something that made sense in English, at least when we had seen pictures of their work.

GT simply cannot cope with a lot of the creative-philosophical language we were dealing with there.

On the other hand, I have seen quite impressive results with controlled language and a dedicated search engine.

The secret is the input.
It is far easier to train humans to adapt to the limitations of computers and pre-edit, using controlled terminology, than it is to program computers to cope with the creativity and variety of the human brain.


 
Tom in London
United Kingdom
Local time: 17:58
Member (2008)
Italian to English
Warning May 28, 2014

Russell Jones wrote:

We should be talking to our clients as authorities on the subject, backed up by real expertise; otherwise they will find suppliers who are.


Recently, one of the agencies for which I work issued a very severe general warning to all its suppliers that MT must not be used and that any evidence of its having been used would result in that supplier being removed from their books.

This is a high-quality agency that serves important clients and is expert at spotting the use of MT.

Even after "correction", I think the language is always stilted and strange unless you rewrite the whole thing, which negates any usefulness that MT might have.

[Edited at 2014-05-28 07:45 GMT]


 
Giles Watson  Identity Verified
Italy
Local time: 18:58
Italian to English
In memoriam
MT as TM May 28, 2014

Last year, Jean-François Richard and his team at Terminotix very kindly let me run an IT-EN translation memory (100,000+ segments) past their Portage machine translation engine. Let me say straight away that it wasn’t about money. The (my) aim was to find out whether Portage could produce a draft translation of sectoral texts for post-editing that would save time over a human translation-plus-revision workflow.

The results were not particularly encouraging in this case. I asked some colleagues to look over Portage-generated translations - and even back-translations of texts already in the TM - but few of them thought the versions would be of much help.

When you are revising human translations, you can “trust” the original translator to get certain things right and others wrong. Before long, you know what to look for. MT mistakes, however, are random, which makes reviewing the draft translation considerably more laborious. Like the dictionary-digging school of dodgy translators, MT programs translate stochastically with unpredictable results. Neither engages in conceptual thought.

None of this is any reflection on Portage, of course. The TM Jean-François and his colleagues let me play around with is tiny by MT standards. It contains a wealth of subject-specific linguistic variables that have to be learned by human translators, let alone machines with no other material to go on. And my back-of-an-envelope methods were anything but scientific.

One thing I did gain from this exercise, though, was a cleaner, leaner TM. By treating machine translation databases as corpora, Michael W, Michael B and neilmac are probably getting as much out of MT as is currently possible for a freelance translator.



[Edited at 2014-05-28 18:10 GMT]


 
Patrick Porter
United States
Local time: 13:58
Spanish to English
+ ...
experiments creating my own MT engines May 29, 2014

Giles Watson wrote:

Last year, Jean-François Richard and his team at Terminotix very kindly let me run an IT-EN translation memory (100,000+ segments) past their Portage machine translation engine. Let me say straight away that it wasn’t about money. The (my) aim was to find out whether Portage could produce a draft translation of sectoral texts for post-editing that would save time over a human translation-plus-revision workflow.

The results were not particularly encouraging in this case....


I've also worked with JFR and Terminotix, developing a custom Trados plugin for the Portage engine. So I'm familiar with what they do, although not with the exact kind of system they use behind their API.

In any event, this project and several similar ones got me interested in experimenting with statistical MT on my own. I've actually had some fairly good results using the Moses open source application/project to create some simple phrase-based MT engines using my own TMs.

In one case, the training data consisted of a TM as small as 50,000 segments, representing translations done over a couple of years for one company in the IT field. The company produces an integrated software development suite that uses its own high-level programming language. My custom MT works particularly well on the reference documentation for that programming language (written in a similar style to MSDN or the Java specification), since it involves short phrases and generally does not even require complete sentences. The results are terrible for other documents translated for the same company, like marketing texts and blog posts, even though the TMs used to train the engine have lots of these types of segments in them.

I have also had similar results with translations for other clients that don't involve complex prose, using much larger TMs (300-500k segments). In all of these examples, none of the respective engines works well on translations for other clients, or on different types of texts for the same clients. I'm not sure whether increasing the size of the corpus or varying the training data would make these engines much better at complex texts. I suspect not. But it does seem that in very specific use cases these types of self-created MT engines can be helpful to a qualified human translator.

In cases where the training data consists of mostly my own translations approved by the client, the output of these MT engines ends up prompting me to stay consistent with style and terminology. It's somewhat like going beyond a simple TM lookup and doing a sort of instant recursive concordance search for all the words/phrases in the source segment.

However, I definitely would not call this "post-editing" and really do not believe that phrase to be an accurate description of an experienced human translator using an MT engine as an aid or prompt. It's still translation to me. For the reasons that others have mentioned, it involves the same level of analysis/critical-thinking, linguistic knowledge, and subject-matter expertise.
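
The "instant concordance search" idea mentioned above can be sketched roughly as follows: split the source segment into word n-grams and look each one up in the TM. This is only an illustrative Python sketch, not how Moses or any CAT tool does it; the names `ngrams` and `concordance` and the three-entry TM are made up for the example, and tokenisation is a plain whitespace split that ignores punctuation.

```python
def ngrams(tokens, max_n):
    """Yield every word n-gram in tokens, longest first."""
    for n in range(min(max_n, len(tokens)), 0, -1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def concordance(segment, tm, max_n=3):
    """Map each n-gram of `segment` to the TM pairs whose source contains it."""
    hits = {}
    for phrase in ngrams(segment.lower().split(), max_n):
        matches = [
            (src, tgt) for src, tgt in tm
            # Pad with spaces so "the" does not match inside "other".
            if f" {phrase} " in f" {' '.join(src.lower().split())} "
        ]
        if matches:
            hits[phrase] = matches
    return hits


# Hypothetical TM of (source, target) segment pairs.
tm = [
    ("Declare the variable", "Declarar la variable"),
    ("Returns the value of the variable", "Devuelve el valor de la variable"),
    ("Compile the project", "Compilar el proyecto"),
]

hits = concordance("Initialize the variable", tm, max_n=2)
```

Here `hits` groups earlier translations by the phrase they share with the new segment, which is the kind of prompt for consistent style and terminology the post describes.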


 
Tom in London
United Kingdom
Local time: 17:58
Member (2008)
Italian to English
This thing will never work May 29, 2014

Apart from the inane smiles and the repulsive, glowing up-beatness of the participants in the video, it's obvious they had rehearsed their (extremely dull) conversation in advance and that they were both reading their responses from a script.

Microsoft’s ‘Star Trek’ voice translator:

http://gu.com/p/3ptnk


 