This article was originally posted on WomenLearnThai.com.
Google Translate takes Babel Fish one step further…
In a previous post – Google Translates Documents and Email Too – we got into a discussion about the possible crowdsourcing aspects built into Google Translate.
Even ten years after Babelfish was first online the quality of these free online translators hasn’t improved much. Even today they can only be used to get an idea of what the text is about, but nothing more. And even today there are still so many using these tools blindly, believing the output is anything reasonable in the target language.
From Jeff Gray:
One of the things about Google Translate that makes it different to the others is that it allows users to correct mistranslations. When you look at translated text, if you point the mouse at a sentence, a window pops up that shows the original text and offers you the ability to translate it better. This feeds into the translation engine for similar phrases in the future. So the more it is used and corrected, the better it gets. This is a brilliant way of leveraging the language skills of vast numbers of users.
To which I replied:
But who is responsible for correcting those sending in their fixes to the mistranslations? I find crazy stuff for Thai-English all the time so there is that.
Jeff Gray came back with:
Catherine, you asked what happens if someone suggests a poor or mischievous translation. I don’t know how they handle it; there is no mention of the process online.
It might be set up to work the way that Wikipedia works. While there are cases of accidental or deliberate errors, the sheer volume of people adding useful stuff to Wikipedia makes it immensely useful. Wikipedia is also inherently self correcting, because if someone writes rubbish, it will be corrected by others.
In the same way, having millions of users making minor improvements to the translation system does something that any single company could never do with internal resources only. It might be chaotic, but the sheer scale is unbeatable.
Time will tell, but the approach they’ve taken is potentially a very effective one.
Today is Friday, the 17th of July, 2009.
To check out this theory, I’ll gather a collection of Thai sentences ranging from dead simple to difficult, then run them through Google Translate. I’ll do the same with their English counterparts.
On or around the 17th of July, 2010, I will go back to Google Translate to see if anything has changed for the better.
See you then…
9 thoughts on “Thai Google Translate: Will Crowdsourcing Work?”
Maybe Google Translate will have to look closer at what Wikipedia is up to: Wikipedia to Color Code Untrustworthy Text
I opened the spam email from Latvia and Google Translate detected Polish, giving me an option to check the Polish – English translation or choose more from a drop down.
I was curious, so I googled to discover that there is indeed a large Polish community in Latvia.
So Google Translate did double duty: I received the translation, and I also received a mini history lesson.
‘it can probably get a lot of small sentence scale translation reasonably well done’
This week I’ll post a range of Thai sentences translated using Google Translate (these are the very same sentences I’ll check on in a year). Some came out ok, but others need the Thai touch.
I didn’t think about using Engrish.com to understand the errors, but it is a very good idea. On the course I took with Stuart Jay Raj (Cracking Thai Fundamentals), one class focused on why Thais pronounce English the way they do. We think of them as errors but for the Thais it is the obvious way to pronounce letter combos.
What we learned from Stu was indeed useful for me to know, as I then had a better shot at pronouncing English place names the Thai way. It is especially needed when giving directions to taxi drivers!
You can say Carrefour all you want the English or even the French way, but if you don’t say it the Thai way, you’ll never get to do that bit of needed shopping.
“I can now read my spam messages from Latvia.”
See, I knew there was a great social benefit! 🙂
I don’t know enough Thai to know what you mean by inference based. But I guess with any language knowing context is all important.
At its best Translate can recognise patterns of words in sentences. So it can probably get a lot of small sentence scale translation reasonably well done. But it will not cope with long, complex sentences or paragraphs at all well, since it doesn’t really understand as a human does how things relate.
We all have a good laugh at the literal mistranslations by machines, such as “out of sight, out of mind” being translated as “invisible idiot”. This is the sort of thing I think Google Translate will become better at handling, because it’s a fixed phrase, an easy pattern to spot when it’s used repeatedly.
It’s when translation requires more than statistical pattern matching that it will fail. Or the people who do the stuff on Engrish.com for that matter…
Even though it is for fun, Engrish.com is interesting to look at, and it gets you thinking about the causes of the errors we see there. Learning languages is hard work, but fascinating.
Google Translate works great in Gmail. I can now read my spam messages from Latvia.
Google Translate has a gadget for websites that goes especially well with widgets: Google Translate Tools
I’ve added the gadget to WLT to see how it works (you can see it top right). Oddly enough, it does not have Thai.
WLT’s stats say that a number of people are reading WLT using translators, so this might help (from western to western anyway).
If you read further down in that Google Page, you’ll read that you can also drag a translator link of choice to your browser’s toolbar to operate from there.
Hi Kirti, no I haven’t had the chance to look at asiaonline.com as I’ve been a wee bit snowed under lately. I’d be interested in seeing your before and afters.
Have you had a look at the Thai on http://www.asiaonline.com website?
We are attempting to translate Wikipedia using SMT (statistical machine translation) + professional post-editing + crowdsourcing.
I would be curious to hear your opinion – I would be happy to send you some samples to show you how it improves.
Hi Jeff, Fantastic on finding the forum for Google Translate (I’ll go have a lurk). Thanks!
Like you, I don’t feel that translators have anything much to worry about just yet.
But translators do have to worry about educating their clients on the benefits of a professional job.
With all of the crowdsourcing design contest sites around, it is happening in the design industry too. And with equally horrible results. But some clients don’t have a clue about what makes good design, so designers also have the additional job of educating their clients. At no extra charge.
But with the way Thai is set up, with inferences being a part of the understanding, I don’t see how Thai translators will be out of a job any time soon.
I’ve been wrong before though… if a computer can write a novel, then it may (eventually) be able to connect the dots in Thai.
Engrish.com – I’m going to be laughing about this one all day…
‘Warning: Please do not take the product playing with people who have a weak heart and seem to be out of their minds.’
Again, thank you for your advice on this subject. Google Translate will be an interesting one to watch during the next year or so.
Great idea for a test. Be very interesting to see how it progresses.
I’ve found a bit more about how Google Translate works. It is a statistical system, not rules based. So to work well, it needs a large volume of parallel professional translations. (Human translators aren’t being done out of a job just yet…)
Forum for Google Translate is here. There’s no post from within Google describing in detail what happens with user contributions that I could see.
This would mean that an individual posting a poor translation doesn’t hurt. The statistics would choose the most common translation for a given sentence in preference to an individual’s unique one.
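Here is a rough sketch of that idea in Python: count the suggestions for each source sentence and keep the most common one. It only illustrates the guess above. Google hasn't described how it handles user contributions, and the names `suggestions`, `suggest` and `best_translation` are made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical store: source sentence -> tally of suggested translations.
suggestions = defaultdict(Counter)


def suggest(source, translation):
    """Record one user's suggested translation for a source sentence."""
    suggestions[source][translation.strip()] += 1


def best_translation(source):
    """Return the most frequently suggested translation, or None if none exist."""
    counts = suggestions.get(source)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Three users agree on a translation; one submits a mischievous alternative.
for _ in range(3):
    suggest("สวัสดี", "hello")
suggest("สวัสดี", "goodbye forever")

print(best_translation("สวัสดี"))  # prints "hello"
```

With a simple count like this, a single odd suggestion is outvoted as soon as a few people agree on something better. A real system would presumably weigh far more than raw counts.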
If we have the choice, human translation will be preferable for a long time yet. But at least having machine translation gives us something where we would have had nothing previously. It might be confusing, misleading & hilarious, but at least it’s something.
I’ve seen plenty of Chinese manufacturers’ goods with poor English manuals done by humans. Google Translate is better than some of these 😉