I am not qualified enough to talk like Ragib Hasan (Wiki Administrator in Bangla), who is a proud son of our nation. But at least I can echo his opinion. Natural language processing has not yet excelled enough to be called natural. At least in a general sense, we can conclude that.
But of course, we should appreciate the research being done by Google. From the mistakes we will be able to see how to overcome them.
I have a suggestion about localizing Google's services. So far they are being translated in the background by Google's employees, I guess. But at Facebook we can see a community-assisted approach to translating/localizing the strings in real time. Could you use the same idea Facebook did? Or even go beyond that and involve us in reviewing your translations?
As a Computer Engineering graduate I also think that the quality of localization, and also of the outputs generated by GTT, can be improved through participation of the community in reviewing them, since natural language processing learns from more and more training.
Bangla, being the language of about 300 million people all over the world, is probably 5th in the world by number of speakers. And it's an Indic language too. We found that Google's Bangla localization for Blogger, Gmail and many other services really lacks the quality to be used in the mainstream. If only we could be involved in reviewing the process. Or, as appreciation, we could contribute to improving its quality.
I have personally participated in the Facebook localization in Bangla and currently hold the top translator position in the bn_IN locale.
I would request Google to involve more direct participation and review from the people who will use the services. I know Google can do this just like Facebook did.
Thanks a lot.
I am a regular contributor to Hindi Wikipedia. It is well known that machine translation has not yet reached a level where it resembles a natural one. But everybody will agree that it has great potential.
I have studied Google-translated Hindi wiki articles. They can be said to be more than 50% 'natural' to comprehend.
I have also been regularly using translations of technical articles from the German, Polish, French and Italian wikis. I must say that I almost always understand the information from the translations.
I thank Google for their efforts in this direction.
It's funny that Google's translation of Anunad Singh's profile would render his name as Resonance Singh, while leaving other names on the page unmolested. It is certainly lacking some rudimentary semantics.
This graph is either very misleading or just wrong... It seems to imply that German has 2.8 times as many articles as English on Wikipedia, and that Japanese, Russian and French all have more than English. The Wikipedia homepage shows that this is untrue... Am I missing something here? What does this graph mean?
This initiative by Google has immense potential, given the importance of native language among various communities across the world. Of course there are issues related to quality that need to be taken care of, and the solution is active participation by those who care about the language. It is not surprising that there are so many skeptics, given that it is a Google initiative. If governments and student communities can partner, the reach could be much higher. After all, neither Wiki nor Google is the sole owner of cyberspace.
The number of non-stub Wikipedia articles is divided by the number of internet users who speak the given language.
This means that there are more non-stub articles in German per German internet user than there are English non-stub articles per English-speaking user. Since there are so many more English speakers than German speakers, the measurement is 2.8.
The measure is a good way to compare the amount of possible participation of the speakers of a given language.
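To make the arithmetic concrete, here is a minimal sketch of the per-user measure described above. The function names and all the numbers are made up for illustration (they are not real Wikipedia or internet-usage statistics), and it assumes the graph scales every language against English = 1, which nobody in this thread has confirmed.

```python
def articles_per_user(non_stub_articles, internet_users):
    """Non-stub articles divided by internet users of that language."""
    return non_stub_articles / internet_users


def relative_to_baseline(value, baseline):
    """Express one language's figure as a multiple of a baseline language's."""
    return value / baseline


# Hypothetical placeholder counts, chosen only so the ratio comes out at 2.8.
english = articles_per_user(non_stub_articles=1000, internet_users=500)
german = articles_per_user(non_stub_articles=560, internet_users=100)

ratio = relative_to_baseline(german, english)
print(round(ratio, 1))
```

With these placeholder counts English has more articles in total, yet German scores higher per user, which is the distinction the graph's wording obscured.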
Buzina,
Thanks for the explanation. If they had used the word "per" instead of "by" it would have been clear.
Mani,
There is a reason why Wikipedia has a minimum quality at least: the community, principles, systems and procedures followed with the right spirit. No combination of corporations, governments and others can achieve this without adhering to the same spirit.
What A.Ravishankar says, i.e. "There is a reason why Wikipedia has a minimum quality at least: the community, principles, systems and procedures followed with the right spirit.", is a big joke.
Tamil Wikipedia is under the control of a cabal, of whom A.Ravishankar is one, which imposes a linguistic ideology of "Pure Tamil". In pursuit of this ideology, this cabal has systematically abused every principle of Wikipedia. This is a serious matter. For example, the Tamil Wikipedia cabal has even abused the ordinarily accepted Standard Tamil keyboard, saying some letters are not to be preferred.
Be that as it may, the ire of A.Ravishankar and the other Tamil Wiki cabal members against the Tamil Google translators is that they don't adhere to the "Pure Tamil" ideology. Hence their diatribe against Google translations. The Tamil Google translators are doing an excellent job. Just because they don't adhere to the prejudices of this cabal, A.Ravishankar and Co. are blaming them.
"The community, principles, systems and procedures followed with the right spirit" which A.Ravishankar talks about is this Pure Tamil ideology's community and principles. I hope the Google translators reject the views of A.Ravishankar with the contempt they deserve. The translators are doing excellent work.
Vijayaraghavan
Hi, I noticed that translations from English to Hebrew got much better lately. I would like to request a feature: to add glossary terms from within the working window, and have the translation updated with the new information. This should work for multiple-word phrases as well as single words. Multiple-word phrases should take precedence, and terms should be replaced according to the longest matching phrase. (matching a puzzle)
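For what it's worth, the longest-phrase rule requested above could look roughly like the sketch below. This is only an illustration under my own assumptions: `apply_glossary` is a hypothetical helper, not anything in Google Translator Toolkit, and the English terms are placeholders standing in for real glossary entries.

```python
import re


def apply_glossary(text, glossary):
    """Replace glossary terms in text, preferring the longest matching phrase."""
    if not glossary:
        return text
    # Sort longest-first so the alternation tries multi-word phrases
    # before the shorter terms they contain.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): target for term, target in glossary.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


# Placeholder glossary: a phrase and a single word that overlaps with it.
glossary = {"machine": "MACHINE", "machine translation": "MT"}
result = apply_glossary("machine translation beats a machine", glossary)
print(result)
```

The phrase "machine translation" is consumed as a whole before the single word "machine" gets a chance to match inside it, which is the precedence the request asks for. A real tool would also have to cope with inflected forms and varying whitespace, which this sketch ignores.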
Vijayaraghavan,
Your empty assertions notwithstanding, we've been very transparent and have full support for all decisions taken with regard to the Google translation project. Besides, we have been extremely patient and have been collaborating with their team to try to make it a win-win for both sides. Without an inkling of the numerous conversations that we have had with the Google team in total good faith, don't try to settle your own personal scores here.
- Sundar
I am new to Tamil Wikipedia but an experienced editor on English Wikipedia. I took time to review a Tamil wiki article translated using the toolkit. The translation quality is so bad that in many places the meaning conveyed is the exact opposite of the English article.
Take for example http://en.wikipedia.org/wiki/Tinto_Brass
There is a blooper in the first line of the translation itself:
"Giovanni Brass... better known as Tinto Brass" has been translated into Tamil as "Tinto Brass... better known as Giovanni Brass".
Further into the article, "post-production" has been translated as "pre-production".
These are only two of the obvious errors. I have listed others on the Tamil wiki article's discussion page. And this is one of the articles that were supposedly "corrected" by Google's hired translators.
This is not a one-off example. Most of the Google-translated articles have similar problems. Reading them makes the following obvious:
1) Google translation toolkit for Tamil is an inferior product and is in no shape for even a beta release
2) Google's quality control is poor (or non-existent). It outsources the translation work to companies like Desi-Crew and does not care what the translators actually produce.
As I go through ta.wiki's archives, it is apparent that all these problems have been pointed out to Google repeatedly. And the promises to "correct mistakes" still produce crappy output like the Tinto Brass article. I cannot help but admire the patience of the Tamil Wikipedians who work with Google. If Google (or someone else) tried this on En.Wiki, they would have been templated and banned and the articles stubbed without hesitation.
Is there a way to show multiple possible translations? See www.slovnik.cz: for each word, about 5-10 translations are offered. This is probably the only missing feature in Google Translate. Thanks :)
Good idea. I'd been thinking that Google translation and Wiki should be important to each other.
This should evolve into automated translation of all Wiki entries so that each entry is substantively similar in all languages.
Presently the English version of an article will be quite different from the Chinese or Spanish, or may be absent entirely. It isn't reasonable to depend upon volunteers to make the effort to duplicate the work in other languages; this needs to be done automatically.
The best evolution is that the original material be kept in a universal key that can be read and written in any language preferred by the viewer.
What about Spanish? I found 0 mentions of Spanish in this article... Looks like Google doesn't like Spain at all :(
Here, a sad Spanish fan of Google.
A Review on Google Translation project in Tamil Wikipedia
The graph says "number of *non-stub* Wikipedia articles *by internet users*". Not quite sure what the last bit means.
Please also see
What happened on the Google Challenge @ the Swahili Wikipedia
The Tamil translation of the Tinto Brass article discussed above is at
http://ta.wikipedia.org/wiki/%E0%AE%9F%E0%AE%BF%E0%AE%A9%E0%AF%8D%E0%AE%9F%E0%AF%8B_%E0%AE%AA%E0%AE%BF%E0%AE%B0%E0%AE%BE%E0%AE%B8%E0%AF%8D
Why don't you mention Spanish as one of the hardcore languages? I think that our community has thousands of people.
I like it)))) Google RuLeZ!!!