Google Translate Blog
The official source for news on Google's translation technologies
Translating Wikipedia
Wednesday, July 14, 2010
We believe that translation is key to our mission of making information useful to everyone. For example, Wikipedia is a phenomenal source of knowledge, especially for speakers of common languages such as English, German and French, where hundreds of thousands, or even millions, of articles are available. For many smaller languages, however, Wikipedia doesn’t yet have anywhere near the same amount of content.
To make Wikipedia more helpful to speakers of smaller languages, we’re working with volunteers, translators and Wikipedians across India, the Middle East and Africa to translate more than 16 million words for Wikipedia into Arabic, Gujarati, Hindi, Kannada, Swahili, Tamil and Telugu. We began these efforts in 2008 by translating Wikipedia articles into Hindi, a language spoken by tens of millions of Internet users. At that time the Hindi Wikipedia had only 3.4 million words across 21,000 articles, while the English Wikipedia had 1.3 billion words across 2.5 million articles.
We selected the Wikipedia articles using several criteria. First, we used Google search data to determine the most popular English Wikipedia articles read in India. Next, using Google Trends, we found the articles that were consistently read over time, not just temporarily popular. Finally, we used Translator Toolkit to translate articles that either did not exist in Hindi Wikipedia or were only placeholder articles, or “stubs”, there. In three months, we used a combination of human and machine translation tools to translate 600,000 words from more than 100 articles in English Wikipedia, growing Hindi Wikipedia by almost 20 percent. We’ve since repeated this process for other languages, bringing our total number of words translated to 16 million.
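The selection criteria described above can be sketched in a few lines of Python. This is an illustration only, not the actual pipeline: the `Article` fields, the `min_mean` and `max_cv` thresholds, the use of a coefficient of variation to stand in for "consistently read", and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class Article:
    title: str
    weekly_views: List[int]  # hypothetical readership samples for one region
    target_status: str       # "missing", "stub", or "full" in the target-language Wikipedia


def is_consistently_read(views, min_mean=1000, max_cv=0.5):
    """True if average readership is high and steady (assumed thresholds)."""
    if not views:
        return False
    avg = mean(views)
    if avg < min_mean:
        return False
    # Coefficient of variation: a short-lived spike produces a large value.
    return pstdev(views) / avg <= max_cv


def select_for_translation(articles, limit=100):
    """Keep steadily popular articles that are missing or stubs, most read first."""
    candidates = [
        a for a in articles
        if a.target_status in ("missing", "stub")
        and is_consistently_read(a.weekly_views)
    ]
    candidates.sort(key=lambda a: mean(a.weekly_views), reverse=True)
    return [a.title for a in candidates[:limit]]


# Made-up sample data, for illustration only.
sample = [
    Article("Steady topic A", [5000, 5200, 4800, 5100], "missing"),
    Article("One-week spike", [100, 100, 20000, 100], "stub"),
    Article("Already covered", [9000, 9100, 8900, 9050], "full"),
    Article("Niche topic", [10, 12, 11, 9], "missing"),
    Article("Steady topic B", [2000, 2100, 1900, 2000], "stub"),
]

selected = select_for_translation(sample)
print(selected)  # ['Steady topic A', 'Steady topic B']
```

The spike article is rejected even though its average readership is high, which mirrors the distinction between consistently read and temporarily popular articles. Articles that already have full coverage in the target language are skipped regardless of popularity.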
We’re off to a good start but, as you can see in the graph below, we have a lot more work to do to bring the information in Wikipedia to people worldwide:
[Graph: Number of non-stub Wikipedia articles by Internet users, normalized (English = 1)]
We’ve also found that many Internet users have used our tools to translate more than 100 million words of Wikipedia content into various languages worldwide. If you speak another language, we hope you’ll join us in bringing Wikipedia content to other languages and cultures with Translator Toolkit.
We presented these results last Saturday, July 10, at Wikimania 2010 in Gdańsk, Poland. We look forward to continuing to support the creation of the world’s largest encyclopedia and we can’t wait to work with Wikipedians and volunteers to create more content worldwide.
Posted by Michael Galvez, Product Manager