Google Translate Blog
The official source for news on Google's translation technologies
Endangered Languages to Endure on YouTube
Thursday, October 28, 2010
(Cross-posted from the
Many of the world's smallest and most endangered languages have no written form and have never been recorded or scientifically documented. Today, the National Geographic Enduring Voices YouTube channel will launch and allow many of these tongues to have a presence on the Internet for the very first time. Linguists Dr. K. David Harrison and Dr. Gregory Anderson from the Living Tongues Institute have teamed up with Google.org to allow small and endangered languages that may have never been heard outside of a remote village to reach a global audience. Using YouTube as a platform, researchers, academics and communities can now collaborate more effectively on promoting language revitalization.
The YouTube channel features videos such as hip-hop performed by Songe Nimasow in the Aka language of India, songs by Aydyng Byrtan-ool, a talented young Tuvan singer and epic storyteller in Southern Siberia, and videos demonstrating how the Foe language of Papua New Guinea uses body parts to count from 1 to 37.
The launch of the channel comes on the heels of
by Harrison and Anderson of a “hidden” language of India, known locally as Koro, that is new to science and had never been documented outside of its rural community. Koro is among the half of the world’s languages likely to vanish in the next 100 years.
In addition to using YouTube to help revitalize endangered and minority languages, communities can also take advantage of Google Translator Toolkit and the addition of 284 new languages last year to make translation faster and easier.
In the midst of a language extinction crisis, we are also seeing a global grassroots movement for language revitalization. Speakers are leveraging new technologies, such as social networking and YouTube, to sustain small languages. As Harrison describes in his book "The Last Speakers," we are all impoverished when a language dies, and all enriched by the human knowledge base found in the world's smallest tongues.
Learn more about Harrison and Anderson's efforts to document languages through the Enduring Voices Project.
Posted by Kirsten Olsen Cahill, Product Marketing Manager, Google.org
Poetic Machine Translation
Tuesday, October 5, 2010
Once upon a midnight dreary, long we pondered weak and weary,
Over many a quaint and curious volume of translation lore.
When our system does translation, lifeless prose is its creation;
Making verse with inspiration no machine has done before.
So we want to boldly go where no machine has gone before.
Quoth now Google, "Nevermore!"
Robert Frost once said, “Poetry is what gets lost in translation”. Translating poetry is a very hard task even for humans, and is clearly beyond the capability of current machine translation systems. We therefore, out of academic curiosity, set about testing the limits of translating poetry and were pleasantly surprised with the results!
We will present a paper on poetry translation this year. In this paper, we investigate the purely technical challenges around generating translations with fixed meter and rhyme schemes.
The value of preserving meter and rhyme in poetic translation has been highly debated.
famously claimed that, since it is impossible to preserve both the meaning and the form of the poem in translation, one must abandon the form altogether. Another authority (and for us, computer scientists, perhaps the more familiar one),
that preserving the form is very important to maintaining the feeling and the sound of a poem. It is in this spirit that we decided to experiment with translating not only poetic meaning, but form as well.
A Statistical Machine Translation system, like
, typically performs translations by searching through a multitude of possible translations, guided by a statistical model of accuracy. However, to translate poetry, we not only considered translation accuracy, but meter and rhyming schemes as well. In our paper we describe in more detail how we altered our translation model, but in general we chose to sacrifice a little of the translation’s accuracy to get the poetic form right.
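To make the trade-off concrete, here is a minimal toy sketch in Python. It is not the system described in the paper: the candidate couplets, their accuracy scores, the crude syllable counter, the letter-based rhyme test and the `form_weight` parameter are all assumptions made up for illustration. It only shows the general idea of ranking candidate translations by model score minus a penalty for missing the requested meter and rhyme.

```python
import re

def count_syllables(line):
    """Crude syllable estimate: one syllable per run of vowels in each word."""
    words = re.findall(r"[a-z']+", line.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)

def rhymes(first, second, suffix_len=2):
    """Toy rhyme test: the final words differ but share their last letters."""
    a = re.findall(r"[a-z']+", first.lower())[-1]
    b = re.findall(r"[a-z']+", second.lower())[-1]
    return a != b and a[-suffix_len:] == b[-suffix_len:]

def form_penalty(couplet, target_syllables):
    """Penalize lines of the wrong length and couplets that do not rhyme."""
    first, second = couplet
    penalty = abs(count_syllables(first) - target_syllables)
    penalty += abs(count_syllables(second) - target_syllables)
    if not rhymes(first, second):
        penalty += 5  # arbitrary illustrative cost for a missing rhyme
    return penalty

def pick_translation(candidates, target_syllables, form_weight):
    """Return the candidate with the best accuracy score after the form penalty."""
    return max(
        candidates,
        key=lambda c: c[1] - form_weight * form_penalty(c[0], target_syllables),
    )

# Hypothetical candidates: (couplet, accuracy score; higher is better).
PLAIN = ("The moon is bright tonight", "and the stars shine too")
VERSE = ("The moon is bright at night", "and stars all cast their light")
CANDIDATES = [(PLAIN, -1.0), (VERSE, -1.5)]

accuracy_only = pick_translation(CANDIDATES, target_syllables=6, form_weight=0.0)
with_form = pick_translation(CANDIDATES, target_syllables=6, form_weight=0.2)
print(accuracy_only[0])
print(with_form[0])
```

With `form_weight` set to zero the more accurate but unrhymed candidate wins; with a small positive weight the slightly less accurate rhyming couplet is preferred, which mirrors the choice to give up a little accuracy for form.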
As a pleasant side-effect, the system is also able to translate anything into poetry, allowing us to specify the genre (say,
), or letting the system pick the one it thinks fits best. At the moment, the system is too slow to be made publicly accessible, but we thought we’d share some excerpts:
A stanza from Essai monographique sur les Dianthus des Pyrénées françaises by Edouard Timbal-Lagrave and Eugène Bucquoy, translated to English as a pair of
So here's the dear child under land,
will not reflect her beauty and
besides the Great, no alter dark,
the pure ray, fronts elected mark.
translated as a couplet in
These words compassion forced the small to lift her head
gently and tell him to whisper: “I'm not dead.”
Le Miroir des simples âmes
, an Old French poem by
, translated to Modern French by M. de Corberon, and then to
“Well, gentle soul”, said
Love, “say whatever you please,
for I want to hear.”
More examples and technical details can be found in our research
(as well as clever
Posted by Dmitriy Genzel, Software Engineer
Veni, Vidi, Verba Verti
Friday, October 1, 2010
[We’ve added Latin as an alpha language to
! Alpha languages aren’t perfect, but we think the addition will help unlock many classic Latin texts and documents. Learn more from our programmer Jakob in the post below. Don’t speak Latin? Good thing there is now an easy way to translate the language.]
Ut munimenta linguarum convellamus et scientiam mundi patentem utilemque faciamus, instrumenta convertendi multarum nationum linguas creavimus. Hodie nuntiamus primum instrumentum convertendi linguam qua nulli nativi nunc utuntur: Latinam. Cum pauci cotidie Latine loquantur, quotannis amplius centum milia discipuli Americani Domesticam Latinam Probationem suscipiunt. Praeterea plures ex omnibus mundi populis Latinae student.
Hoc instrumentum convertendi Latinam rare usurum ut convertat
intellegamus. Multi autem vetusti libri
lingua Latina scripti sunt. Libri enim vero multi milia in
sunt qui praeclaros locos Latinos habent.
Convertere instrumentis computatoriis ex Latina difficile est et intellegamus grammatica nostra non sine culpa esse. Autem Latina singularis est quia plurimi libri lingua Latina iampridem scripti erant et pauci novi posthac erunt. Multi in alias linguas conversi sunt et his conversis utamur ut nostra instrumenta convertendi edoceamus. Cum hoc instrumentum facile convertat libros similes his ex quibus edidicit, nostra virtus convertendi libros celebratos (ut Commentarios de Bello Gallico Caesaris) iam bona est.
Proximo tempore locum Latinum invenies vel auxilio tibi opus eris cum litteris Latinis, conare
Jakob Uszkoreit, Ingeniarius Programmandi et Ben Bayer, Magister Spatii et Temporis