Once upon a midnight dreary, long we pondered weak and weary,
Over many a quaint and curious volume of translation lore.
When our system does translation, lifeless prose is its creation;
Making verse with inspiration no machine has done before.
So we want to boldly go where no machine has gone before.
Quoth now Google, "Nevermore!"
Robert Frost once said, “Poetry is what gets lost in translation”. Translating poetry is a very hard task even for humans, and is clearly beyond the capability of current machine translation systems. We therefore, out of academic curiosity, set about testing the limits of translating poetry and were pleasantly surprised with the results!
We will present a paper on poetry translation at the EMNLP conference this year. In this paper, we investigate the purely technical challenges around generating translations with fixed rhyme and meter schemes.
The value of preserving meter and rhyme in poetic translation has been highly debated. Vladimir Nabokov famously claimed that, since it is impossible to preserve both the meaning and the form of the poem in translation, one must abandon the form altogether. Another authority (and for us, computer scientists, perhaps the more familiar one), Douglas Hofstadter argues that preserving the form is very important to maintaining the feeling and the sound of a poem. It is in this spirit that we decided to experiment with translating not only poetic meaning, but form as well.
A Statistical Machine Translation system, like Google Translate, typically performs translations by searching through a multitude of possible translations, guided by a statistical model of accuracy. However, to translate poetry, we not only considered translation accuracy, but meter and rhyming schemes as well. In our paper we describe in more detail how we altered our translation model, but in general we chose to sacrifice a little of the translation’s accuracy to get the poetic form right.
As a pleasant side-effect, the system is also able to translate anything into poetry, allowing us to specify the genre (say, limericks or haikus), or letting the system pick the one it thinks fits best. At the moment, the system is too slow to be made publicly accessible, but we thought we’d share some excerpts:
So here's the dear child under land,
will not reflect her beauty and
besides the Great, no alter dark,
the pure ray, fronts elected mark.
These words compassion forced the small to lift her head
gently and tell him to whisper: “I'm not dead."
“Well, gentle soul”, said
Love, “say whatever you please,
for I want to hear.”
More examples and technical details can be found in our research paper (as well as clever commentary).
Posted by Dmitriy Genzel, Software Engineer