Can Artificial Intelligence Decipher Lost Languages? Researchers Attempt to Decode 3500-Year-Old Ancient Languages

Image by Olaf Tausch via Wikimedia Commons

We may not see warp drives any time soon, but another piece of Star Trek tech, the universal translator, may become a reality in our lifetime, if it hasn’t already. Machine learning “has proven to be very competent” when it comes to translation, “so much so that the CEO of one of the world’s largest employers of human translators has warned that many of them should be facing up the stark reality of losing their job to a machine,” writes Bernard Marr at Forbes.

But the fact that AI can do things humans can doesn’t mean that it does those things well. One Google researcher put the case plainly in an interview with Wired: “People naively believe that if you take deep learning and… 1,000 times more data, a neural net will be able to do anything a human being can do, but that’s just not true.” AI translators have advanced significantly in the past few years, with Google’s Translatotron prototype (yes, that’s its real name) promising to interpret “tone and cadence.” Still, AI translations are often stilted, awkward, and occasionally incomprehensible approximations that no human would come up with.

Do AI’s limitations with living languages hinder its ability to decipher long-dead ones, whose orthography, grammar, and syntax have been completely lost? Yuan Cao from Google’s AI lab and Jiaming Luo and Regina Barzilay from MIT put machine learning to the test when they developed a “system capable of deciphering lost languages.” They took a very different approach “from the standard machine translation techniques,” reports the MIT Technology Review, working with less data instead of more and framing the training of their model as a “minimum-cost flow” problem.
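For readers curious what a minimum-cost flow problem looks like, here is a rough Python sketch. It is not the researchers' system: their paper pairs the flow formulation with a neural model, while this toy assumes plain edit distance as the cost of pairing two words. The word lists are invented strings, not real Linear B or Ugaritic vocabulary, and the names `MinCostFlow` and `match_cognates` are hypothetical. The idea is simply that each "lost" word sends one unit of flow to one "known" word, and the cheapest overall routing picks the most plausible set of pairings.

```python
from collections import deque

def edit_distance(a, b):
    """Levenshtein distance; stands in for a learned cost (an assumption)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

class MinCostFlow:
    """Successive shortest paths on a residual graph (toy-sized inputs only)."""
    def __init__(self, n):
        self.n = n
        # Each edge is [target, remaining capacity, cost, index of reverse edge].
        self.graph = [[] for _ in range(n)]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        total_flow = total_cost = 0
        while True:
            # Queue-based Bellman-Ford: cheapest path with spare capacity.
            dist = [float("inf")] * self.n
            dist[s] = 0
            parent = [None] * self.n
            queued = [False] * self.n
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        parent[v] = (u, i)
                        if not queued[v]:
                            queue.append(v)
                            queued[v] = True
            if parent[t] is None:
                return total_flow, total_cost
            # Every edge here has capacity 1, so push one unit along the path.
            v = t
            while v != s:
                u, i = parent[v]
                edge = self.graph[u][i]
                edge[1] -= 1
                self.graph[v][edge[3]][1] += 1
                v = u
            total_flow += 1
            total_cost += dist[t]

def match_cognates(lost, known):
    """Pair each lost word with a distinct known word at minimum total cost."""
    source, sink, offset = 0, 1, 2 + len(lost)
    flow = MinCostFlow(offset + len(known))
    for i in range(len(lost)):
        flow.add_edge(source, 2 + i, 1, 0)
    for j in range(len(known)):
        flow.add_edge(offset + j, sink, 1, 0)
    for i, lw in enumerate(lost):
        for j, kw in enumerate(known):
            flow.add_edge(2 + i, offset + j, 1, edit_distance(lw, kw))
    _, cost = flow.solve(source, sink)
    pairs = {}
    for i, lw in enumerate(lost):
        for v, cap, _, _ in flow.graph[2 + i]:
            if v >= offset and cap == 0:  # saturated edge means "matched"
                pairs[lw] = known[v - offset]
    return pairs, cost

# Invented toy vocabulary, for illustration only.
lost_words = ["kato", "miru", "sela"]
known_words = ["cato", "mirus", "zela", "domus"]
print(match_cognates(lost_words, known_words))
```

In this toy, each made-up "lost" word ends up paired with its nearest look-alike, and the leftover known word stays unmatched. A real decipherment system would need a far richer cost than spelling similarity.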

The researchers tested their translation machine on the 3500-year-old Linear B and on Ugaritic, an ancient Semitic language closely related to Hebrew, both of which have already been deciphered by people. Still, the AI was “able to translate both languages with remarkable accuracy,” with a success rate of 67.3% on cognates in Linear B. The older Bronze Age Minoan script Linear A, however (see it at the top), “one of the earliest forms of writing ever discovered… is conspicuous for its absence.” No human has yet been able to decipher it.

A lost language translator machine that only works on languages that have already been translated (it needs preexisting data on a known, related language to function) may not seem particularly useful. Then again, it could be one step in the direction of what the authors call the “automatic decipherment of lost languages,” including those that humans haven’t been able to work out on their own. Read the paper “Neural Decipherment via Minimum-Cost Flow: From Ugaritic to Linear B” at arXiv.

via MIT Technology Review

Related Content:

Artificial Intelligence May Have Cracked the Code of the Voynich Manuscript: Has Modern Technology Finally Solved a Medieval Mystery?

Artificial Intelligence for Everyone: An Introductory Course from Andrew Ng, the Co-Founder of Coursera

Artificial Intelligence Identifies the Six Main Arcs in Storytelling: Welcome to the Brave New World of Literary Criticism

Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness

