Summary
- Google Translate has added 110 new languages, covering an estimated 614 million speakers, making it the app’s largest language expansion yet.
- This expansion builds on Google’s 1,000 Languages Initiative, which was announced in late 2022.
- Some newly supported languages include Afar, Cantonese, Manx, NKo, Punjabi (Shahmukhi), Tamazight (Amazigh), and Tok Pisin.
Thanks to translation tools like Google Translate, language barriers are quickly becoming a thing of the past. Google announced the 1,000 Languages Initiative in late 2022, aiming to use artificial intelligence to support the thousand most spoken languages in the world. Google Translate is now taking another big step toward that goal, with the team announcing the addition of 110 new languages spoken across multiple continents.
Google calls this the service’s “largest expansion ever,” with the PaLM 2 large language model (LLM) playing a big part in bringing these languages to the desktop and mobile apps. The expansion covers roughly 614 million speakers worldwide, and about a quarter of the newly included languages come from Africa.
Some of the languages supported via this expansion include Afar, Cantonese, Manx, NKo, Punjabi (Shahmukhi), Tamazight (Amazigh), and Tok Pisin. While Google’s announcement didn’t include a full list of the newly added languages, a company support page carries the updated list.
Google says that some of these languages have barely any native speakers, while others have up to 100 million. Using Manx as an example, Google explains that the language was on the verge of extinction some decades ago but now has thousands of speakers thanks to revival efforts. Meanwhile, English speakers are encouraged to try Tok Pisin on Google Translate. It is an English-based creole used in Papua New Guinea and seemingly has a few familiar words.
Identifying new languages is a complex process
Google acknowledges the challenges of identifying and supporting new languages, which can vary by dialect, spelling, and other factors. Given these complexities, Google says it picks the most commonly used variety of each language while also adopting elements from related languages or dialects.
Things get somewhat trickier when two languages are fairly similar. Google points to Seychellois Creole and Mauritian Creole, both French-based creoles, as well as Awadhi and Marwadi, which are related to Hindi. All of these languages are now available as part of this expansion.
The Translate team credits the PaLM 2 LLM with helping roll out some of these languages, and promises to work with native speakers and linguistics experts to support even more languages and their variations in the future. I can’t find the newly included languages on Google Translate for the web or in the Android app yet, but that shouldn’t take long to change.