Google said the update relies on its PaLM 2 AI language model. According to Google researcher Isaac Caswell, PaLM 2 performs particularly well at learning languages that are closely related to one another, such as Awadhi and Marwadi, which are related to Hindi, and French-based creole languages such as Seychellois Creole and Mauritian Creole (Morisien).

Google also understands Cantonese

In this wave of new languages, Google Translate also adds support for Cantonese. Cantonese "has long been one of the most requested languages for Google Translate," Caswell said. However, written Cantonese often overlaps with Mandarin, which makes it challenging to find the right data and train the model.

Image: Google Translate also supports Cantonese. (Source: Google)

In addition, Caswell pointed out that about a quarter of the new languages are from Africa, showing Google’s emphasis on promoting the digitization of African languages.

Caswell said in an interview that most of the new languages have at least one million speakers, and "some languages have hundreds of millions of users." Adding these languages broadens the reach of Google Translate and makes it more useful in multilingual settings.

What is PaLM 2? How strong is it?

Google said that the technology behind this language expansion comes mainly from the learning capabilities of the PaLM 2 AI language model. The model not only learns and understands new languages effectively, but also establishes connections between related languages, which improves the accuracy and naturalness of translations.

PaLM 2 is the second-generation large language model (LLM) that Google released in 2023. At the time, Google said that, compared with the first-generation PaLM, the second generation was greatly improved in areas such as mathematics, logical reasoning, and coding.

PaLM 2 comes in four sizes, from largest to smallest: Unicorn, Bison, Otter, and Gecko. They can run on different types of devices, and the lightweight Gecko can even run offline directly on mobile devices.

Google said that PaLM 2 was trained on text in more than 100 languages and is mainly good at natural language understanding and generation, translation, coding, question answering, summarization, creative writing, mathematical logic, and common-sense reasoning. Its semantic understanding stands out in particular: PaLM 2 can understand non-literal language such as riddles and idioms.

  • This article is reprinted with permission from: "Digital Age"