Latest news in the field of machine translation
Let’s discuss the latest news in the field of machine translation. I have highlighted the news that I consider the most important.
SLAIT pivots from translating sign language to AI-powered interactive lessons
Millions of people use sign language, but the methods of teaching this complex and subtle skill have not evolved as quickly as those for written and spoken languages. SLAIT School aims to change that with an interactive tutor powered by computer vision, letting aspiring ASL speakers practice at their own pace, as in any other language-learning app. The system includes a real-time video chat and translation tool that can recognize most common signs and help an ASL speaker communicate more easily with someone who does not know the language. For more information, visit https://slait.school.
However, early successes slowed as the team realized they needed more time, money and data than they were likely to get. Evgeny Fomin, CEO and co-founder of SLAIT, explained: “We got great results in the beginning, but after several attempts, we realized that, right now, there just is not enough data to provide full language translation. We had no opportunity for investment, no chance to find our supporters, because we were stuck without a product launch; we were in limbo. Capitalism… is hard. We actively communicate with users and try to find the best prices and economic model that makes subscription plans affordable. We would really like to make the platform free, but so far, we have not found an opportunity for this yet. Because this is a very niche product… we just need to make a stable economic model that will work in general,” he said. User traction also acts as a flywheel for the company’s tech and content. By collecting information (with express opt-in consent, which he says the community is quite happy to provide), the team can expand and improve the curriculum and continue to refine its gesture recognition engine.
Google announces PaLM 2, its answer to GPT-4
PaLM 2 can code, translate, and reason in ways better than GPT-4, says Google. More info at https://ai.google/discover/palm2
PaLM 2 is a family of foundation language models comparable to OpenAI’s GPT-4. At its Google I/O event in Mountain View, California, Google revealed that it already uses PaLM 2 to power 25 products, including its Bard conversational AI assistant. PaLM 2 has been trained on an enormous volume of data and works by predicting the next word, generating the most likely text in response to human prompts. With support for over 100 languages, PaLM 2 can perform “reasoning”, code generation, and multilingual translation. During his 2023 Google I/O keynote, Google CEO Sundar Pichai said that PaLM 2 comes in four sizes: Gecko, Otter, Bison, and Unicorn. Gecko is the smallest and can operate on a mobile device. Aside from Bard, PaLM 2 is behind AI features in Docs, Sheets, and Slides.
While PaLM 2 appears impressive, it is worth exploring how it compares to GPT-4. According to the PaLM 2 technical report, PaLM 2 outperforms GPT-4 in some areas like math, translation, and reasoning. However, it’s important to acknowledge that the datasets used to train large language models, including PaLM 2, may contain copyrighted material used without permission, as well as potentially harmful content scraped from the Internet. Training data significantly influences the output of any AI model. Consequently, some experts have been advocating the use of open datasets, which allow for scientific reproducibility and ethical scrutiny.
Google Teases ‘Universal Translator’ for Dubbing at Developer Conference
The keynote of Google’s May 10, 2023, live-streamed annual conference, Google I/O ’23, broadly covered the expected topics, including Bard, Search, Cloud, Android, and Hardware. For more information, visit https://slator.com/google-teases-universal-translator-dubbing-developer-conference/.
However, it was during the presentation on Responsible AI that James Manyika, Senior VP of Technology and Society, introduced the latest silver bullet for dubbing: the Universal Translator. Manyika described Universal Translator as “an experimental AI video dubbing service that helps experts translate a speaker’s voice while also matching their lip movements”, though he stopped short of naming which experts might participate, and in what capacity. Manyika demonstrated the tool by playing a clip of an online college course in the original English, followed by the same clip dubbed with Spanish audio, with the speaker’s lips moving in concert with the translated words. Breathless headlines about the Universal Translator aside, Google is not alone here: specialized companies such as AppTek and Google’s competitors, including Amazon, have also been working on this challenge, known variously as automatic dubbing, machine dubbing, or AI dubbing. For now, the Universal Translator’s most novel contribution seems to be the advance in “lip matching”.