Samsung Research in Vietnam is part of a series about the people and innovations enabling mobile AI to enhance more lives
Samsung is pioneering premium mobile AI experiences. To learn how Galaxy AI is maximizing the potential of its users, we’re visiting Samsung Research centers around the world. Now supporting 16 languages, Galaxy AI is enabling more people to expand their language capabilities, even when offline, thanks to on-device translation in features such as Live Translate, Interpreter, Note Assist and Browsing Assist. We recently visited Jordan to learn the complexities of developing an AI model for Arabic, a language with many dialects. This time, we’re going to Vietnam to explore how data is prepared to train AI models.
What’s the difference between a ghost, grave and mother in Vietnamese? For a language spoken by 97 million people worldwide, very little. The words translate to “ma,” “mả” and “má,” respectively, and can only be distinguished by tone. This illustrates how difficult it can be for AI models to learn a language, considering they cannot recognize firsthand the context and emotions of conversations nor the intentions of those speaking.
Samsung R&D Institute Vietnam (SRV) used finely refined data to help its AI model properly recognize even the most subtle differences in language.
The quality of the data used directly affects the accuracy of automatic speech recognition (ASR), neural machine translation (NMT) and text-to-speech (TTS), the processes that help Galaxy AI features such as Live Translate, Interpreter, Chat Assist and Browsing Assist break down language barriers.
A Storm of Challenges
“Vietnamese is a complex and diverse language with rich expressions, many of which are challenging to capture,” says Ngô Hồng Thái, NMT lead at SRV. Of the 16 languages that Galaxy AI supports, Vietnamese was particularly difficult to develop.
“Personally, developing an AI model for Vietnamese was more daunting than our typhoons!” he adds before explaining the hurdles faced during the development process.
Vietnamese is a tonal language with six distinct tones. As evident in the “ma” example above, small nuances in vocalization can drastically alter the meanings of words. Therefore, a meticulous and detailed approach was necessary.
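To make the “ma” example concrete, here is a minimal Python sketch showing that, in writing, the three words differ only by a tone mark. It uses Unicode decomposition from the standard library. The function name `split_tone` and the unaccented tone labels are illustrative assumptions, and this is not a description of how SRV's models handle tone.

```python
import unicodedata
from typing import Tuple

# Combining marks that encode five of the six tones in written Vietnamese.
# The sixth (level) tone is written with no mark. Labels are unaccented
# for simplicity and are an illustrative choice.
TONE_MARKS = {
    "\u0300": "huyen",  # combining grave accent
    "\u0301": "sac",    # combining acute accent
    "\u0309": "hoi",    # combining hook above
    "\u0303": "nga",    # combining tilde
    "\u0323": "nang",   # combining dot below
}


def split_tone(syllable: str) -> Tuple[str, str]:
    """Return (syllable without its tone mark, tone label)."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = "ngang"  # level tone: no mark present
    kept = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps vowel-quality marks such as the circumflex
    return unicodedata.normalize("NFC", "".join(kept)), tone


for word in ["ma", "mả", "má"]:
    base, tone = split_tone(word)
    print(word, "->", base, tone)
```

All three words reduce to the same base syllable, "ma", which is why any text cleaning step that strips diacritics would erase exactly the distinction the model needs to learn.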
“When similar sounding words are broken down, one word consists of several short segments, or ‘frame sets’,” says Bui Ngoc Tung, ASR lead at SRV. “The AI model differentiates between the short audio frames of around 20 milliseconds to recognize what words correspond to a certain set of consecutive frames. As such, it’s important to put great effort into the early stages of the AI learning process.”
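As a rough illustration of what cutting audio into frames of around 20 milliseconds looks like, the following Python sketch splits a list of samples into fixed-length, non-overlapping frames. The function name, the 16 kHz sample rate and the choice to drop a trailing partial frame are assumptions made for the example. Production speech recognizers often use overlapping windows, and the article does not say how SRV frames its audio.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=20):
    """Split a sequence of audio samples into consecutive fixed-length frames.

    A trailing partial frame is dropped (an assumption for this sketch).
    """
    frame_len = int(sample_rate_hz * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    n_frames = len(samples) // frame_len
    return [
        samples[i * frame_len:(i + 1) * frame_len]
        for i in range(n_frames)
    ]


# One second of silence at an assumed 16 kHz sample rate.
one_second = [0.0] * 16000
frames = split_into_frames(one_second, 16000)
print(len(frames), "frames of", len(frames[0]), "samples each")
```

At 16 kHz, a 20 ms frame holds 320 samples, so one second of speech yields 50 frames. A single syllable therefore spans many frames, and the model has to decide which word a run of consecutive frames corresponds to.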
Additionally, homophones and homonyms are common in Vietnamese. People can often rely on context and nonverbal elements in conversations to differentiate between words that sound the same or are written the same but have different meanings. However, AI models need to learn to accurately identify and differentiate between tones and similar words.
“This is not an easy task,” Thái explains. “Aside from the quantity, the data needs to be accurate to ensure it’s capable of recognizing the linguistic nuances that exist in Vietnamese.”
Rigorous Preparation
The data refinement process consists of three steps. First, the audio and text used to train the AI model must be reviewed and corrected. Then, this dataset goes through random checks for overall quality. Finally, the dataset is normalized and cleaned before use in training.
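The three steps can be sketched in Python for the text side of a dataset. Everything here is hypothetical: the record layout (`id`, `text`), the function names and the specific cleaning rules (Unicode NFC composition and whitespace collapsing) are assumptions chosen for illustration, not SRV's actual pipeline.

```python
import random
import unicodedata


def apply_corrections(records, corrections):
    """Step 1: replace transcripts that a reviewer has corrected."""
    return [
        {**rec, "text": corrections.get(rec["id"], rec["text"])}
        for rec in records
    ]


def sample_for_review(records, k, seed=0):
    """Step 2: draw a reproducible random sample for a quality spot check."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def normalize_text(text):
    """Step 3: compose Unicode to NFC and collapse stray whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def refine(records, corrections, k=2, seed=0):
    """Run the three steps; return the cleaned data and the spot-check sample."""
    corrected = apply_corrections(records, corrections)
    spot_check = sample_for_review(corrected, k, seed)
    cleaned = [{**rec, "text": normalize_text(rec["text"])} for rec in corrected]
    return cleaned, spot_check


# Hypothetical transcripts: one with stray spaces and a decomposed tone
# mark, one missing its diacritics until a reviewer corrects it.
records = [
    {"id": 1, "text": "  xin   cha\u0300o "},
    {"id": 2, "text": "cam on"},
    {"id": 3, "text": "ma"},
]
corrections = {2: "c\u1ea3m \u01a1n"}
cleaned, spot_check = refine(records, corrections)
for rec in cleaned:
    print(rec["id"], repr(rec["text"]))
```

Note that the normalization step composes tone marks and leaves them in place. In a real workflow, a failed spot check would presumably send the batch back to review. The sketch only shows where that check sits in the sequence.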
“We thoroughly conducted a series of tests to check the accuracy of our dataset,” says Nguyen Manh Duy, TTS lead at SRV who oversees database creation. “We faced a variety of unexpected problems including misspelled words in scripts and background noise or incorrect pronunciation during audio recordings. We spent significant time refining and improving our training data.”
In addition to the unique linguistic challenges in Vietnamese, there is a lack of universally available data compared to more widely spoken languages. “This is another reason why the data refinement stage is so important,” he adds. “Since we had limited resources, every piece of data had to be fully reliable. There was no margin for error.”
Moreover, the AI model for Vietnamese must consider both tonal and regional differences. To improve the AI model’s accuracy, the team collected vast amounts of data with Vietnam’s northern, central and southern accents, resulting in an enormous amount of data to refine and verify.
Continued Improvement
Developers at SRV completed the project after months of hard work, and Vietnamese became one of the first languages to be supported by Galaxy AI. Despite this success, the team is constantly working to improve the Vietnamese Galaxy AI experience.
“We’re continuing to enhance the AI model by incorporating user feedback about the relevance of words and phrases in Galaxy AI,” says Tran Tuan Minh, leader of the AI language development project at SRV. “We have just taken our first steps into a more open world, and we have so much more to explore together.”
In the next episode of The Learning Curve, we will head to China to dig into how AI models are trained and fine-tuned.