As we pointed out in a previous article on Facebook's artificial intelligence, the social media giant is having a difficult time translating user-generated content. People tend to write on social media the way they talk, and no machine translation software has been able to adequately understand and translate such informal language, prompting Facebook to look for a better solution.
On August 28th, Facebook filed a patent application for "Machine Learning Dialect Identification" with the U.S. Patent & Trademark Office. The dialect identification system will create classifiers for language dialects. These classifiers categorize how different words are used, and as the machine recognizes them, it creates a dialect-specific language module that will allow it to translate slang and colloquialisms more accurately.
Previously, we discussed how Arabic specifically presents problems because of the many dialects spoken across the Arab world and the poetic nature of the language.
Stepconference.com, a tech and interactive group in the Middle East and North Africa (MENA) region, says the current Facebook translation button for Arabic cannot translate any Arabic dialects other than Modern Standard Arabic (MSA).
The patent application notes that traditional speech recognition and machine translation systems for Arabic focus on MSA and don't account for other Arabic dialects, which differ from MSA syntactically, morphologically, lexically and phonologically. The patent author notes that, as a result, these systems cannot adequately recognize or translate content items to or from non-MSA dialects.
One way to better translate Arabic dialects is to identify the Arab country in which the comment or web entry is posted and link the post to a specific dialect. Alternatively, an online article or post can be identified as belonging to a specific dialect based on user interaction with the content. For example, if an article is rated by users who are known to use an identified Arabic dialect, the module can determine that the article is in that dialect.
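The two signals described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the method in the patent: the country-to-dialect table, the function names and the 60% agreement threshold are all assumptions made for the example, and real dialect boundaries do not follow national borders this neatly.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical, deliberately coarse mapping used only for illustration.
COUNTRY_TO_DIALECT = {
    "EG": "Egyptian",
    "LB": "Levantine",
    "KW": "Gulf",
    "MA": "Maghrebi",
}


def dialect_from_country(country_code: Optional[str]) -> Optional[str]:
    """Signal 1: link a post to a dialect via the country it was posted in."""
    if country_code is None:
        return None
    return COUNTRY_TO_DIALECT.get(country_code.upper())


def dialect_from_interactions(
    user_dialects: Iterable[Optional[str]], min_share: float = 0.6
) -> Optional[str]:
    """Signal 2: infer the dialect from users who interacted with the post.

    `user_dialects` holds the known dialect of each interacting user
    (None when unknown). `min_share` is an assumed threshold.
    """
    known = [d for d in user_dialects if d is not None]
    if not known:
        return None
    dialect, count = Counter(known).most_common(1)[0]
    return dialect if count / len(known) >= min_share else None


def guess_dialect(
    country_code: Optional[str], user_dialects: Iterable[Optional[str]]
) -> Optional[str]:
    """Try the country signal first, then fall back to user interactions."""
    return dialect_from_country(country_code) or dialect_from_interactions(
        user_dialects
    )
```

A production system would more likely treat both signals as features for a trained classifier rather than applying them as fixed rules.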
Facebook is also hoping to use crowdsourcing to augment the training data set. The system will send content items and classification results to users, who can respond to confirm whether the classification is correct or rate its accuracy.
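As a rough sketch of how such a feedback loop might work, the Python below adds a classified item to the training set only once enough reviewers have confirmed the label. The `Review` class, the function names and the vote thresholds are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Review:
    """A content item, its predicted dialect, and reviewer responses."""
    text: str
    predicted_dialect: str
    votes: List[bool] = field(default_factory=list)  # True = label confirmed


def accept_for_training(
    review: Review, min_votes: int = 3, min_agreement: float = 0.8
) -> bool:
    """Accept a label only with enough votes and enough agreement.

    Both thresholds are assumptions chosen for this example.
    """
    if len(review.votes) < min_votes:
        return False
    return sum(review.votes) / len(review.votes) >= min_agreement


def augment_training_set(
    training_set: List[Tuple[str, str]], reviews: List[Review]
) -> int:
    """Append confirmed (text, dialect) pairs; return how many were added."""
    added = 0
    for review in reviews:
        if accept_for_training(review):
            training_set.append((review.text, review.predicted_dialect))
            added += 1
    return added
```

Requiring several independent confirmations is one simple way to keep a few mistaken or careless responses from contaminating the training data.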
As companies expand globally, leveraging international social media is vital, but it will only be useful if the content is accurately localized. Let's see if Facebook's new patent gets this right.
About the Author
Natalie Williams was born and raised in Montana, where she graduated from The University of Montana with a degree in Business Administration. Her international experience includes two summer programs, one at The European Business School in Germany and the other at The University of Brescia in Italy. She studied a variety of global business subjects including international business, trade, culture and language. Key projects for her undergraduate studies included meeting with executives from large organizations such as Lufthansa, Opel and The European Central Bank, as well as working with the design team on the marketing plan for the 2015 World Fair in Milan, Italy. She has a range of global event management experience, including organization of the Annual Mansfield Conference on the Middle East and the China Town Hall meeting series. Her hobbies include beading, yoga, cooking, reading, being outdoors and traveling.