News: Facebook Files Patent for Machine Learning Dialect Identification

October 2, 2016 Natalie Williams

As we pointed out in a previous article on Facebook's artificial intelligence, the social media giant is having a difficult time translating user-generated content. People on social media tend to write the way they talk, and no machine translation software has been able to adequately understand and translate this kind of text, prompting Facebook to look for a better solution.

On August 28th, Facebook filed a patent for "Machine Learning Dialect Identification" with the U.S. Patent & Trademark Office. This language dialect identification system will create classifiers for language dialects. These classifiers categorize how different words are used, and as the system recognizes them, it creates a dialect-specific language module that will allow it to translate slang and colloquialisms more accurately.

Previously we discussed how Arabic, specifically, presented problems due to the many dialects spoken across the Arab world, and also the poetic nature of the language.

Stepconference.com, a tech and interactive group in the Middle East and North Africa (MENA) region, says the current Facebook translation button for Arabic cannot translate any Arabic dialects other than Modern Standard Arabic (MSA).

The patent application notes that traditional speech recognition and machine translation systems for Arabic focus on MSA and don't account for other Arabic dialects, which differ from MSA syntactically, morphologically, lexically and phonologically. As a result, the patent author notes, these systems cannot adequately recognize or translate content items to or from non-MSA dialects.

One way to better translate Arabic dialects is to identify the Arabic country the comment or web entry is posted in, linking the post to a specific dialect. Alternatively, an online article or post can be identified as being in a specific dialect based on user interaction with the content. For example, if an article is rated by users who are known to use an identified Arabic dialect, the module can determine that the online article is in that dialect.
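To make the two signals described above concrete, here is a minimal Python sketch. It is not taken from the patent: the function name `infer_dialect`, the country-to-dialect table, the dialect labels and the `min_votes` cutoff are all assumptions chosen for illustration. It prefers the dialect most common among the users who interacted with a post, and falls back to the country the post came from.

```python
from collections import Counter

# Hypothetical country-to-dialect lookup; the entries are illustrative only.
COUNTRY_TO_DIALECT = {
    "EG": "Egyptian Arabic",
    "MA": "Moroccan Arabic",
    "LB": "Levantine Arabic",
}


def infer_dialect(post_country, interacting_user_dialects, min_votes=3):
    """Guess a post's dialect from user interactions, falling back to country.

    interacting_user_dialects holds the known dialect of each user who rated
    or otherwise interacted with the post (None when it is unknown).
    min_votes is an assumed cutoff, not a value from the patent.
    """
    # Count only users whose dialect is actually known.
    votes = Counter(d for d in interacting_user_dialects if d is not None)
    if votes:
        dialect, count = votes.most_common(1)[0]
        if count >= min_votes:
            return dialect
    # Too little interaction evidence: use the country of posting instead.
    return COUNTRY_TO_DIALECT.get(post_country)
```

For instance, a post from Egypt that is rated mostly by users known to write in Moroccan Arabic would be labeled Moroccan Arabic, while a post with too few ratings would simply take the label associated with its country.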

Facebook is also hoping to use crowdsourcing to augment the training data set. The system will send content items and classification results to users, who can respond to confirm whether the classification is correct or rank it on accuracy.
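As a rough illustration of how such user feedback might be turned into training labels, consider the sketch below. The patent does not spell out this logic; the function name `label_from_feedback`, the `min_responses` and `threshold` values, and the three outcome labels are assumptions made for this example.

```python
def label_from_feedback(responses, min_responses=5, threshold=0.8):
    """Summarize crowd feedback on one dialect classification.

    responses is a list of booleans, where True means the user confirmed
    the classification. Returns "confirmed", "rejected" or "undecided".
    Both min_responses and threshold are assumed values.
    """
    if len(responses) < min_responses:
        return "undecided"  # not enough feedback to trust either way
    agreement = sum(responses) / len(responses)
    if agreement >= threshold:
        return "confirmed"  # could be added to the training set as-is
    if 1 - agreement >= threshold:
        return "rejected"   # the classifier's label should not be used
    return "undecided"
```

Only items that come back as "confirmed" would be added to the training data under this scheme; a real system would likely also weight each response by how reliable the user has been in the past.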

As companies expand globally, leveraging international social media is vital but will only be useful if the content is accurately localized. Let's see if Facebook's new patent gets this right.

About the Author

Natalie Williams

Global Digital Marketing Manager. Natalie was born and raised in Montana where she graduated from The University of Montana with a degree in Business Administration. Her international experience includes two summer programs, one at The European Business School in Germany and the other at The University of Brescia in Italy. She studied a variety of global business subjects including international business, trade, culture and language. Key projects for her undergrad studies included meeting with executives from large corporations such as Lufthansa, Opel, and The European Central Bank as well as working with the design team on the marketing plan for the 2015 World Fair in Milan, Italy. She has a range of global event management experience including organization of the Annual Mansfield Conference on the Middle East and the China Town Hall meeting series. Her hobbies include yoga, cooking, reading, being outdoors and traveling.
