Saturday, January 12, 2013

When machine learning goes wrong

Understanding human emotions in text and speech is one of the most challenging fields in computer science at the moment. Working out what a person means by a certain text or sentence is hard. Humans pick up subtle things like sarcasm or humor, while developing an algorithm that does the same has proven difficult. A system that understands these subtleties would allow a much more human-like interaction with technology, and would let machines make decisions based on the feelings of real people.

Another difficult part is slang: street language, the language used by certain groups of people, and the meaning it carries. This was already touched on in the video from Michael Karasick, VP and Director at IBM Research, in my earlier post on big data sentiment analysis.

Now IBM has tried to have Watson understand slang. Watson is an artificial intelligence computer system capable of answering questions posed in natural language, developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first president, Thomas J. Watson. The machine was specifically developed to answer questions on the quiz show Jeopardy!. In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings. Watson received the first prize of $1 million.

"In order to boost Watson’s aptitude with everyday lingo, its software engineers began teaching it the Urban Dictionary, that massive online depository of current slang. There was just one small problem: Watson couldn’t differentiate “clean” terms from profanity. It’s one thing when a supercomputer uses “OMG,” but quite another when it starts cursing like a Quentin Tarantino character—hopefully not in the middle of a “Jeopardy” episode, although that could prove memorable for everyone involved." From Slashdot

This shows that understanding slang and everyday lingo is difficult. Even more difficult is having an algorithm understand when it is appropriate to use slang and when it is not. This research field will be very interesting in the upcoming years and will prove to be challenging; however, once mastered, it can be a big addition to how we communicate with technology.
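To make the difference between understanding a term and knowing when to use it a bit more concrete, here is a minimal sketch in Python. It is not how Watson works. The lexicon, the hand-assigned `profane` flags, and the function names `understand` and `may_use` are all hypothetical, made up purely for illustration.

```python
# Hypothetical slang lexicon: each entry carries a meaning and a hand-assigned
# "profane" flag. Both the entries and the flags are illustrative assumptions.
SLANG_LEXICON = {
    "omg": {"meaning": "expression of surprise", "profane": False},
    "lol": {"meaning": "laughing out loud", "profane": False},
    "wtf": {"meaning": "expression of disbelief", "profane": True},
}


def understand(term):
    """Return the meaning of a slang term, or None if it is unknown."""
    entry = SLANG_LEXICON.get(term.lower())
    return entry["meaning"] if entry else None


def may_use(term, formal_context):
    """Decide whether the system may use a term in its own output."""
    entry = SLANG_LEXICON.get(term.lower())
    if entry is None:
        return False  # unknown terms are never produced
    if entry["profane"]:
        return False  # understood on input, but never produced
    return not formal_context  # clean slang only in informal settings
```

The point of the sketch is that the system can still interpret a profane term it reads, while a separate check decides whether it may say that term itself. In practice the hard part is exactly what a hand-made flag glosses over: deciding automatically which terms are offensive and which contexts are informal.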
