Saturday, January 12, 2013
When machine learning goes wrong
Another difficult part is slang: street language, the language used by particular groups of people, and the meaning it carries. This was already touched on in the video from Michael Karasick, VP and Director at IBM Research, in my post big data sentiment analysis.
Now IBM has tried to teach Watson to understand slang. Watson is an artificial intelligence computer system capable of answering questions posed in natural language, developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first president, Thomas J. Watson. The machine was specifically developed to answer questions on the quiz show Jeopardy!. In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings and received the first prize of $1 million.
"In order to boost Watson’s aptitude with everyday lingo, its software engineers began teaching it the Urban Dictionary, that massive online depository of current slang. There was just one small problem: Watson couldn’t differentiate “clean” terms from profanity. It’s one thing when a supercomputer uses “OMG,” but quite another when it starts cursing like a Quentin Tarantino character—hopefully not in the middle of a “Jeopardy” episode, although that could prove memorable for everyone involved." (From Slashdot)
This shows that understanding slang and everyday lingo is difficult. Even more difficult, however, is getting an algorithm to understand when it is appropriate to use that language and when it is not. This research field will be very interesting in the coming years and will prove challenging, but once mastered it can be a big addition to how we communicate with technology.
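To make the problem a bit more concrete, here is a minimal sketch of the naive approach: tag each slang term as profane or not, and only allow it depending on the context. Everything in it is an assumption made for illustration (the toy lexicon, the `FORMAL`/`CASUAL` contexts, the function names). It is not how Watson works, and a hand-made list like this is exactly what does not scale to something the size of the Urban Dictionary.

```python
# Hypothetical illustration only: none of these names come from IBM or Watson.
FORMAL, CASUAL = "formal", "casual"

# Toy slang lexicon: term -> (meaning, is_profane). Entries are made up.
SLANG = {
    "omg": ("expression of surprise", False),
    "lol": ("expression of amusement", False),
    "damn": ("expression of annoyance", True),
}


def allowed_terms(lexicon, context):
    """Return the slang terms considered usable in the given context."""
    if context == FORMAL:
        return set()  # assume no slang at all in a formal setting
    # Casual setting: slang is fine, profanity is not.
    return {term for term, (_, profane) in lexicon.items() if not profane}


def is_appropriate(sentence, lexicon, context):
    """Check that every slang term in the sentence is allowed in this context."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    slang_used = words & set(lexicon)
    return slang_used <= allowed_terms(lexicon, context)


print(is_appropriate("OMG that was close!", SLANG, CASUAL))
print(is_appropriate("OMG that was close!", SLANG, FORMAL))
```

The hard part is everything this sketch takes for granted: someone has to decide which terms are profane, and the context has to be known in advance. Those are the things a learning system would have to work out for itself.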