Ensemble Learning: Live on TV’s Jeopardy and Behind the Scenes on Leading Retail Websites
Starting today, IBM’s Watson supercomputer will go up against a pair of human Jeopardy champions. Regardless of whether man or machine comes out on top, it will be a banner day for a machine learning technique called ensemble learning. Ensemble learning is based on the notion that tens or hundreds of independent algorithms, each designed for and working in a particular kind of context, are better than one bigger, more complex algorithm. This idea lies at the heart of both Watson’s approach to generating Jeopardy questions and RichRelevance’s approach to generating relevant product recommendations and advertising.
When Watson is given a Jeopardy answer, over 100 different question generators jump to life, trying to come up with questions that match the answer. Each does its work completely independently, using its own particular approach from the annals of Artificial Intelligence. Some techniques are particularly good at differentiating between different meanings of the same word. Their focus is on deciding whether “plant” means the thing in a flower pot or a place where cars are manufactured. Others are not quite as good at that problem, but are excellent at working out which concepts various pronouns refer to. Some Jeopardy answers depend critically on solving the first of these problems, while others depend on the second. Many other Jeopardy challenges call for similarly specialized linguistic skills beyond these two. Rather than try to solve all these problems at once, ensemble learning runs each technique independently and then lets those that are confident in the question they have come up with for a given answer propose it. Other algorithms might not be confident enough to propose a question at all. The ensemble machinery then chooses from among the proposed questions based on criteria like how well each algorithm generally does and how confident each is in its proposal. All of this happens in a matter of seconds, as it must in order to compete with human players.
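To make the selection step described above concrete, here is a minimal sketch in Python. It is not Watson's actual implementation. The names (`Proposal`, `select_response`, `track_record`, `min_confidence`) and the scoring rule, which multiplies a generator's stated confidence by its historical accuracy, are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Proposal:
    generator: str     # name of the generator that made the proposal
    response: str      # the question it proposes
    confidence: float  # the generator's own confidence, from 0 to 1


# A generator takes a clue and returns (response, confidence),
# or None when it chooses not to propose anything.
Generator = Callable[[str], Optional[Tuple[str, float]]]


def select_response(
    clue: str,
    generators: Dict[str, Generator],
    track_record: Dict[str, float],
    min_confidence: float = 0.5,
) -> Optional[Proposal]:
    """Run every generator independently and pick the best proposal."""
    proposals = []
    for name, generate in generators.items():
        result = generate(clue)
        if result is None:
            continue  # this generator abstained
        response, confidence = result
        if confidence < min_confidence:
            continue  # not confident enough to take part
        proposals.append(Proposal(name, response, confidence))

    if not proposals:
        return None  # nobody was confident enough

    # Hypothetical scoring rule: stated confidence weighted by how
    # often this generator has been right in the past.
    return max(
        proposals,
        key=lambda p: p.confidence * track_record.get(p.generator, 0.0),
    )


# Two toy generators standing in for the specialized techniques.
def word_sense(clue: str) -> Optional[Tuple[str, float]]:
    if "plant" in clue:
        return ("What is a factory?", 0.9)
    return None


def pronoun_resolver(clue: str) -> Optional[Tuple[str, float]]:
    return ("What is a flower?", 0.6)


generators = {"word_sense": word_sense, "pronoun_resolver": pronoun_resolver}
track_record = {"word_sense": 0.8, "pronoun_resolver": 0.7}

best = select_response("This kind of plant builds cars", generators, track_record)
```

In this toy run, both generators propose a question and the word-sense generator wins because its weighted score is higher. A real system would combine many more signals than these two numbers.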
At RichRelevance, our RichRecs product recommendation algorithms participate in a similar ensemble. Some algorithms are focused on finding accessories for the product a shopper is browsing. Others focus on finding alternatives the shopper might want to consider. Still others dig back into the shopper’s past purchase history to look for clues as to what interests that shopper. The ensemble then carefully selects among them based on the shopper’s current context and how the various algorithms have performed in that context in the past. This is all done in less than a tenth of a second, so that recommendations appear on a shopper’s screen right as the rest of the page does.
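The idea of choosing among algorithms by context and past performance can be sketched in a few lines of Python. This is an illustrative assumption, not the RichRecs implementation: the class name `ContextualSelector`, the context labels, the notion of "success" as a click or purchase, and the smoothed success rate are all hypothetical.

```python
from collections import defaultdict


class ContextualSelector:
    """Pick the strategy with the best observed record in a given context."""

    def __init__(self, strategies):
        self.strategies = list(strategies)
        self.shown = defaultdict(int)      # (context, strategy) -> times shown
        self.successes = defaultdict(int)  # (context, strategy) -> times it worked

    def record(self, context, strategy, success):
        """Log one outcome, e.g. whether the shopper clicked or bought."""
        self.shown[(context, strategy)] += 1
        if success:
            self.successes[(context, strategy)] += 1

    def success_rate(self, context, strategy):
        # Add-one smoothing so an untried strategy starts at 0.5
        # instead of causing a division by zero.
        key = (context, strategy)
        return (self.successes[key] + 1) / (self.shown[key] + 2)

    def choose(self, context):
        return max(self.strategies, key=lambda s: self.success_rate(context, s))


selector = ContextualSelector(["accessories", "alternatives", "purchase_history"])

# Hypothetical history: accessories work well on product pages,
# alternatives do not, and purchase history helped once on the home page.
for _ in range(8):
    selector.record("product_page", "accessories", True)
for _ in range(8):
    selector.record("product_page", "alternatives", False)
selector.record("home_page", "purchase_history", True)

product_page_choice = selector.choose("product_page")
home_page_choice = selector.choose("home_page")
```

With this made-up history, the selector favors accessories on product pages and purchase history on the home page. A production system would also need to keep trying the weaker strategies occasionally, since a purely greedy choice like this one can lock in an early lucky result.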
While Watson and RichRecs are solving very different kinds of problems, they both benefit greatly from their use of ensemble learning. Without ensemble learning, Watson was only able to answer about 15% of Jeopardy questions. With it, that number is over 90%. Similarly, an ensemble lets RichRecs drive significantly more incremental sales than any single underlying algorithm can on its own.