I've always been curious as to how these systems work. For example, how do Netflix or Amazon determine what recommendations to make based on past purchases and/or ratings? Are there any algorithms to read up on?
Just so there are no misperceptions here, there's no practical reason for my asking. I'm just asking out of sheer curiosity.
(Also, if there's an existing question on this topic, point me to it. "Recommendations system" is a difficult term to search for.)
At its most basic, most recommendation systems work by saying one of two things.
User-based recommendations:
If User A likes Items 1, 2, 3, 4, and 5,
And User B likes Items 1, 2, 3, and 4
Then User B is quite likely to also like Item 5
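To make the user-based rule concrete, here is a minimal sketch. It assumes each user's likes are stored as a plain set of item IDs, and it simply borrows from the single user with the largest overlap. The function name `recommend_user_based` and the sample data are made up for illustration; real systems weight many neighbours rather than picking one.

```python
def recommend_user_based(likes, target):
    """Suggest items liked by the user whose likes overlap most with target's."""
    target_items = likes[target]
    best_user, best_overlap = None, 0
    for user, items in likes.items():
        if user == target:
            continue
        overlap = len(target_items & items)  # number of shared liked items
        if overlap > best_overlap:
            best_user, best_overlap = user, overlap
    if best_user is None:
        return set()  # nobody shares any liked item with the target
    # Recommend whatever the closest user likes that the target has not seen.
    return likes[best_user] - target_items


likes = {
    "A": {1, 2, 3, 4, 5},
    "B": {1, 2, 3, 4},
    "C": {7, 8},
}
print(recommend_user_based(likes, "B"))  # {5}
```

User B shares four items with User A and none with User C, so Item 5 is the suggestion.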
Item-based recommendations:
If Users who purchase item 1 are also disproportionately likely to purchase item 2
And User A purchased item 1
Then User A will probably be interested in item 2
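One way to put a number on "disproportionately likely" is lift: how much more often item 2 shows up among buyers of item 1 than among buyers in general. The sketch below is my own illustration rather than what any particular site does. It assumes purchases arrive as a list of sets ("baskets"), and the names `item_lift` and `recommend_item_based` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def item_lift(baskets):
    """Return lift[(i, j)] = P(j | i) / P(j) over a list of purchase sets."""
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        item_count.update(basket)
        pair_count.update(permutations(basket, 2))  # ordered pairs (i, j)
    return {
        (i, j): (count / item_count[i]) / (item_count[j] / n)
        for (i, j), count in pair_count.items()
    }


def recommend_item_based(baskets, purchased, min_lift=1.0):
    """Items whose lift with something already purchased exceeds min_lift."""
    scores = {}
    for (i, j), value in item_lift(baskets).items():
        if i in purchased and j not in purchased and value > min_lift:
            scores[j] = max(scores.get(j, 0.0), value)
    return sorted(scores, key=scores.get, reverse=True)


baskets = [{1, 2}, {1, 2}, {1, 2, 3}, {3, 4}, {4}, {3}]
print(recommend_item_based(baskets, {1}))  # [2]
```

Here everyone who bought item 1 also bought item 2, while only half of all baskets contain item 2, so the lift is 2.0. Item 3 co-occurs with item 1 less often than chance would predict, so it is not recommended.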
And here's a brain dump of algorithms you ought to know:
- Set similarity (Jaccard index & Tanimoto coefficient)
- n-Dimensional Euclidean distance
- k-means algorithm
- Support Vector Machines
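The first two entries are small enough to sketch directly. This is an illustration under simple assumptions: likes are sets of item IDs for the Jaccard index (for plain sets, the Tanimoto coefficient gives the same value), and ratings are equal-length lists of numbers for the Euclidean distance. k-means and support vector machines are better taken from an existing library than written by hand.

```python
import math


def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def euclidean(p, q):
    """Straight-line distance between two points in n dimensions."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


print(jaccard({1, 2, 3, 4, 5}, {1, 2, 3, 4}))  # 0.8
print(euclidean([5, 3, 0], [4, 3, 2]))          # sqrt(5), about 2.236
```

A higher Jaccard index means more similar users, while a smaller Euclidean distance means more similar users, so the two are read in opposite directions.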
There are mainly two types of recommender systems, which work differently:
1. Content-based. These systems make recommendations based on characteristic information. This is information about the items (keywords, categories, etc.) and users (preferences, profiles, etc.).
2. Collaborative filtering. These systems are based on user-item interactions. This is information such as ratings, number of purchases, likes, etc.
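As a rough illustration of the content-based side, the sketch below scores items purely from keywords, with no interaction data at all. It assumes each item and each user profile is described by a set of keywords; the function name `content_based` and the catalogue are invented for the example. A collaborative filter would instead ignore the keywords and look only at who rated or bought what.

```python
def content_based(item_keywords, user_preferences, top_n=3):
    """Rank items by keyword overlap (Jaccard index) with the user's profile."""
    scores = {}
    for item, keywords in item_keywords.items():
        shared = len(keywords & user_preferences)
        if shared:  # skip items with nothing in common with the profile
            scores[item] = shared / len(keywords | user_preferences)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical catalogue and user profile.
items = {
    "film_a": {"sci-fi", "horror", "space"},
    "film_b": {"crime", "thriller"},
    "film_c": {"sci-fi", "space", "drama"},
}
preferences = {"sci-fi", "space"}
print(content_based(items, preferences))  # ['film_a', 'film_c']
```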
This article (published by the company I work at) provides an overview of the two systems and some practical examples, and suggests when it makes sense to implement each.
Of course there are algorithms that will recommend preferred items to you. Different data mining techniques have been implemented for that. If you want more basic details on recommender systems, visit this blog. It covers all the basics you need to know about recommender systems.
GroupLens Research at the University of Minnesota studies recommender systems and generously shares their research and datasets.
Their research expands a bit each year and now considers specifics like online communities, social collaborative filtering, and the UI challenges in presenting complex data.
This is a classification problem - that is, the classification of users into groups of users who are likely to be interested in certain items.
Once a user is classified into such a group, it is easy to examine the purchases/likes of other users in that group and recommend those items.
Therefore, Bayesian classification, neural networks (multilayer perceptrons, radial basis function networks), and support vector machines are worth reading up on.
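The second step, recommending from within a group, can be sketched as below. It assumes the classifier has already run and produced a group label per user (here just a hand-written dictionary, `group_of`); the function name and the data are hypothetical, and the classifier itself is deliberately left out.

```python
from collections import Counter


def recommend_from_group(user, group_of, purchases, top_n=3):
    """Most common purchases among the user's group that the user lacks."""
    group = group_of[user]
    counts = Counter()
    for other, other_group in group_of.items():
        if other != user and other_group == group:
            counts.update(purchases[other] - purchases[user])
    # Most frequently bought first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Group labels would come from a classifier; these are made up.
group_of = {"ann": 0, "bob": 0, "cat": 0, "dan": 1}
purchases = {
    "ann": {"tent"},
    "bob": {"tent", "stove"},
    "cat": {"stove", "lamp"},
    "dan": {"novel"},
}
print(recommend_from_group("ann", group_of, purchases))  # ['stove', 'lamp']
```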
Netflix opened its recommendation algorithm up to competition through the Netflix Prize, in which programmers competed to make gains in the accuracy of the system.
But in the most basic terms, a recommendation system would examine the choices of users who closely match another user's demographic/interest information.
So if you are a white male, 25 years old, from New York City, the recommendation system might try to bring you products purchased by other white males in the northeast United States in the age range of 21-30.
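A bare-bones version of that kind of demographic matching might look like the sketch below. It assumes each user is a dictionary with `age`, `gender`, `region`, and `purchases` fields and that ages are bucketed into bands such as 21-30; all of those field names, the banding rule, and the sample data are my own assumptions.

```python
from collections import Counter


def age_band(age):
    """Bucket ages into decades ending in 0: 11-20, 21-30, 31-40, ..."""
    return (age - 1) // 10


def demographic_recommend(target, others, top_n=3):
    """Items bought most often by users in the target's demographic segment."""
    counts = Counter()
    for person in others:
        same_segment = (
            person["region"] == target["region"]
            and person["gender"] == target["gender"]
            and age_band(person["age"]) == age_band(target["age"])
        )
        if same_segment:
            counts.update(person["purchases"] - target["purchases"])
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up sample data.
target = {"age": 25, "gender": "m", "region": "northeast",
          "purchases": {"headphones"}}
others = [
    {"age": 28, "gender": "m", "region": "northeast",
     "purchases": {"headphones", "sneakers"}},
    {"age": 22, "gender": "m", "region": "northeast",
     "purchases": {"sneakers", "backpack"}},
    {"age": 45, "gender": "m", "region": "northeast",
     "purchases": {"golf clubs"}},
    {"age": 26, "gender": "f", "region": "west",
     "purchases": {"yoga mat"}},
]
print(demographic_recommend(target, others))  # ['sneakers', 'backpack']
```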
Edit: It should also be noted that the more information you have about your users, the more closely you can refine your algorithms to match what other people are doing to what may interest the user in question.