Finding most liked item of a user with rating valu

Posted 2019-09-14 17:52

Question:

Let's assume that a user rates some movies on a scale of 1 to 5. These movies have genre information, and a movie can have more than one genre. For example:

Movie A Rating 4
Action/Sci-Fi

Movie B Rating 5
Comedy/Action

Movie C Rating 4
Comedy/Drama

We want to learn which genres our user likes. Here is our result set:

Genre    Movie_Count  Average_Rating
------------------------------------
Action   2            4.5
Comedy   2            4.5
Sci-Fi   1            4
Drama    1            4
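For reference, a result set of this shape can be derived from the raw ratings with a few lines of Python. This is only a minimal sketch: the function name `genre_stats` and the `(rating, genres)` tuple layout are assumptions made for illustration, not part of the original question.

```python
from collections import defaultdict

def genre_stats(movies):
    """Return {genre: (movie_count, average_rating)} for (rating, genres) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rating, genres in movies:
        # A movie with several genres contributes its rating to each of them.
        for genre in genres:
            totals[genre] += rating
            counts[genre] += 1
    return {g: (counts[g], totals[g] / counts[g]) for g in counts}

# The three movies from the question (hypothetical data layout).
movies = [
    (4, ["Action", "Sci-Fi"]),  # Movie A
    (5, ["Comedy", "Action"]),  # Movie B
    (4, ["Comedy", "Drama"]),   # Movie C
]

stats = genre_stats(movies)
# Sort genres by plain average rating, highest first.
ranked = sorted(stats.items(), key=lambda kv: kv[1][1], reverse=True)
for genre, (count, average) in ranked:
    print(genre, count, average)
```

Note that sorting by the plain average ignores how many movies stand behind each figure.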

Obviously, we cannot predict anything from such a small result set, but let us assume that we have a larger dataset.

Using this data, how can we rank this user's most-liked genres? Is simply calculating a weighted average enough, or is something more complex needed?

Answer 1:

The main problem I see here is:

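The user rates 1000 comedy movies with an average score of 4.

The user rates 10 action movies with an average score of 4.1.

How do you order them? A plain average puts action first, even though that figure rests on far fewer ratings.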

See http://www.evanmiller.org/how-not-to-sort-by-average-rating.html for a discussion and one possible solution: ranking by the lower bound of a Wilson score confidence interval instead of the raw average.
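The linked article works with positive/negative votes rather than 1 to 5 ratings, so applying it here needs an extra step. The sketch below assumes, purely for illustration, that a rating of 4 or 5 counts as "liked"; the function name `wilson_lower_bound` and the like counts (800 of 1000, 9 of 10) are made up and not taken from the answer.

```python
import math

def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a proportion.

    positive: number of "liked" ratings (assumption: rating >= 4)
    total:    number of ratings in the genre
    z:        normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total == 0:
        return 0.0  # no evidence at all
    p = positive / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / (1 + z * z / total)

# Hypothetical counts: many comedy ratings vs. a handful of action ratings.
comedy = wilson_lower_bound(positive=800, total=1000)
action = wilson_lower_bound(positive=9, total=10)
print("comedy", comedy)
print("action", action)
```

With these made-up counts, comedy ends up above action although action has the higher raw share of liked movies, because ten ratings leave a much wider interval.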

Another problem would be:

If a movie is both comedy and action and was given a rating of 4.0, how much of that rating is due to it being a comedy and how much to it being an action movie?

You can solve this using expectation maximization: http://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm
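The answer does not say which model to fit, so the following is only one possible sketch under strong assumptions: each rating is treated as if it were produced by exactly one of the movie's genres (chosen uniformly), and ratings for a genre are Gaussian around that genre's mean with a fixed spread `sigma`. The function name `em_genre_means` and all parameters are hypothetical.

```python
import math
from collections import defaultdict

def em_genre_means(movies, sigma=1.0, iterations=50):
    """Estimate a mean rating per genre with EM.

    movies: list of (rating, [genres]) pairs.
    Assumed model: one latent genre per rating, uniform prior over the
    movie's genres, Gaussian noise with fixed standard deviation sigma.
    """
    # Initialise each genre mean with its plain average.
    totals, counts = defaultdict(float), defaultdict(int)
    for rating, genres in movies:
        for g in genres:
            totals[g] += rating
            counts[g] += 1
    means = {g: totals[g] / counts[g] for g in totals}

    for _ in range(iterations):
        weighted_sum = defaultdict(float)
        weight = defaultdict(float)
        for rating, genres in movies:
            # E-step: how responsible is each genre for this rating?
            likes = [
                math.exp(-((rating - means[g]) ** 2) / (2 * sigma ** 2))
                for g in genres
            ]
            norm = sum(likes)
            for g, like in zip(genres, likes):
                r = like / norm
                weighted_sum[g] += r * rating
                weight[g] += r
        # M-step: responsibility-weighted average rating per genre.
        means = {g: weighted_sum[g] / weight[g] for g in means}
    return means

movies = [
    (4, ["Action", "Sci-Fi"]),
    (5, ["Comedy", "Action"]),
    (4, ["Comedy", "Drama"]),
]
for genre, mean in em_genre_means(movies).items():
    print(genre, round(mean, 3))
```

The responsibilities computed in the E-step are the per-movie answer to "how much was it because of comedy or action" under this model; with only three movies the estimates are not meaningful, and a different model (for example an additive one) would give different attributions.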