Understanding the Pearson Correlation Coefficient

Posted 2019-05-23 12:12

As part of the calculations to generate a Pearson Correlation Coefficient, the following computation is performed:

[Image: the two formulas from the original post; not preserved in this copy]

In the second formula: p_a,i is the predicted rating user a would give item i, n is the number of similar users being compared to, and r_u,i is the rating of item i by user u.

What value will be used if user u has not rated this item? Did I misunderstand anything here?

Tags: recommendation-engine

2 answers

祖国的老花朵

#2 · 2019-05-23 12:23

According to the link, earlier calculations in step 1 of the algorithm are over a set of items, indexed 1 to m, where m is the total number of items in common.

Step 3 of the algorithm specifies: "To find a rating prediction for a particular user for a particular item, first select a number of users with the highest, weighted similarity scores with respect to the current user that have rated on the item in question."

These calculations are performed only on the intersection of the users' sets of rated items. No calculation is performed for a user who has not rated the item.
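To make step 3 concrete, here is a minimal Python sketch of the neighbour selection. The formula image from the question is not available here, so the prediction below assumes a commonly used mean-centred weighted average, which may differ from the linked article's exact formula. The names `predict_rating`, `ratings` and `similarity` are hypothetical, and the similarity scores are assumed to have been computed already.

```python
def predict_rating(active, item, ratings, similarity, n=2):
    """Predict `active`'s rating of `item` from up to n similar users.

    ratings:    {user: {item: rating}}  (hypothetical layout)
    similarity: {user: similarity score with respect to `active`}
    Assumed formula: active user's mean rating plus the similarity-weighted
    average of the neighbours' deviations from their own mean ratings.
    """
    def mean(user_ratings):
        return sum(user_ratings.values()) / len(user_ratings)

    # Only users who have actually rated the item are candidates,
    # so a missing r_u,i never enters the sum.
    candidates = [
        (similarity[u], u)
        for u in ratings
        if u != active and u in similarity and item in ratings[u]
    ]
    neighbours = sorted(candidates, reverse=True)[:n]

    denom = sum(abs(w) for w, _ in neighbours)
    if denom == 0:
        return None  # nobody suitable rated the item: no prediction

    num = sum(w * (ratings[u][item] - mean(ratings[u])) for w, u in neighbours)
    return mean(ratings[active]) + num / denom


ratings = {
    "a":  {"A": 4, "B": 2},
    "u1": {"A": 5, "B": 2, "C": 5},
    "u2": {"A": 4, "B": 2},          # most similar, but has not rated "C"
    "u3": {"A": 2, "B": 4, "C": 1},
}
similarity = {"u1": 0.9, "u2": 1.0, "u3": -0.5}
prediction = predict_rating("a", "C", ratings, similarity, n=1)
```

In this example u2 has the highest similarity but is skipped because it has no rating for item "C", so the single neighbour used is u1.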


淡お忘

#3 · 2019-05-23 12:36

It only makes sense to calculate results if both users have rated a movie. Linear regression can be visualised as finding a straight line through a two-dimensional graph where one variable is plotted on the X axis and the other on the Y axis. Each pair of ratings is represented as a point on a Euclidean plane, [u1_rating, u2_rating]. Since you cannot plot a point that has only one coordinate, you have to discard those cases.
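The idea of discarding incomplete pairs can be sketched in Python as follows. This is an illustration, not the linked article's code: the function name `pearson_on_common_items` and the dictionary layout are assumptions, and the means are taken over the co-rated items only (some variants use each user's overall mean instead).

```python
from math import sqrt


def pearson_on_common_items(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated.

    ratings_a, ratings_b: {item: rating} (hypothetical layout).
    Returns None when the coefficient is undefined.
    """
    # Keep only the points that have both coordinates.
    common = sorted(set(ratings_a) & set(ratings_b))
    m = len(common)
    if m < 2:
        return None  # fewer than two points: no line to fit

    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m

    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # one user gave identical ratings: undefined

    return cov / sqrt(var_x * var_y)


u1 = {"A": 5, "B": 3, "C": 4}
u2 = {"A": 4, "B": 2, "D": 5}
# Items "C" and "D" are ignored because only one user rated each of them.
r = pearson_on_common_items(u1, u2)  # 1.0
```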
