Item-to-item collaborative filtering, how to manag

Posted 2019-09-14 16:46

爷、活的狠高调

Question:

I am working on a recommendation engine, and one problem I am facing right now is that the item similarity matrix is huge.

I calculated the similarity matrix for 20,000 items and stored it as a binary file, which turned out to be nearly 1 GB. I think that is too big.

What is the best way to deal with a similarity matrix when you have that many items?

Any advice is welcome!

Answer 1:

A similarity matrix records how similar each item is to every other item. Each row holds the neighbors of one item (the row id), but you don't need to store all of them; store, for example, only the 20 nearest neighbors per row. Use a sparse matrix such as lil_matrix: from scipy.sparse import lil_matrix
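A minimal sketch of that idea follows. It assumes the similarity is cosine similarity computed from an item feature (or item-by-user rating) matrix, which the question does not specify; the function name top_k_similarity and the features argument are hypothetical. Rows are computed one at a time, so the full dense matrix is never held in memory.

```python
import numpy as np
from scipy.sparse import lil_matrix


def top_k_similarity(features, k=20):
    """Sparse item-item cosine similarity, keeping k neighbors per item.

    `features` is an assumed (n_items, n_features) dense array; the
    similarity measure (cosine) is also an assumption for illustration.
    """
    features = np.asarray(features, dtype=np.float64)
    n_items = features.shape[0]
    if not 0 < k < n_items:
        raise ValueError("k must be between 1 and n_items - 1")

    # Normalize rows so a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty items
    unit = features / norms

    sim = lil_matrix((n_items, n_items), dtype=np.float32)
    for i in range(n_items):
        row = unit @ unit[i]      # similarity of item i to every item
        row[i] = -np.inf          # never pick the item as its own neighbor
        top = np.argpartition(row, -k)[-k:]  # indices of the k largest values
        sim[i, top] = row[top]    # store only those k entries

    # CSR is more compact and faster for row lookups once building is done.
    return sim.tocsr()
```

With 20,000 items and k=20 this stores at most 400,000 values instead of 400,000,000. The result can be written to disk with scipy.sparse.save_npz and read back with scipy.sparse.load_npz, and the neighbors of item i are available as sim[i].indices with their scores in sim[i].data.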

Tags: recommendation-engine