Apache Spark ALS - how to perform Live Recommendat

Posted 2020-07-10 09:13

I am using the ALS implementation in Apache Spark MLlib (through the PySpark API) to develop a service that performs live recommendations for anonymous users (users not in the training set) on my site. In my use case I train the model on the user ratings like this:

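from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating

# df is assumed to be a DataFrame of (user, item, rating) rows. DataFrame.map
# was removed in Spark 2.0, so go through the underlying RDD.
ratings = df.rdd.map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))
rank = 10           # number of latent factors
numIterations = 10  # number of ALS iterations
model = ALS.trainImplicit(ratings, rank, numIterations)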

Now, each time an anonymous user selects an item in the catalogue, I want to fold that user's vector into the ALS model and get recommendations (just like the recommendProducts() call), without retraining the whole model.

Is there an easy way to fold in a new anonymous user's vector after the ALS model has been trained in Apache Spark?

Thanks in advance

1 answer
看我几分像从前
#2 · 2020-07-10 10:05

I have seen a few open-source "model server" solutions advertised, but have no hands-on experience with any of them. I have also heard of a commercial offering, but can't remember its name right now.
So form your own opinion, and keep an eye out for possible alternatives.

PredictionIO (the start-up was acquired by Salesforce, but their solution is still available) uses a Spark+Hadoop+HBase stack, plus some kind of web server component.

MLeap is yet-another-ML-library-with-limited-feature-set, which can be plugged into Spark/scikit-learn/whatever, and can spawn a web service -- or export your model to a hosted solution named Combust.ml.

MLDB is yet-another-ML-library-with-limited-feature-set, completely outside of the Python/Spark ecosystem, but claims full integration with TensorFlow -- including the ability to import existing Deep Learning models and tweak them for different uses.
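As for the fold-in itself, independent of any model server: one common approach is to keep the trained item factors fixed and solve a single regularized least-squares problem for the new user, which is the per-user step of implicit-feedback ALS. The following is a minimal sketch of that idea in plain Python, not Spark's own code path. The function names (solve, fold_in_user, recommend) and the toy item factors are made up for illustration, and the reg and alpha values are assumed to match the ones used for training.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fold_in_user(item_factors, selected, reg=0.01, alpha=0.01):
    """Estimate a user vector from fixed item factors (hypothetical helper).

    item_factors: dict item_id -> list of floats (length = rank)
    selected:     dict item_id -> implicit strength (e.g. 1.0 per click)
    """
    k = len(next(iter(item_factors.values())))
    # Start from Y^T Y + reg * I: every item counts as "not preferred"
    # with confidence 1 until the user interacts with it.
    a = [[sum(y[i] * y[j] for y in item_factors.values())
          + (reg if i == j else 0.0) for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item, strength in selected.items():
        y = item_factors[item]
        c = 1.0 + alpha * strength  # confidence for an observed item
        for i in range(k):
            b[i] += c * y[i]
            for j in range(k):
                a[i][j] += (c - 1.0) * y[i] * y[j]
    return solve(a, b)


def recommend(item_factors, user_vec, seen, n=3):
    """Rank unseen items by dot product with the folded-in user vector."""
    scores = {item: sum(u * v for u, v in zip(user_vec, y))
              for item, y in item_factors.items() if item not in seen}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy rank-2 factors: items 1 and 2 are similar, items 3 and 4 are similar.
toy_factors = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0], 4: [0.1, 0.9]}
user_vec = fold_in_user(toy_factors, {1: 1.0})  # anonymous user clicked item 1
top = recommend(toy_factors, user_vec, seen={1}, n=2)
print(top)
```

With a trained MLlib model, the item factors could be pulled to the driver once with dict(model.productFeatures().collect()) and reused for every anonymous user, provided the catalogue fits in memory. Because Spark may scale the regularization term internally, the folded-in vector should be treated as an approximation of what retraining would produce, not an exact match.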
