User matching algorithm

In this problem, users are matched with other online users, but it is not a simple one-to-one match. Each user is shown a selection of 5 other users to choose from. Those 5 are then marked as seen and should not be shown again when the user requests another 5. More people can come online during the process.

The problem is that I want a way for each user to be shown in the selection for other users. I'd like to use Redis, but an algorithm is mostly what I'm looking for. I'm trying to implement this in the fastest way possible, using Redis if possible, but I can also make calls to the database if needed.

My current solution is as follows; hopefully someone will have some tips to improve on its O(N) calls.

Each user needs a seen set of user_ids. We keep a Redis list (queue) of online users. We keep popping users from the left until we find one that isn't in the user's seen set, save it, add it to the user's seen set, then push it on the right. Once we have 5 of those, we left-push back the ones we popped off that were already seen.
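A minimal sketch of the loop described above, using an in-memory `deque` and `set` as stand-ins for the Redis list and the seen set (with Redis, the pops and pushes would roughly correspond to LPOP, RPUSH and LPUSH, and the membership checks to SISMEMBER and SADD). The function name `next_candidates` and its parameters are made up for illustration. One small deviation: picked users are pushed to the right only after the loop, so they cannot be popped again during the same request.

```python
from collections import deque


def next_candidates(online, seen, user_id, k=5):
    """Pop from the left of `online` until k unseen users are found.

    online  -- deque of online user ids (stand-in for the Redis list)
    seen    -- set of ids already shown to `user_id` (updated in place)
    """
    skipped = []  # popped users that were already seen (or the requester)
    picked = []
    while online and len(picked) < k:
        candidate = online.popleft()
        if candidate == user_id or candidate in seen:
            skipped.append(candidate)
        else:
            picked.append(candidate)
            seen.add(candidate)
    # Picked users go to the back of the queue.
    online.extend(picked)
    # Skipped users return to the front in their original order.
    online.extendleft(reversed(skipped))
    return picked


online = deque([1, 2, 3, 4, 5, 6, 7, 8])
seen = {2, 4}
print(next_candidates(online, seen, user_id=1))  # [3, 5, 6, 7, 8]
print(list(online))                              # [1, 2, 4, 3, 5, 6, 7, 8]
```

The second request for the same user shows the O(N) worst case: every online user is popped, found in the seen set, and pushed back.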

This is the best I could think of; however, it is O(N) each time we want to find 5 users for one user to select from. It's possible (though not likely) that the user has seen a huge number of users and ends up popping off the whole list.

To help understand this better: a naive approach is to have every single user hold a copy of all online users in the form of a set, so that we simply pop 5 random set members. But this can't work because there's not enough space, and each time a user comes online they'd have to be added to every user's set of online users, or deleted from it when they go offline. Those operations are O(N), since an O(1) operation is done for each of N users.

Does anyone have any tips to match users with other users?

Tags: algorithm redis

1 Answer

爱情/是我丢掉的垃圾

#2 · 2019-01-28 16:39

It would be good to know what kind of data we are talking about. How many users exist? How many will be online on average? What is the ratio of seen users to all users (sparse vs. dense)?

Modification of your algorithm

Don't pop the first element; choose a random element from the set of online users instead. This should improve balancing and may help with amortized complexity, depending on the ratio between the seen set and the online set!
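A rough sketch of that random-pick variant, assuming the online users are available as a Python list and the seen users as a set (in Redis, SRANDMEMBER and SISMEMBER would play these roles). The name `random_candidates` and the `max_attempts` cap are illustrative choices, not part of the original suggestion; the cap keeps the loop bounded when the user has already seen almost everyone.

```python
import random


def random_candidates(online, seen, user_id, k=5, max_attempts=50, rng=random):
    """Draw random online users, rejecting the requester and seen users.

    online -- list of online user ids
    seen   -- set of ids already shown to `user_id` (updated in place)
    """
    picked = []
    if not online:
        return picked
    for _ in range(max_attempts):
        if len(picked) == k:
            break
        candidate = rng.choice(online)
        if candidate != user_id and candidate not in seen:
            picked.append(candidate)
            seen.add(candidate)  # also prevents picking the same user twice
    return picked
```

When the seen set is sparse relative to the online set, most draws succeed and a request costs close to k lookups. When it is dense, most draws are rejected and the function may return fewer than k users.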

Alternative algorithm (more structured; still a bad worst case; should be good if seen is sparse)

Keep seen as a balanced tree (O(log n) insertion)
Keep online as a balanced tree.
While not enough users are chosen:
- Search for the first gap in seen (e.g. [0,1,3,7] -> 2; O(log n) according to SO-link)
- Search for the first user >= gap-value (O(log n))
- If user < next_gap_neighbor (in the example above: 3, the next value after the picked gap 2)
  - -> pick
- Else
  - -> add the chosen gap-value temporarily to seen (for this moment; how often to update online is a modeling decision) OR limit the search somehow to > chosen gap-value (O(log n))

Depending on the data, this should work very well if the data is huge and seen is sparse!
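One possible reading of the gap search, sketched under several assumptions: user ids are non-negative integers, sorted Python lists stand in for the balanced trees, and the requester's own id is already in seen. The names `run_end` and `pick_unseen` are hypothetical. Instead of adding each picked id to seen and searching again, the sketch takes every online user inside the current gap and then moves on to the next gap, which yields the same picks.

```python
from bisect import bisect_left


def run_end(seen, i):
    """Binary-search the end of the run of consecutive ids starting at seen[i].

    Returns (gap, j): the first id missing after the run, and the index of
    the next seen id above that gap (len(seen) if there is none).
    """
    lo, hi = i, len(seen)
    while lo < hi:
        mid = (lo + hi) // 2
        if seen[mid] - seen[i] == mid - i:  # still inside the run
            lo = mid + 1
        else:
            hi = mid
    return seen[i] + (lo - i), lo


def pick_unseen(online, seen, k=5):
    """Pick up to k ids from sorted `online` that fall into gaps of sorted `seen`."""
    picked = []
    # First gap: the lowest non-negative id that is not in seen.
    if seen and seen[0] == 0:
        gap, j = run_end(seen, 0)
    else:
        gap, j = 0, 0
    while len(picked) < k:
        upper = seen[j] if j < len(seen) else None  # next seen id above the gap
        pos = bisect_left(online, gap)              # first online user >= gap
        while (pos < len(online) and len(picked) < k
               and (upper is None or online[pos] < upper)):
            picked.append(online[pos])
            pos += 1
        if upper is None or pos >= len(online):
            break  # no further gaps, or no online users left to check
        gap, j = run_end(seen, j)
    return picked


print(pick_unseen([1, 2, 3, 5, 6, 9, 10], [0, 1, 3, 7]))  # [2, 5, 6, 9, 10]
```

The caller would still need to insert the picked ids into seen afterwards. With a plain list that insertion is O(n), so a real balanced tree or sorted container is needed to get the O(log n) bound. The worst case remains poor when many gaps contain no online user.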
