Users of my application (it's a game actually) answer questions to get points. Questions are supplied by other users. Due to volume, I cannot check everything myself, so I decided to crowd-source the filtering process to the users (players). The rules are simple:
- each user is shown a question to rate as good/bad/unsure
- when a question is rated "bad" 5 times, it is removed from the pool
- when a question is rated "good" 5 times, it is removed from the pool and flagged to be played by other players who have not seen it
If everyone could see everything, this would be easy. However, later in the game phase, users shouldn't get questions they have already seen. This means that users should not rate all the questions: the only questions they can play (answer) later in the game are the ones they have not seen while rating.
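To make the rules concrete, here is a minimal sketch of the bookkeeping they imply. The names (`Question`, `rate`, `playable_for`) are made up for illustration, and it assumes that a user who rates a question "unsure" still counts as having seen it, which the rules above do not state explicitly.

```python
from dataclasses import dataclass, field

VOTES_NEEDED = 5  # threshold taken from the rules above


@dataclass
class Question:
    qid: int
    good: int = 0
    bad: int = 0
    # Users who were shown the question while rating it (assumption:
    # an "unsure" rating also marks the question as seen).
    seen_by: set = field(default_factory=set)

    @property
    def status(self) -> str:
        if self.bad >= VOTES_NEEDED:
            return "rejected"
        if self.good >= VOTES_NEEDED:
            return "approved"
        return "pending"


def rate(question: Question, user_id: int, verdict: str) -> str:
    """Record one rating ('good', 'bad' or 'unsure') and return the new status."""
    # Ignore ratings on questions that already left the pool, and repeat ratings.
    if question.status != "pending" or user_id in question.seen_by:
        return question.status
    question.seen_by.add(user_id)
    if verdict == "good":
        question.good += 1
    elif verdict == "bad":
        question.bad += 1
    return question.status


def playable_for(question: Question, user_id: int) -> bool:
    """A question can be played only if it was approved and the user never saw it."""
    return question.status == "approved" and user_id not in question.seen_by
```

Every rating, whatever the verdict, removes one potential player for that question, which is why the choice of who gets shown what matters.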
The total number of questions is much larger than the number of players; new questions are added daily and new players arrive all the time, so I cannot just distribute the questions in advance.
I'm looking for an algorithm that would maximize the number of rated, playable (i.e. unseen) questions for all players.
I tried to google, but I'm not even sure which terms to put in the search box, and using stuff like "distribution", "voting", "collaborative filtering" gives very interesting but unusable results.
The ratio of good to bad questions is 1:3, i.e. 25% of questions are rated good. The number of already submitted, unrated questions is over 10000. The number of active users with the privilege to vote is around 150.
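As a rough back-of-the-envelope check on these numbers, the sketch below computes a best case. It assumes every vote is decisive (no "unsure" votes and no disagreement, so each question needs exactly 5 votes); the real workload will be higher. The function name `best_case` is just for illustration.

```python
def best_case(questions: int, voters: int, votes_needed: int, good_share: float):
    """Best-case workload, assuming each question is settled by exactly
    `votes_needed` unanimous votes (an optimistic simplification)."""
    total_votes = questions * votes_needed
    votes_per_voter = total_votes / voters
    approved = questions * good_share
    # Each approved question was seen by `votes_needed` of the voters,
    # so on average a voter can still play this many approved questions:
    playable_per_voter = approved * (voters - votes_needed) / voters
    return total_votes, votes_per_voter, approved, playable_per_voter


total, per_voter, approved, playable = best_case(10_000, 150, 5, 0.25)
print(f"votes needed in total:       {total}")
print(f"votes per voter:             {per_voter:.1f}")
print(f"approved questions:          {approved:.0f}")
print(f"avg. playable per voter:     {playable:.1f}")
```

Under those assumptions that is about 50000 votes in total, roughly 333 per voter, and on average around 2417 of the 2500 approved questions remain playable for each voter.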
I'm currently considering splitting the question pool and the user base into 2 parts. One part of the user base would check the questions for the other part and vice versa. Splitting the questions is easy (odd vs even, for example). However, I'm still not sure how to divide the user base. I thought about using the odd/even position in the "top question checkers" list, but the positions on that list change daily as new questions are checked.
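The two-way split can be sketched as follows. To sidestep the shifting list positions, this sketch assumes each user has a stable numeric ID (for example a registration number) and uses its parity instead; that is only one possibility, and the helper names are hypothetical.

```python
def group_of(numeric_id: int) -> int:
    """Assign an ID to group 0 or 1 by parity (odd vs even)."""
    return numeric_id % 2


def may_rate(user_id: int, question_id: int) -> bool:
    """A user rates only questions from the OTHER group."""
    return group_of(user_id) != group_of(question_id)


def may_play(user_id: int, question_id: int) -> bool:
    """A user plays only questions from their OWN group,
    which they were never shown while rating."""
    return group_of(user_id) == group_of(question_id)
```

Because an ID never changes, a user's group stays fixed. The downside is that the two halves are not balanced by activity, so one half of the questions may get rated more slowly than the other.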
Update: I just asked a sequel to this question - I need to periodically remove a fixed number of questions from the pool.
I'm not aware of a specific, well-known algorithm for this. However, this would be my line of thinking:
So my approach to the final algorithm would be:
[The above can be further tweaked, considering the fact that when a user first registers, there will already be a pool of approved but unseen questions on the site]
Of course you can heavily influence the behaviour of users by giving incentives for meritorious activity (badges and reputation points on SO are a self-explanatory example).
EDIT/ADDENDUM: The discussion in the comments clarifies that the GR is fixed, and it is one question per day. Furthermore, the OP states that there will be at least 1 new approved question in the system every 24 hours. This means that it is possible to simplify the above algorithm into one of two forms:
If the user can vote only AFTER he has answered his daily question:
If the user can vote even BEFORE answering his daily question:
This ensures that if a user votes on all votable questions in the system and then answers his daily one at 23:59, there will still be a question available to answer at 00:00, plus 24 hours for the system to acquire a new question for the following day.
HTH!