Algorithm for picking thumbed-up items

2020-07-27 18:51发布

问题:

I have a Pandora-like piece of software where the user can thumb up a song or thumb down a song. The software, called Chavah, is Silverlight + C#, but this is a language- and platform-neutral question.

My software needs to choose a song based on the user's preferences. I need a good algorithm for this.

I want my software to choose a song to play given the following requirements:

  1. Thumbed up songs should be favored, and played regularly.

  2. Unrated songs (neither thumbed up nor down) should still be played; after all, the user may only have 2 total thumbed up songs.

  3. Thumbed down songs should be played rarely.

  4. Whatever the algorithm, songs should not repeat often.

Given these design decisions, is there some good algorithm here?

I have some code that grabs all songs, liked songs, and disliked songs:

var allSongs = ...
var likedSongs = allSongs.Where(s => s.LikedByUser(...));
var dislikedSongs = allSongs.Where(s => s.DislikedByUser(...));

Any simple ideas for picking a good song for the user?

回答1:

You can weight songs; say every song has a score of 1.0 in the beginning, 0.5 if thumbed down and 1.5 if thumbed up. Then you pick a random element from the set of all scores, with its probability however weighed by its, um, weight. The fast and dirty approach I'd think of here would be to sum over all weights. Pick a random number smaller than that sum. Loop over all songs until CurrentWeight+SongWeight > RandomNumber (else CurrentWeight+=SongWeight)

Of course you can make this arbitrarily more complex by introducing collaborative filtering :)

Imagine five songs. First two thumbed up, next one thumbed down, two neutral.

{1: 1.5, 2: 1.5, 3:0.5, 4: 1, 5:1}

The sum of this is 5.5. Now we pick a random number < 5.5 and see: It's 2.43789

Now let's find the song this random number belongs to.

Start with CurrentWeight = 0. First song's weight = 1.5. CurrentWeight+1.5 < 2.43789 -> we continue, but increase CurrentWeight by this song's weight.

So CurrentWeight = 1.5 now. Next song's weight: again 1.5. But now, CurrentWeight+1.5 == 3 > 2.43789. This means we chose the second song!

What you do here is to basically pick a random spot on a line, but increase the "territory" on that line that would choose a song if the song is thumbed up.

Whether this creates many repetitions or not basically depends on how strongly you increase/decrease the songs' weights.



回答2:

i assume that Pandora uses a tagging system to categorize their songs. In this way, you devlop a profile of a user's musical tastes by sampling the highest frequency of tags "thumbed up". Unfortunately for you, they draw from a vast database of tagged music to offer their users, which is probably harder to develop than the algorithm to pick them.



回答3:

You could also create a rating system for the band/singer and genre, and based on each "thumb up", you could add 1 (or whatever value you decide) to both the band/singer and the genre. Then, when it's time to decide the next song to play, you can weight the decision based on said user's interest in a band/singer and/or genre. And the reverse for "thumbed down" songs.



回答4:

Maybe something like this:

1) Create three streams (or playlists) of randomly-ordered songs like your code snippet, but make them mutually-exclusive: Liked, Not Liked, Not Rated. Based on the size of these groups, they'll each have a duration before they repeat.

2) Create a function that merges the streams into a single stream, while obeying certain rules. The main rule would be the desired mix of the streams, say 60% liked, 38% not rated, 2% disliked, or whatever. You'd have to deviated from the desired mix only when the duration of the Liked stream is too short to follow the 'minimum time between song repetition' rule.

3) You probably also want to periodically re-calculate all of the streams. The tricky part here might be to avoid repetitions caused by the recalculation. Maybe your random sort in (1) could also take into account some kind of weight or offset attached to recently-played songs.