recommender bias

Introduction : recommendations

The SoundSuggest application uses data and tries to establish a context for recommendations based on the active user’s neighbourhood and the active user’s top rated items. Although I could not find an official source for the recommender system’s algorithm, I assume it is based on some variant of collaborative filtering [1, 2].

The objective of collaborative filtering is to find a community of sorts, i.e., the neighbourhood of the active user, based on similarities between user profiles [3]. It should come as no surprise that certain items will be strongly linked because of geographic similarities between profiles: users living in the same region are likely to have heard of local artists, while people living in other countries are not. Belgium has a relatively small music scene compared to the United States or the United Kingdom. As a result, Belgian artists seem to cluster together based on geography rather than on musical similarity. This becomes apparent when looking at the similar artists for certain Belgian musicians, as can be seen in Figure 1.


Figure 1 : recommendations biased by regional factors
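To make the neighbourhood idea concrete, here is a minimal sketch of user-based collaborative filtering. It is only an illustration under my own assumptions, not the algorithm behind the recommendations discussed here: the profiles are invented, the user names are hypothetical, and Jaccard similarity over sets of artists is simply one common choice of similarity measure.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of artists two profiles have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbourhood(
    active: str, profiles: Dict[str, Set[str]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k profiles most similar to the active user's profile."""
    scores = [
        (user, jaccard(profiles[active], artists))
        for user, artists in profiles.items()
        if user != active
    ]
    # Highest similarity first; ties are broken by user name for stable output.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


# Hypothetical profiles: "an" and "bart" share two Belgian artists.
profiles = {
    "an": {"Goose", "Das Pop", "Radiohead"},
    "bart": {"Goose", "Das Pop", "Muse"},
    "carol": {"Radiohead", "Muse", "Coldplay"},
}

top = neighbourhood("an", profiles)
```

In this toy data the profile that shares the local artists ends up as the closest neighbour, which is the kind of regional clustering described above.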

Compared with a content-based approach, collaborative filtering produces more serendipitous suggestions. Still, one could argue that user location introduces a bias into the system. In terms of musical style, a band such as Goose is quite different from a band such as Das Pop. So it is odd that the similar artists for Das Pop are in fact all Belgian (see Figure 2), when there are probably many other artists, not necessarily Belgian, that fit their style of music much better.

Whether this serendipity improves or reduces the quality of recommendations is of course up to the end user. Nonetheless, it indicates some of the limits, as well as the strengths, of collaborative filtering, and of the algorithm in particular.


Figure 2 : Top similar artists for Das Pop on

Implications for the SoundSuggest application

Building the data structure used in the SoundSuggest application, as explained here, exposes a mismatch between the neighbourhood and some of the recommendations. If the profile contains both well-known artists, as well as artists that are strongly geographically linked, it may occur that the neighbourhood is largely based on the well-known artists. As a result, the connectivity between neighbours and this second group of artists is very likely to be low or non-existent. This phenomenon can be seen in Figure 3, where none of the Belgian bands appear in any of the neighbouring profiles.
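The connectivity mentioned above can be sketched as a simple count: for each recommended artist, how many neighbouring profiles contain it. This is a rough illustration rather than the actual SoundSuggest data structure; the function name `connectivity`, the neighbour ids, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set


def connectivity(
    recommendations: List[str], neighbours: Dict[str, Set[str]]
) -> Dict[str, int]:
    """Count, per recommended artist, how many neighbour profiles contain it."""
    return {
        artist: sum(artist in artists for artists in neighbours.values())
        for artist in recommendations
    }


# Hypothetical data: the neighbours were matched on well-known artists only.
neighbours = {
    "n1": {"Radiohead", "Muse", "Arcade Fire"},
    "n2": {"Radiohead", "Arcade Fire"},
    "n3": {"Muse", "Coldplay"},
}
recommendations = ["Arcade Fire", "Coldplay", "Goose", "Das Pop"]

links = connectivity(recommendations, neighbours)
```

With this sample data the two Belgian bands end up with zero links, so in a graph visualization they would float free of the neighbourhood.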

A user of the application may conclude that these suggestions are poor in quality. In my opinion, this is often the case. I’m much more interested in artists that are similar to bands I listen to a lot, rather than just local bands just because they happen to be local. To solve this, one option would be to discard the artists with low connectivity in the graph and search for additional recommendations that hopefully provide better connectivity.


Figure 3 : Problems with connectedness within the visualization of recommendations and neighbourhood
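One way to picture the "discard and search again" option is the sketch below. It is only a possible approach under stated assumptions: the threshold `min_links`, the helper names, and the list of extra candidates are hypothetical, and where additional candidates would actually come from is left open.

```python
from typing import Dict, Iterable, List, Set


def count_links(artist: str, neighbours: Dict[str, Set[str]]) -> int:
    """Number of neighbour profiles that contain the artist."""
    return sum(artist in artists for artists in neighbours.values())


def prune_and_refill(
    recommendations: List[str],
    extra_candidates: Iterable[str],
    neighbours: Dict[str, Set[str]],
    min_links: int = 1,
) -> List[str]:
    """Drop poorly connected artists, then top up from extra candidates."""
    kept = [a for a in recommendations if count_links(a, neighbours) >= min_links]
    missing = len(recommendations) - len(kept)
    for candidate in extra_candidates:
        if missing == 0:
            break
        # Only accept replacements that are themselves connected.
        if candidate not in kept and count_links(candidate, neighbours) >= min_links:
            kept.append(candidate)
            missing -= 1
    return kept


# Hypothetical data; "Local Band X" stands in for any unconnected artist.
neighbours = {
    "n1": {"Radiohead", "Muse", "Arcade Fire"},
    "n2": {"Radiohead", "Arcade Fire"},
    "n3": {"Muse", "Coldplay"},
}
recommendations = ["Arcade Fire", "Goose", "Das Pop"]
extra_candidates = ["Coldplay", "Local Band X", "Muse"]

result = prune_and_refill(recommendations, extra_candidates, neighbours)
```

The list keeps its original length when enough connected candidates exist, and simply shrinks otherwise. Whether dropping the unconnected artists is desirable at all depends on how much the user values the serendipity discussed earlier.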

Final thoughts

In further developing the application, it would be interesting to find out what impact non-connected artists have on the user experience of the visualization.


[1] Wikipedia, – Wikipedia, the free encyclopedia, 21 March 2013, [Online] Available at: [Accessed on 7 April 2013]

[2], 2013, [Online] Available at: [Accessed on 7 April 2013]

[3] Rajaraman A., Leskovec J. and Ullman J. D., 2012, Mining of Massive Datasets


3 responses to “ recommender bias

  1. Could you explain what “neighbouring” means exactly? I’m a bit confused. You say that:

    “If the profile contains both well-known artists, as well as artists that are strongly geographically linked, it may occur that the neighbourhood is largely based on the well-known artists.”

    I’m trying to understand the specific problem. Are you saying that if user A has both well-known bands and local bands in his profile, then recommendations for A will almost always be based on his well-known bands? Later on you say:

    “…rather than just local bands just because they happen to be local.”

    This implies local bands are recommended “too much”. So which one is it?

    To me that sentence implies local bands are

  2. There is a difference between the neighbourhood and the recommendations. Of course I do not know the exact algorithm used by, but I assume that the “top” neighbours, i.e., the ones you get from the API method “getNeighbours”, are largely based on the “top” artists in a profile.
    If the top artists are very well-known, and as a result appear in a large percentage of the profiles, this neighbourhood will likely span many nationalities. Therefore, recommendations for local artists will very likely not be linked to any of these profiles. This is also what Figures 1 to 3 suggest.
