Revamping the similar artists formula, Part II


Jan 2 2012, 5h27

Last Saturday, I discussed the issues with the current similar artist formula...especially Metallica being top 10 in similarity with The Offspring (like what?).

Well, since then, I put my programming skills to work and built a little program to see if it would work with other artists. Several bands/artists saw their similar artists undergo a makeover. However, it did lead to new problems as I tried testing it out on super-mainstream acts, including our favorite band...Radiohead.

Rage Against the Machine showed up as a similar artist.

...well, apparently. Radiohead's top 250 similar artists currently are all over the place in random genres. A bigger batch of similar artists would be needed to get the best results, but the question is: is my new formula better? I'm going to show you some examples of artists' top 20 similar artists...before, and after. You guys make the decision.

Oh, you don't remember the formula? Very well...time for a quick summary:
*First, grab all 250 similar artists for Artist A. Obtain them in the given order.
*Next, for each similar artist, starting with the one with the highest original rank, look at its first 3 tags and count how many of them match the tags of Artist A. Re-sort based on this. You should end up with 4 groups, starting from the top (matches all 3 tags, matches 2 tags, matches 1 tag, matches no tags).
*Next, break the ties by re-sorting each group based on matches among the first 4 tags (which will spawn even more groups).
*Then finally, re-sort again, this time on the top 5 tags. If there are still ties, break the ties based on their original rank.
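
The steps above boil down to a single sort with a compound key. Here is a rough Python sketch of that idea, not necessarily how the actual program does it. The function names `tag_overlap` and `rerank`, the band names, and the tag lists are all made up for illustration, and a "match" is assumed to mean the exact same tag appearing in the first k tags of both artists.

```python
def tag_overlap(tags_a, tags_b, k):
    """Count tags shared by the first k tags of each artist (assumed: exact match)."""
    return len(set(tags_a[:k]) & set(tags_b[:k]))


def rerank(artist_tags, similar):
    """Re-sort `similar`, a list of (name, tags) pairs given in original rank order."""
    def sort_key(entry):
        original_rank, (_name, tags) = entry
        return (
            -tag_overlap(artist_tags, tags, 3),  # most top-3 matches first
            -tag_overlap(artist_tags, tags, 4),  # ties broken on top-4 matches
            -tag_overlap(artist_tags, tags, 5),  # then on top-5 matches
            original_rank,                       # finally on the original order
        )

    ranked = sorted(enumerate(similar), key=sort_key)
    return [name for _rank, (name, _tags) in ranked]


# Made-up data: Artist A's top tags and four similar artists in original order.
artist_a_tags = ["punk rock", "punk", "melodic hardcore", "hardcore", "rock"]
similar = [
    ("Band W", ["rock", "hard rock", "alternative rock", "metal", "pop"]),
    ("Band X", ["punk rock", "punk", "skate punk", "melodic hardcore", "hardcore"]),
    ("Band Y", ["punk rock", "melodic hardcore", "punk", "hardcore", "rock"]),
    ("Band Z", ["punk", "punk rock", "emo", "pop punk", "hardcore"]),
]

print(rerank(artist_a_tags, similar))
# ['Band Y', 'Band X', 'Band Z', 'Band W']
```

Band Y matches all 3 top tags, so it jumps to the top. Band X and Band Z both match 2, and the top-4 pass puts Band X ahead. Band W matches nothing in the top 3 and drops to the bottom.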

It's a lot of technical and algorithm stuff that is complicated for the average user, so I won't go that deep. See the first journal entry for a worked example with The Offspring (though I work a little backwards here).

Also, I have banned several tags. Here are some of them:
*Tags that are the same as the artist - not all have been banned, but the common ones have been (ex: the band-name tag for Red Hot Chili Peppers). These tags simply group together projects involving members of a band...but really, those projects may not have the same style.
*Tags referring to geographical locations. Where an artist comes from says nothing about how they sound. I haven't gotten into geo-genre tags much yet, as they may require research before banning (in some countries a genre name may refer to a specific style).
*Decade tags. Similar artists do not have to reside in the same decade. If there's a Led Zeppelin clone playing the exact same style of music today that Zep played in the 60s and 70s, then perhaps they should be similar.
*Nonsense tags. You pretty much know them...they don't reflect a genre or style.
*The tag. To be honest, this tag was holding a lot of stuff back, especially artists that are also alternative rock. This tag is also abused to hell, so I dropped it. Still no decision on the tag...I'll have to run a few more tests through different artists to make that decision.
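
The banning rules above could be applied as a filter that runs before any tag matching. The Python sketch below is only an illustration under a few assumptions: the function name `clean_tags` is made up, decade tags are assumed to look like "80s" or "1970s", and location or nonsense tags are passed in by hand through `banned` rather than taken from any real list. The sample tag list and its order are invented too.

```python
import re

# Assumed pattern for decade tags such as "80s" or "1970s".
DECADE_TAG = re.compile(r"(?:\d{2}|\d{4})s")


def clean_tags(tags, artist_name, banned=frozenset()):
    """Return the tags in their original order with banned ones dropped."""
    artist = artist_name.strip().lower()
    kept = []
    for tag in tags:
        normalized = tag.strip().lower()
        if normalized == artist:              # tag is just the artist's name
            continue
        if DECADE_TAG.fullmatch(normalized):  # decade tag
            continue
        if normalized in banned:              # hand-picked location/nonsense tags
            continue
        kept.append(normalized)
    return kept


# Made-up tag list for illustration only.
raw = ["hip-hop", "Kanye West", "rap", "00s", "gay fish", "rnb"]
print(clean_tags(raw, "Kanye West", banned={"gay fish"}))
# ['hip-hop', 'rap', 'rnb']
```

Because the filter keeps the original order, whatever tag sat just below a banned one moves up a slot, which changes which tags count as the "first 3".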

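So there you go...the banned tags. I'll go through several examples...these are the top 20 artists, before and after. For each artist, the first list is the current one and the numbered list is what my formula came up with. First up, Radiohead.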


Thom Yorke
Johnny Greenwood
Sigur Rós

Arcade Fire

Grizzly Bear
The National

Joy Division
Massive Attack
Pink Floyd
The Horrors

#1 Placebo
#2 Pixies
#3 R.E.M.
#4 The White Stripes
#5 Incubus
#6 Weezer
#7 Blur
#8 Coldplay
#9 Kasabian
#10 Arctic Monkeys

#11 Oasis
#12 The Verve
#13 The Last Shadow Puppets
#14 Suede
#15 Primal Scream
#16 Manic Street Preachers
#17 Muse
#18 Arcade Fire
#19 The National
#20 The Strokes

I'm not sure about either list. Radiohead seems to really have only one similar artist...themselves. But which list is closer?

Let's try a punk rock band other than The Offspring. How about Rise Against?


Sum 41
Billy Talent
A Day to Remember

Strike Anywhere
The Offspring
Breaking Benjamin
Three Days Grace
Bullet for My Valentine

Story of the Year
Papa Roach
Bad Religion

3 Doors Down

Ugh...I could just puke to hell if I'm told most of these bands are similar...

#1 Strung Out
#2 Nations Afire
#3 A Wilhelm Scream
#4 Last of the Believers
#5 Good Riddance

#6 Only Crime
#7 88 Fingers Louie
#8 Satanic Surfers
#9 Pulley
#10 Much the Same

#11 Strike Anywhere
#12 Pennywise
#13 Bad Religion
#14 Propagandhi
#15 No Fun at All

#16 H2O
#17 Polar Bear Club
#18 AFI
#19 No Use for a Name
#20 Lagwagon

Not only has every single non-punk rock band been removed from the current list, but even mainstream punk rock bands like Sum 41 and The Offspring are also gone. The result? More melodic hardcore bands in the top 20. Strung Out is now the most similar. Sum 41 and The Offspring show up between #21-30. And Nickelback? A long way down at #139. This is probably the main reason why the tag was not removed.

So, what about metal? How bout some Metallica?

Iron Maiden

Black Sabbath
System of a Down
Guns N' Roses

Lou Reed & Metallica
Machine Head
Ozzy Osbourne

Avenged Sevenfold
The Offspring

WHAT THE FUCK IS THE OFFSPRING DOING HERE? Literally. Well, for the most part, it looks OK with Megadeth, Slayer, and Anthrax on top, but...AC/DC, GNR...Metallica sure may have been influenced by them, but would you really call them similar to Metallica? Would anyone even consider Slipknot or SOAD similar to Metallica either? Avenged Sevenfold?

Let's see what our new formula has spawned:

#1 MD.45
#2 Black Tide
#3 Megadeth
#4 Anthrax
#5 Testament

#6 Acid Drinkers
#7 Kat
#8 Turbo
#9 Iron Maiden
#10 Black Sabbath

#11 Motörhead
#12 Ozzy Osbourne
#13 Black Label Society
#14 Dio
#15 Bruce Dickinson

#16 TSA
#17 Lordi
#18 Danzig
#19 Steve Ouimette
#20 Slayer

Hmm...kind of strange here. Slayer actually dropped, while Testament on the other hand rose. Now...MD.45 comes up #1...I dunno, many claim it's punk rock, but their top tags are thrash. But what the fuck are Acid Drinkers, Kat and Turbo? They're thrash metal bands from Poland. All three share the thrash metal and heavy metal tags with Metallica, thus the high ranking, since those are among Metallica's top tags too.

Slayer only shared two tags (thrash metal, heavy metal) while the Polish metal bands shared all 3 (since their location tag was banned). It's quite interesting, to be honest. Similar artists are supposed to get us into new artists. Not sure if they're good or not, but perhaps you might wanna give those artists a spin.
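
To see why banning a location tag can push a band up, here is a tiny Python sketch. Everything in it is a placeholder: "location tag", "third tag" and "fourth tag" stand in for tags not spelled out here, and the tag orders are invented, so treat it as an illustration of the mechanism rather than the real data for these bands.

```python
def top3_matches(tags_a, tags_b):
    """Count tags shared by the first 3 tags of each artist."""
    return len(set(tags_a[:3]) & set(tags_b[:3]))


def drop_banned(tags, banned):
    """Remove banned tags while keeping the remaining order."""
    return [tag for tag in tags if tag not in banned]


banned = {"location tag"}  # placeholder for a banned geographical tag

# Placeholder tag lists, not real data.
target = ["thrash metal", "heavy metal", "third tag", "fourth tag"]
candidate = ["thrash metal", "heavy metal", "location tag", "third tag"]

before = top3_matches(target, candidate)
after = top3_matches(target, drop_banned(candidate, banned))
print(before, after)
# 2 3
```

Once the location tag is gone, the candidate's fourth tag slides into its top 3, and a 2-tag match turns into a 3-tag match.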

What about hip-hop? Let's take a random hip-hop artist....ok, Kanye West (I had to ban the tag in order to get similar artists).

Jay-Z & Kanye West
The Throne
Kid Cudi
Big Sean

Lupe Fiasco
Pusha T

The Weeknd
Kendrick Lamar
Mr Hudson

The Game
Rick Ross
Frank Ocean

I dunno...but I kind of feel most of the similar artists are affiliated in some way with G.O.O.D. Music.

#1 Diddy - Dirty Money
#2 Chris Brown
#3 Nelly
#4 Nicki Minaj
#5 Timbaland

#6 Pharrell
#7 Fonzworth Bentley
#8 R. Kelly & Jay-Z
#9 Mase
#10 Puff Daddy & The Family

#11 Drake
#12 T-Pain
#13 P. Diddy
#14 Diddy
#15 Twista

#16 Bow Wow
#17 Chris Brown & Tyga
#18 Will Smith
#19 Usher
#20 R. Kelly

Is it just me or did this get a little more on the poppy/rnb side? (Well, Kanye did do more poppy stuff on the last 2 records.) LOL at the different versions of the guy known as Diddy on there. Hip-hop, rap, rnb, pop and soul were the top tags for Kanye West after other tags were dumped...including hip hop (duplicate tag for hip-hop), kanye west (for obvious reasons) and...GAY FISH. TBH, most people tend to consider Kanye a good musician rather than an MC.

One more example. I have also banned a tag because it referred to a radio format rather than a genre. One band I tested was Queen:

Freddie Mercury
Brian May
Roger Taylor
Queen + Paul Rodgers
Freddie Mercury & Montserrat Caballé

The Cross
Deep Purple
Queen & David Bowie
Led Zeppelin
Pink Floyd

Electric Light Orchestra
The Who
The Beatles

Black Sabbath
David Bowie

Well...most of the top 10 consists of random Queen side projects. But do they play the exact same style...or even have a good discography to choose from? Queen & David Bowie only has one song in existence! And the next 10 artists...hmm...feels like Queen takes in a bit of those bands, but they aren't even similar to each other. What about this new formula?

#1 Slade
#2 Sweet
#3 Suzi Quatro
#4 Queen + Paul Rodgers
#5 Kiss
#6 The Darkness
#7 Mott the Hoople
#8 Styx
#9 Foreigner
#10 Freddie Mercury

#11 Roger Taylor
#12 David Bowie
#13 Alice Cooper
#14 Bon Jovi
#15 Heart
#16 Jon Bon Jovi
#17 L.A. Guns
#18 Journey
#19 Deep Purple
#20 Led Zeppelin

Well, most of the original British Invasion is gone...some of those arena rock bands though are now in (Styx, Foreigner, and Journey to be specific)...Alice Cooper is in, and he's considered glam rock, which Queen sometimes goes into...not bad. Most of the Queen side projects are gone, and AC/DC and Aerosmith seem to belong somewhere else (got to go outside the top 30 to find them). The top band is Slade...based on their page, they seem to also incorporate glam rock and hard rock together. Same with Sweet. Remember, Queen still has hard rock influences in their sound, which is why not all hard rock bands on the original list are gone. Seems to feel like a better list IMO, but what about you?

Well, I'm prolly going to go through more artists and search for more tags. Perhaps I could show you some more examples from other genres as well. It should be known that no similar artists formula is perfect. There will always be artists in the list that aren't similar at all.

Remember, I'm only able to re-sort the original top 250 similar artists. If there are some that are clearly similar to the given artist, but not in that artist's top 250 similar artists, they won't be caught. IMO, this formula is better applied over a similar artist pool of more than 250...probably a lot more artists.

So take a look and pass me some feedback...whether it be on the formula or tags to include, and perhaps ask me for a few examples.


  • terekest

    This seems to be working pretty well.. Better than the current system for sure, but it seems that in some cases (if new system was to be applied), certain bands need a personal approach to get the most similar artists. Maybe it is something that you could fix, I don't know. Anyway, it is a great idea in general and should really get out in the open. Maybe you should try to explain the concept a little bit shorter, so that the long text wouldn't scare away the potential readers ;)

    Jan 2 2012, 17h04
  • SaJaehwa

    You're doing a great job here. I think it's a nice idea as well, but the tags might be a problem - as you said yourself. I mean, some people really abuse these tags - that should be adjusted somehow. Thumbs up, it's really well thought out so far.

    Jan 19 2012, 16h21