The Myth of Club 27

The term "Club 27" refers to the supposed phenomenon that famous musicians die at a higher rate at the age of 27. Jimi Hendrix, Janis Joplin, Kurt Cobain and Amy Winehouse, to name just a few, are members of this questionable club. The media goes wild whenever another famous person joins it. But is there any (statistical) truth behind this? Do musicians really die at a higher rate at the age of 27?

Methodology

The data I used comes from DBpedia, a project “aiming to extract structured content from the information created as part of the Wikipedia project.” Basically, you can get data from many Wikipedia articles without crawling the pages individually. The catch is that you have to know a bit of SPARQL, DBpedia's SQL-like query language, which I don’t.

After a lot of struggling, I managed to extract musical artists with a death date via this interface, using the following code. (Don’t worry if you do not understand the code. I barely knew what I was doing there.)

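# dbpedia2: is predefined on the public DBpedia endpoint; declared here so the query is self-contained
PREFIX dbpedia2: <http://dbpedia.org/property/>

SELECT DISTINCT ?person ?genre ?died ?born
FROM <http://dbpedia.org>
WHERE
{
  {
    # persons whose occupation mentions "Singer"
    ?person a <http://dbpedia.org/ontology/Person> ;
            dbpedia2:occupation ?genre ;
            dbpedia2:deathDate ?died ;
            dbpedia2:birthDate ?born .
    FILTER regex(?genre, "Singer")
  } UNION {
    # persons whose occupation mentions "Musician"
    ?person a <http://dbpedia.org/ontology/Person> ;
            dbpedia2:occupation ?genre ;
            dbpedia2:deathDate ?died ;
            dbpedia2:birthDate ?born .
    FILTER regex(?genre, "Musician")
  }
}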

DBpedia only returns up to 10000 results per query, so I had to run the query twice: first with LIMIT 10000 appended, and then with OFFSET 10000 LIMIT 10000. Conveniently, the results can be stored as JSON files.
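
The paging described above could be automated roughly as follows. This is only a sketch, not the code I actually ran: the endpoint URL, the `format` parameter, and the helper names (`paged_query`, `fetch_page`, `fetch_all`) are assumptions made for illustration. Note also that paging without an ORDER BY clause is not guaranteed to return rows in a stable order.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed URL of the public endpoint
PAGE_SIZE = 10000  # result cap per query mentioned in the text


def paged_query(base_query: str, page: int, page_size: int = PAGE_SIZE) -> str:
    """Append OFFSET/LIMIT so that page 0 is the first chunk, page 1 the second, ..."""
    return f"{base_query.rstrip()}\nOFFSET {page * page_size} LIMIT {page_size}"


def fetch_page(base_query: str, page: int) -> list:
    """Run one page against the endpoint and return its rows (needs network access)."""
    params = urllib.parse.urlencode({
        "query": paged_query(base_query, page),
        "format": "application/sparql-results+json",  # assumed to be accepted
    })
    with urllib.request.urlopen(f"{ENDPOINT}?{params}") as response:
        return json.load(response)["results"]["bindings"]


def fetch_all(base_query: str) -> list:
    """Keep requesting pages until one comes back shorter than the cap."""
    rows, page = [], 0
    while True:
        chunk = fetch_page(base_query, page)
        rows.extend(chunk)
        if len(chunk) < PAGE_SIZE:
            return rows
        page += 1
```

Calling `fetch_all(query)` with the query text as a string would then collect every page into one list, which can be written to disk with `json.dump`.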

I noticed that Amy Winehouse, among others, was not in the dataset. I dug around the DBpedia documentation and found that some musicians are only classified as artist or even just person. I therefore tried to pick out singers and musicians from the general Person class with the following code.

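# dbpedia2: is predefined on the public DBpedia endpoint; declared here so the query is self-contained
PREFIX dbpedia2: <http://dbpedia.org/property/>

SELECT DISTINCT ?person ?genre ?died ?born
FROM <http://dbpedia.org>
WHERE
{
  {
    # persons whose occupation mentions "Singer"
    ?person a <http://dbpedia.org/ontology/Person> ;
            dbpedia2:occupation ?genre ;
            dbpedia2:deathDate ?died ;
            dbpedia2:birthDate ?born .
    FILTER regex(?genre, "Singer")
  } UNION {
    # persons whose occupation mentions "Musician"
    ?person a <http://dbpedia.org/ontology/Person> ;
            dbpedia2:occupation ?genre ;
            dbpedia2:deathDate ?died ;
            dbpedia2:birthDate ?born .
    FILTER regex(?genre, "Musician")
  }
}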

I do not claim that the data I got this way is complete, but at least all the famous dead musicians I know of are in there.

So what does the death distribution by age look like?
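
For reference, the age at death behind such a distribution can be derived from the ?born and ?died values along these lines. This is a minimal sketch that assumes both dates are clean ISO strings (YYYY-MM-DD), which is not guaranteed for every DBpedia entry; the function name `age_at_death` is made up for illustration. The four sample entries are the club members named in the introduction, with their publicly documented dates.

```python
from collections import Counter
from datetime import date


def age_at_death(born: str, died: str) -> int:
    """Whole years lived between two ISO dates (YYYY-MM-DD)."""
    b, d = date.fromisoformat(born), date.fromisoformat(died)
    # subtract one year if the birthday had not yet been reached in the year of death
    return d.year - b.year - ((d.month, d.day) < (b.month, b.day))


# (born, died) for the four club members mentioned in the introduction
sample = {
    "Jimi Hendrix": ("1942-11-27", "1970-09-18"),
    "Janis Joplin": ("1943-01-19", "1970-10-04"),
    "Kurt Cobain": ("1967-02-20", "1994-04-05"),
    "Amy Winehouse": ("1983-09-14", "2011-07-23"),
}

ages = {name: age_at_death(b, d) for name, (b, d) in sample.items()}

# number of deaths per age; with the full dataset this is the distribution by age
histogram = Counter(ages.values())
print(histogram)
```

Run over the whole dataset instead of these four entries, the counter holds the number of deaths per age, which is what the chart shows.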

See the huge spike at 27? Me neither! Well, technically, my analysis is a tiny bit flawed. Most of the musicians in the dataset cannot really be considered famous, which is kind of a requirement for club membership. I will leave this part for a later post and move on to something more interesting.

Breaking it down into genres

DBpedia also returns the genre the musician was active in, so we can break the above chart down and see whether certain genres have a higher death toll at early ages.

The result is quite stunning. It seems that metalheads die much earlier than others, especially compared with classical and jazz musicians. The following table shows the mean age at death and its standard deviation for each genre.

Genre       Mean age at death (years)   SD (years)
Blues       63                          16
Classical   71                          16
Jazz        67                          17
Metal       40                          11
Pop         59                          20
Rock        49                          14
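
A per-genre summary like the one in the table can be computed along these lines. Again, this is just a sketch: the helper name `summarize` is hypothetical, the toy values are placeholders and not taken from the dataset, and I am assuming the sample standard deviation here.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(records):
    """records: iterable of (genre, age_at_death) pairs -> {genre: (mean, sd)}"""
    by_genre = defaultdict(list)
    for genre, age in records:
        by_genre[genre].append(age)
    # stdev needs at least two values, so genres with a single entry are skipped
    return {g: (mean(a), stdev(a)) for g, a in by_genre.items() if len(a) > 1}


# Placeholder values for illustration only, NOT taken from the dataset
toy = [("A", 60), ("A", 70), ("A", 80), ("B", 30), ("B", 50), ("C", 45)]

for genre, (m, sd) in summarize(toy).items():
    print(f"{genre}: mean {m:.0f}, SD {sd:.0f}")
```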

Given their lifestyle, I guess it is not really that surprising that metalheads die earlier than, for instance, classical musicians.

Science vs. Music

In addition to the musicians, I also pulled a large set of people tagged as scientists from DBpedia. I was just curious how their lifespan compares to that of the musicians.

As the figure shows, scientists tend to live much longer than musicians. As a fellow scientist, I find this quite comforting.

Posted in: Data Analysis

Written by Dmathlete
