Putting the Sex in SNA: Hook up Networks

Imagine you are a network scientist meeting new people in a bar. At some point, the question always comes up: “So what is your research topic?” Shocked and horrified, you look for an answer that does not scare away your new acquaintance. “Algorithmic graph theory”? Sounds too nerdy. “Social network analysis”? Sounds better, but you are tired of explaining that this is not the same as browsing Facebook all day. Maybe an example from your work! “Protein interaction networks”? Oh god, I have had too many beers to explain that.

Hook up Network of Grey’s Anatomy

For the described situation, it is always good to have an example ready that is relevant in real life.
And what could be more relevant than talking about “hook up networks”? Yes, that’s a thing, and you CAN do serious research with them. Or use them as a pick up line.

The first time I stumbled upon these kinds of networks was when I was looking for an interesting example to show some basic network modeling approaches in a class. What I found was the “who slept with whom” network of Grey’s Anatomy [1,2]. Everybody who is familiar with that show will agree that, besides hospital stuff, there is a lot of hanky-panky going on.

Below you find a visualization of the network, where the node size is proportional to the number of hook ups.

Comparing Grey’s Anatomy with the “real world”

Reddit user /u/kreekkrew had a fascinating project during his time in college. He recorded every hook up within his group of friends and published this matrix with the following explanation:
“This shows who has made out with who in my group of friends over the course of my time in college.
Criteria for data collection:
1. To be on the chart, a person needs to have been to at least two of our parties since my freshman year (2010).
2. For a make-out session to count, it needs to:
  • be with someone else on the chart, at any point in time,
  • last more than 10 seconds, and
  • have some level of seriousness behind it. For example, you can’t make out with someone for 5 seconds just to get on the chart and increase your count for the night. (Yes, that has happened.)
Population is composed of 26 people (61% male/39% female), aged between 18 and 24, all at an engineering school (believe it or not) in the US. Obviously, names have been initialized for the sake of privacy.”

Of course, I turned his matrix into a network.
Now we can compare how Grey’s Anatomy relates to the real world (as real as engineering school gets…). The table shows some simple statistics for both networks.
Most of the statistics should be self-explanatory, except maybe the 4-cycle statistic.

A study on sexual networks [3] found a prohibition against coupling with a former partner’s former partner’s former partner (which would close a 4-cycle) due to status implications. That is, counting 4-cycles allows us to quantify a level of potential drama, since awkward status implications arising from others’ sexual relationships make for compelling entertainment.
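To make the 4-cycle statistic concrete, here is a minimal sketch in Python of how such a count could be done. It assumes an undirected network without self-loops, given as a list of pairs; the function name `count_four_cycles` and the toy edge list are made up for illustration and are not the data or the code behind the table. The idea: two people who share k partners close k·(k−1)/2 four-cycles, and every 4-cycle is found twice this way (once per pair of opposite corners).

```python
from itertools import combinations


def count_four_cycles(edges):
    """Count 4-cycles in an undirected simple graph given as (u, v) pairs."""
    # Build a neighbor set for every node (duplicate edges collapse in the set).
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    total = 0
    for u, v in combinations(neighbors, 2):
        shared = len(neighbors[u] & neighbors[v])
        # Any two shared partners of u and v close one 4-cycle.
        total += shared * (shared - 1) // 2

    # Each 4-cycle was counted once for each of its two opposite-corner pairs.
    return total // 2


# Hypothetical toy network: a square A-B-C-D plus one extra person E.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("D", "E")]
print(count_four_cycles(toy_edges))  # prints 1
```

Checking all pairs of nodes is perfectly fine for networks with a few dozen people; for large networks one would want something smarter.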

What do we learn?

Engineering school is far more dramatic than Grey’s Anatomy (compare the 4-cycles). To make Grey’s Anatomy a bit more realistic, they should maybe implement some more same-sex coupling. I guess “McDreamy” hooking up with another male doctor would create quite some confusion among the female viewership.
And poor dudes at engineering school. Females are already quite rare there (which is itself quite sad!), and the few that go there would rather make out with each other…

Friends and Hypergraphs: The One With All The Networks

Undoubtedly, Friends is one of my favorite TV series. I guess I am not the only one there, since the hosting of all episodes on Netflix created quite a stir in the online community. The story of the show is quite easy to follow: Ross loves Rachel, Ross dates Rachel, Rachel breaks up with Ross, Rachel loves Ross, Ross marries Rachel, Ross divorces Rachel, Ross loves Rachel, Ross and Rachel have a happy end. Oh yeah, between all the Rachel Ross dilemmas, Monica marries Chandler, Phoebe sings Smelly Cat and Joey does all kinds of shenanigans. So the question is: is the “Ross and Rachel” story the most central element of the show?
To answer this question, I am gonna look at a dataset of shared plotlines throughout the whole show. That is, which subset of the six characters appeared together in a plot during an episode. These plots can range from simply hanging out together in Central Perk to some hanky-panky in the bedroom. We can see a shared plotline as some form of interaction and therefore analyse the show from a network perspective. Great! That is my area of expertise! However, what renders the analysis a bit more complicated is the fact that plotlines can consist of more than two characters, creating a link with more than two endpoints. So we are not just dealing with a regular network, but with a hypergraph.

Network Visualizations of all Episodes

I spent a lot of time trying to come up with a visualization that shows hyperedges in a pretty way. But I failed horribly (that’s why I am doing network analysis and not network drawing, I guess). So I decided to split the hyperedges into regular edges. That means, if there was a plotline consisting of Monica, Chandler and Joey, I created the edges (Monica, Chandler), (Monica, Joey) and (Chandler, Joey). I did that with all the plotlines of each episode, counted how often each pair of characters shared a plotline, and aggregated these counts for each season. So in the end, I got 10 different networks. The edge width indicates how often two characters shared a plotline during the respective season.
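The splitting step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `expand_plotlines` and the three plotlines are made up, and the real dataset has its own format.

```python
from collections import Counter
from itertools import combinations


def expand_plotlines(plotlines):
    """Split every plotline (a set of characters) into pairwise edges and
    count how often each pair of characters shares a plotline."""
    weights = Counter()
    for characters in plotlines:
        # Sorting gives every pair one canonical order, e.g. (Chandler, Monica).
        for pair in combinations(sorted(set(characters)), 2):
            weights[pair] += 1
    return weights


# Made-up plotlines for one hypothetical season.
season_plotlines = [
    {"Monica", "Chandler", "Joey"},
    {"Ross", "Rachel"},
    {"Monica", "Chandler"},
]
edge_weights = expand_plotlines(season_plotlines)
print(edge_weights[("Chandler", "Monica")])  # prints 2
```

The resulting counts are what would be drawn as edge widths. Note that this expansion loses information: afterwards one can no longer tell whether three characters shared one plotline or met pairwise in three separate ones.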

Clicking through these figures, I always start thinking about all the funny scenes of the respective seasons. I think it is time to rewatch it for the 10th time!

Who is the most central character?

Visualizations are fun and stuff, but they do not really help us to determine the most central character of the show. Since I deal with network centrality day in, day out, I have a lot of methods up my sleeve to deal with this problem. However, most of them are not really applicable to hypergraphs. The only measure that can be used in a fairly straightforward way is eigenvector centrality. So let’s do some mild math:

A simple network is usually represented by an adjacency matrix $A$, where $A_{ij}=1$ if there is a link between actor $i$ and actor $j$ and $A_{ij}=0$ otherwise. Since in our case edges have no direction, $A_{ij}=A_{ji}$ and therefore $A$ is a symmetric matrix. Thanks to Perron-Frobenius, we know that (for a connected network) there is a real eigenvalue $\lambda$ which is bigger than every other eigenvalue of $A$, and that it comes with an eigenvector $v$ whose entries are all positive. For this eigenvalue, the following equation holds
$$Av=\lambda v$$
The entries of the vector $v$ are then used to rank the actors of the network. But how can we interpret $v$? The short and simple (and slightly wrong) explanation is that actors are considered important if they are connected to other important actors. So it is not just the number of connections, but also the quality of these connections.
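For readers who like to see this in code, here is a minimal sketch of how $v$ can be approximated with power iteration in plain Python. It is an illustration, not the code used for this post: the function name, the toy adjacency matrix and the choice of power iteration are all assumptions, and the network is assumed to be connected.

```python
def eigenvector_centrality(adjacency, max_iterations=1000, tolerance=1e-10):
    """Approximate the leading eigenvector of a symmetric 0/1 adjacency matrix."""
    n = len(adjacency)
    scores = [1.0] * n
    for _ in range(max_iterations):
        # Multiply by (A + I) instead of A: this has the same eigenvectors,
        # but the shift keeps the iteration from oscillating on bipartite networks.
        updated = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        # Rescale so that the most central actor has score 1.
        largest = max(updated)
        updated = [value / largest for value in updated]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tolerance:
            return updated
        scores = updated
    return scores


# Hypothetical toy network: actors 0, 1, 2 form a triangle, actor 3 only knows actor 0.
toy_adjacency = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(toy_adjacency)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking[0], ranking[-1])  # actor 0 ranks first, actor 3 last
```

Each round replaces an actor's score by the sum of the scores of its neighbors (plus its own), which is exactly the "important if connected to important actors" idea applied over and over until nothing changes anymore.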
When we deal with hypergraphs, we are faced with the problem that we can no longer represent our network with an adjacency matrix, since links can have more than two endpoints. Instead, I will use the so-called incidence matrix $E$. The incidence matrix has as many rows as the network has links and as many columns as there are actors. So $E_{ij}=1$ if actor $j$ takes part in edge $i$, and $E_{ij}=0$ otherwise.

In order to use the eigenvector centrality concept on $E$, it first has to be projected to a square matrix in the actor space. This is done by multiplying the transpose $E^T$ with $E$, i.e. we have the equation
$$E^TEv=\lambda v.$$
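As a rough sketch of this projection in plain Python (all function names and the three toy plotlines are made up for illustration, and power iteration is just one convenient way to get the leading eigenvector, not necessarily what was used for the results in this post):

```python
def incidence_matrix(plotlines, characters):
    """One row per plotline, one column per character (1 = takes part)."""
    return [[1 if name in plot else 0 for name in characters] for plot in plotlines]


def project_to_characters(E):
    """Compute E^T E, a square character-by-character matrix."""
    n = len(E[0])
    return [[sum(row[i] * row[j] for row in E) for j in range(n)] for i in range(n)]


def leading_eigenvector(M, max_iterations=1000, tolerance=1e-12):
    """Power iteration; E^T E has no negative eigenvalues, so no shift is needed."""
    v = [1.0] * len(M)
    for _ in range(max_iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in M]
        largest = max(w)
        w = [x / largest for x in w]
        if max(abs(a - b) for a, b in zip(w, v)) < tolerance:
            return w
        v = w
    return v


# Hypothetical toy data, not the real Friends dataset.
characters = ["Monica", "Chandler", "Joey", "Phoebe"]
plotlines = [{"Monica", "Chandler", "Joey"}, {"Monica", "Chandler"}, {"Joey", "Phoebe"}]

E = incidence_matrix(plotlines, characters)
projected = project_to_characters(E)
scores = leading_eigenvector(projected)
for name, score in sorted(zip(characters, scores), key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

In the projected matrix, the off-diagonal entry for two characters counts the plotlines they share, while the diagonal entry counts all plotlines a character takes part in. So unlike with a plain adjacency matrix, a character's own activity also enters the score.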

The interpretation of $v$ is the same as before, so it should reflect the importance of the characters. But before looking at the show as a whole, I will show the importance rankings of the characters in each season. Or in other words: Who were the most central characters in seasons 1 to 10? Let’s take a look at the seasonwise entries of the vector $v$ and its induced ranking.


I think the values and rankings reflect the storyline of the seasons quite well. For example, season 1 mostly deals with Rachel becoming more independent and the whole Ross and Rachel thing. Seasons 4 to 6 mainly deal with the relationship of Monica and Chandler; therefore, they should be the most central characters during this period. Notable is also the position of Phoebe. During the whole show the story never really focuses on her, such that her position within each season ranking is always quite low.

Now let’s consider all interactions of all episodes at once. That is, we want to know who is the most central character in the show. And it is…

…CHANDLER! That was kind of surprising to me! But even more surprising is the low overall rank of Ross. Shouldn’t he be at least as central as Rachel, since the whole show is about the relationship of Ross and Rachel?

Of course one could question my relatively simple approach to finding the central characters, and of course one could question the dataset. But then again, this is a blog about mildly scientific topics, so…yeah… take the results as they are but do not overinterpret them. Also, I am going to show in an upcoming post why the results are as they are.

Acknowledgement

A big thank you goes to Alex Albright who not only provided the dataset but also some valuable discussions which actually motivated me to write this blog entry. Please check out her blog too!