Who Was The Funniest Character on Friends? Analyzing Comedy in All Friends Episodes

Hate it or love it, Friends is one of the most popular sitcoms of all time.

The cast’s characterizations were diverse enough for you to see a little bit of yourself and your friends in each of the characters as they stumbled through life and made bad decisions. This makes it no surprise that, at its peak, the TV show had over 50 million viewers.

One of the biggest debates fans have to this day is: Who was the show’s funniest character? Was it Chandler with his self-deprecating sarcasm? Ross with his nerdy mannerisms? Phoebe with her hippy-boho vibe? Joey? Rachel? Monica?

Depending on who you ask, the answer might differ. For one, I thought Ross was hilarious, while others find him (understandably) annoying.

As a data lover and a huge fan of Friends, I took an intermittent two-year-long stab (seriously, I’ve been working on this for two years) at using data to answer this question.

Who was the funniest character? What made them funny? When were they at their funniest?

By analyzing laughter in the audio files of over 200 episodes of Friends and combining that with each episode’s scripts, I’m here to answer the question we’ve all been asking.

The How

This has been, by far, the most challenging project I have worked on. I started working on this in early 2019. The work I needed to do was beyond my technical capabilities at the time, so it took a lot of trial and error, which meant dropping and picking up the project multiple times.

Without going too much into the details, I built a machine learning model that could detect laughter in an audio file. Luckily, most sitcoms from the 90s had laughter either from a live audience or from a pre-recording. The model had about 95% accuracy and 98% precision.
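To make those two numbers concrete, here is a small sketch of how accuracy and precision are computed for a per-second laughter detector. The counts are invented for illustration so that they land on 95% and 98%; they are not the real confusion matrix from this project.

```python
# Hypothetical confusion-matrix counts for a per-second laughter detector.
# These numbers are invented for illustration only.
true_positive = 98    # seconds flagged as laughter that really were laughter
false_positive = 2    # seconds flagged as laughter that were not
true_negative = 852   # non-laughter seconds correctly left unflagged
false_negative = 48   # laughter seconds the model missed

total = true_positive + false_positive + true_negative + false_negative

# Accuracy: share of all seconds labeled correctly.
accuracy = (true_positive + true_negative) / total

# Precision: share of flagged seconds that really were laughter.
precision = true_positive / (true_positive + false_positive)

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}")
```

High precision matters here because every falsely flagged second would credit some character with a laugh they never got.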

In a nutshell, I used Python’s librosa library to transform the audio file of each episode into a dataset in which each row is a numerical representation of the sound waves for one second of sound. The model then detects which seconds were laughter based on the sound waves.
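The "one row per second" idea can be sketched without any audio libraries. This is a simplified stand-in, not the librosa pipeline used in the project: the function name `per_second_rows`, the two features (RMS energy and zero-crossing rate) and the synthetic signal are all assumptions made for illustration.

```python
import math

def per_second_rows(samples, sample_rate):
    """Split a mono waveform into one-second chunks and summarize each one."""
    rows = []
    n_seconds = len(samples) // sample_rate  # drop any trailing partial second
    for second in range(n_seconds):
        chunk = samples[second * sample_rate:(second + 1) * sample_rate]
        # Root-mean-square energy: a rough measure of loudness.
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        # Zero-crossing rate: how often the signal changes sign.
        crossings = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        )
        rows.append({"second": second, "rms": rms, "zcr": crossings / len(chunk)})
    return rows

# Synthetic two-second signal: one quiet second, then one loud second.
sample_rate = 1000
quiet = [0.1 * math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
loud = [0.9 * math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]
rows = per_second_rows(quiet + loud, sample_rate)
for row in rows:
    print(row)
```

Each dictionary plays the role of one dataset row. A real feature set would be much richer, but the shape of the data (one numeric row per second of audio) is the same.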

The next step was matching the new dataset with the model-detected laughter to the script and subtitle data to identify who caused the laughter and what was said.
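Here is a rough sketch of what that matching step could look like. The rule used (credit a laugh to whichever line ended most recently, within a couple of seconds) is my simplification for illustration, and the subtitle rows, the `max_gap` parameter and the function name are hypothetical.

```python
# Hypothetical subtitle rows: (start_second, end_second, speaker, line).
subtitles = [
    (0, 3, "Chandler", "first line"),
    (5, 7, "Joey", "second line"),
    (9, 12, "Ross", "third line"),
]

# Hypothetical seconds the model flagged as laughter.
laughter_seconds = [3, 4, 8, 20]

def speaker_for_laugh(second, subtitles, max_gap=2):
    """Return the subtitle row that ended most recently before the laugh.

    A laugh more than `max_gap` seconds after the last line is left unmatched.
    """
    candidates = [row for row in subtitles if row[1] <= second]
    if not candidates:
        return None
    latest = max(candidates, key=lambda row: row[1])
    return latest if second - latest[1] <= max_gap else None

matches = {sec: speaker_for_laugh(sec, subtitles) for sec in laughter_seconds}
for sec, row in matches.items():
    print(sec, row[2] if row else "unmatched")
```

The `max_gap` cutoff is there so that laughter caused by a visual gag long after anyone spoke is not pinned on the last speaker.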

The majority of the code was written in R, but I did most of the audio analysis in Python. I suck at Python, so massive shout out to Allen, who saved me by speeding up my Python code. Go check out his data science blog: http://allenkunle.me.

I usually do not share a lot of the technical details of my work. Still, I was particularly proud of this one (and was tempted to overload you guys with nerdy information, but my editors said no). 

If you would like to know more about the technical details of the project, you can check it out here.

The datasets and a sample of the code can be found on my GitHub here

Let’s get straight into it!

Who Was The Funniest Friends Character?

By Total Number Of Laughs, Chandler Is The Funniest Character Of The Show, Followed by The Other Two Male Leads, Joey and Ross

We have the three male leads first, followed by the female leads: Phoebe, Rachel, and Monica in that order.

However, this might be misleading. If a character has more lines, there is a higher chance they will cause more laughs, but that doesn’t necessarily mean they’re funnier than someone with fewer lines, right?

If We Look At What Percentage Of Their Lines Were Funny, Everyone Retains The Same Rank Except For Ross Who Drops To The Bottom

Chandler retains the top spot, with 67% of his lines being funny. Unfortunately, it seems Ross had more funny lines just because he had more lines in general. What’s even more shameful is that everyone else retained their rank. 
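The difference between the two rankings is easy to see in a toy example. The rows below are made up; they only show how a character can tie on total laughs and still rank lower once you divide by the number of lines.

```python
from collections import Counter

# Hypothetical (speaker, caused_laugh) rows, one per scripted line.
lines = [
    ("Chandler", True), ("Chandler", True), ("Chandler", False),
    ("Ross", True), ("Ross", False), ("Ross", False), ("Ross", False),
    ("Ross", True),
]

total_lines = Counter(speaker for speaker, _ in lines)
funny_lines = Counter(speaker for speaker, funny in lines if funny)

# Share of each character's lines that drew a laugh.
funny_share = {s: funny_lines[s] / total_lines[s] for s in total_lines}

for speaker in total_lines:
    print(speaker, funny_lines[speaker], f"{funny_share[speaker]:.0%}")
```

In this toy data both characters get two laughs, but the character with more lines ends up with the lower share.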

I liked Ross a lot, so this kinda bummed me out.

Here’s another way to look at it:

Outside of The Main Cast, Janice Was By Far The Funniest, With The Rest Also Being Love Interests Of The Main Cast

Janice, Chandler’s longtime on-and-off girlfriend, was by far the funniest outside the main cast and the funniest overall. This makes sense given her hilarious voice acting that made everything she said ten times funnier. 

The only character here who was not exactly a love interest was Gunther, although most of his funny moments were because he had a crush on Rachel.

Now, Why Was Ross So “Unfunny”?

Compared To The Others, Ross’ Character Revolved Around Love Troubles. Four Out Of The Top Seven Words He Said Were Either ‘Love’ Or A Girlfriend’s Name

I use the top 7 here because some words are tied in their number of occurrences. 
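For the curious, this is one way a "top N, but keep the ties" rule can be written. The helper name and the word list are hypothetical and not taken from the real scripts.

```python
from collections import Counter

def top_words_with_ties(words, n):
    """Return the n most common words, plus any words tied with the nth."""
    counts = Counter(words).most_common()
    if len(counts) <= n:
        return counts
    cutoff = counts[n - 1][1]  # count of the nth-ranked word
    return [(word, count) for word, count in counts if count >= cutoff]

# Hypothetical word list, not taken from the real scripts.
words = ["love"] * 5 + ["rachel"] * 4 + ["dinosaur"] * 3 + ["emily"] * 3 + ["hi"]
top = top_words_with_ties(words, 3)
print(top)
```

Asking for the top 3 here returns four words, because the third and fourth words have the same count.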

You can see that most of his discussions revolved around talking about his love life, and this might have affected how funny he was because his love life was the most chaotic and painful thing to watch. 

Some of his biggest love arcs were:

  • His first wife, Carol, turned out to be a lesbian
  • His second marriage with Emily ended because he was still in love with Rachel 
  • His continuous, sometimes painful on-and-off relationship with Rachel

If you compare Ross’s top words to the other characters’, the others talked about their love lives a lot less.

How Funny Were Their One-on-One Interactions?

The Funniest Interactions Were Between Chandler & Joey, And 4 Out Of The Top 5 Funniest Interactions Involved Chandler

On The Other Hand, Rachel & Joey Had The Least Funny Interactions, With Three Of The Bottom Five Interactions Involving Monica

This probably also explains why their short-lived relationship seemed quite awkward. On the other hand, Chandler’s wife, Monica, was at the very bottom in one-on-one interactions. It makes me wonder if the showrunners put them together deliberately.

Season 10 Was The Least Funny Season For All Characters Except Ross and Rachel, Whose Least Funny Season Was Season 3, When They Broke Up

We can all agree that Season 10 focused less on comedy and more on drama to wrap up the show. The data also shows this consistently: Season 10 was the least funny season for nearly every character.

The only exceptions are Ross and Rachel, and this is probably because the show focused on them finally getting their happy ending that had been teased since season 1.

For both characters, season 3 was their least funny season, and this is where they had a lot of conflict in the relationship, which led to their breakup.

One Thing to Keep in Mind

This entire analysis is a good reminder that data-driven does not always mean objective. The premise of this analysis is that the pre-recorded laughter is the ground truth for funniness. That simply isn’t true.

While most people might agree that Chandler was the funniest, not everyone will, and that’s okay because sometimes, the role of data is not to be objective but to be a proxy for the subjective, such as how funny a person is. Please feel free to disagree with the rankings.

What’s Next?

I know, I know. It’s been almost three years since my last post. To summarize, moving countries did a number on me. I’m only just recovering two years later.

There’s a 70% chance there will be a new post between March and April, plus I have around 5-6 posts planned for the year, but please don’t hold me to it in case life happens.

I really enjoyed working on this like old times! Thank you for sticking around, and I hope to see you all soon!

Billboard Hot 100 Analytics: Using Data to Understand The Shift in Popular Music in The Last 60 Years

What’s the most common thing you hear from “older” people about popular modern music? The general theme is: “Your music is too loud and lacks content”. They talk about the “old” days with the meaningful songs, the soulful artistes, the deep bass guitars that can move you to tears. When they say that, they are comparing this:

Downtown by Petula Clark, 1965

To this:

Stir Fry by Migos, 2018

 

There’s a clear difference, obviously. However, this would be taking one data point to make a general conclusion (which humans are very good at). I, being a millennial and a Data Scientist, found this an interesting topic to poke at. Has what makes music “great” really changed that much? Has the sound, the lyrics and the “message” changed? And if they have changed, how exactly have they changed?

Using Billboard’s Hot 100 charts from 1950 – 2015 and Spotify’s API, we want to take a closer look at how much popular music has changed in the past six decades and find out what really distinguishes the music of today from the rest.

My Approach

For this post, I define “great music” as music that made it into Billboard’s Hot 100. I got the data from a generous GitHub user, Kevin Schaich. The data contains a lot of interesting features like Sentiment, Gunning fog index (which estimates the number of years of formal education needed to understand a text at first reading), Number of words, Number of repetitive words/phrases, etc.
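As a side note, the Gunning fog index is simple enough to sketch by hand. The usual formula is 0.4 × (average words per sentence + percentage of complex words), where a complex word has three or more syllables. The version below is a rough approximation: the vowel-group syllable counter and the sample sentences are my own assumptions, and it is not necessarily how the index in the dataset was computed.

```python
import re

def count_syllables(word):
    """Very rough syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text):
    """Gunning fog index: 0.4 * (words per sentence + percent complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # "Complex" here means three or more (estimated) syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))

# Made-up example texts.
simple = "I love you. You love me."
wordy = "Opulence is a magnificent aspiration for ambitious entertainers."
print(round(gunning_fog(simple), 1), round(gunning_fog(wordy), 1))
```

Short sentences with short words score low, while a single long sentence full of multi-syllable words scores much higher.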

In addition, Spotify has an interesting API endpoint called get_audio_features. The endpoint allows you to get song features like loudness, Instrumentalness (how much instruments are used), energy, liveness (the presence of a live audience), Speechiness, song duration etc. This brings the total song features to about 30 for Billboard’s Hot 100 between 1950 and 2015.

All these features are explained here and here and I will also explain some as we progress in the post.

Initially, I set out to use Python for this project and I did. Kinda. I had my first iteration of data collection all done with Python’s pandas and a python package called spotipy.

Along the line, however, I reviewed my methodology and found a more interesting dataset. For this, I went back to R specifically because of the tidyr::gather() function (it’s so annoying pivoting data in pandas jeez).

Here’s the code in R and Python which are different in most ways except a function called get_audio_features. My final dataset can be found here.

The amount of time I spent on data gathering is in sharp contrast with my other projects because, unlike my other projects, someone took the time to put a ready-to-use dataset together. This is a major reason why I share all the data I gather so hopefully, someone out there won’t spend 6 weeks on trying to gather data.

Let’s begin!

1.   In the past sixty years, we have had only two major changes in music

By using a technique called clustering, we can find similarities/clusters of artistes and their music using their song features.

Using this approach, we have two clusters of artistes – The String Lovers and The Poetics. The reason we chose these weird names lies in the two song features that define these clusters best: Instrumentalness and Speechiness.

Instrumentalness predicts whether a track contains no vocals on a scale of 0 to 1. “Ooh” and “aah” sounds are treated as instrumentals as well. The closer the value is to 1, the more likely there is no vocal content (e.g. a soundtrack) and the closer it is to zero, the more vocal it is (e.g. rap or spoken word).

Speechiness detects the presence of spoken words in a track.

  • The String Lovers score high on Instrumentalness but low on Speechiness. This means that artistes in this cluster tend to favor instruments as opposed to speech.
  • The Poetics are the direct opposite. They score pretty high in Speechiness but very low on Instrumentalness.
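To give a feel for how clustering separates the two groups, here is a tiny k-means sketch on just those two features. k-means is my assumption for illustration (the specific clustering algorithm is not named above), and the artist names and values are hypothetical.

```python
def kmeans_two_clusters(points, iterations=10):
    """Tiny k-means for k=2 on (instrumentalness, speechiness) pairs."""
    # Deterministic start: the points with the lowest and highest first feature.
    centers = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min((0, 1), key=lambda k: (p[0] - centers[k][0]) ** 2
                                      + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centers[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centers

# Hypothetical artist averages: (instrumentalness, speechiness).
artists = {
    "artist_a": (0.60, 0.03),
    "artist_b": (0.55, 0.05),
    "artist_c": (0.02, 0.30),
    "artist_d": (0.01, 0.25),
}
labels, centers = kmeans_two_clusters(list(artists.values()))
for name, label in zip(artists, labels):
    print(name, "cluster", label)
```

The two high-Instrumentalness artists end up in one cluster and the two high-Speechiness artists in the other, which is the String Lovers versus Poetics split in miniature.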

Figure 1

The other interesting thing about these clusters is when they appear on the Billboard Hot 100.

  • Most String Lovers appeared on Billboard before the 1990s.
  • Most Poetics appeared on Billboard after the 1990s.

Figure 2

  • The 90s itself seemed to be a pivotal time in music as we see with the ~50-50 split between String Lovers and Poetics. This meant that artistes were split between going with this new type of music or sticking to the existing sound.

2.   The use of instruments dropped mostly because rock bands became less popular

Between the late ’60s and the early 2000s, bands were so popular that there were as many bands as solo artistes.

Before the 2000s, the more bands there were in a year, the higher the average Instrumentalness in that year.
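A relationship like this is usually summarized with a correlation coefficient. The sketch below computes a Pearson correlation by hand; the yearly numbers are invented for illustration and are not the values behind the figures.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical yearly values, invented for illustration.
bands_per_year = [10, 20, 30, 40, 50]
avg_instrumentalness = [0.05, 0.09, 0.12, 0.18, 0.21]

r = pearson(bands_per_year, avg_instrumentalness)
print(f"correlation = {r:.2f}")
```

A value close to 1 means the two series rise together, while a value near 0 would match the "little or no effect" pattern.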

Figure 3

However, after the 90s, the number of bands had little or no effect on the use of instruments.

Figure 4

Except for the two outliers, the number of bands had virtually no effect on the use of instruments. This is interesting because, like I mentioned earlier, bands were still popular in the early 2000s.

So, what happened?

I’m sure you guessed it. The TYPE of bands changed.

Figure 5

Before the 90s, about 60% of bands were rock bands – the types typically with one lead singer and a bunch of instrumentalists.

However, from the 2000s to present day, the percentage of rock bands dropped significantly making way for a new brand of bands which were generally made up of ALL singers: Pop bands. Think Destiny’s Child, Pussycat Dolls, Fifth Harmony, One Direction – you name it!

3.   We might also owe the emergence of Poetics to the rise of Hip-Hop

Apart from the increase in Speechiness and use of words, Poetics use twice as many complex words (e.g. Jay-Z saying opulence instead of wealth) as String Lovers and use words with more syllables. One genre immediately pops into everyone’s mind when we think of word-bending artistes: Hip-Hop.

Figure 6

Seeing as Hip-Hop tops all other genres in word-related features, it comes as no surprise that Hip-Hop gained mainstream popularity in the 90s – corresponding to the rise of The Poetics.

Figure 6b

4.   While the style of music has changed a lot over time, popular songs for the past sixty years have been mostly about loving women

To arrive at this, I used an algorithm called topic modeling. As the name implies, the algorithm searches for topics in a given text.

In our case, the text is the lyrics from Billboard songs.

Let’s see how these topics change over the decades:

Figure 7

This is absolutely amazing!

Like the features of songs, song lyrics also fall clearly into two buckets with Topic 1 capturing ’50s to ’80s, Topic 2 capturing the decades after the ’90s and the ’90s as a transition period!

This means that the sound and “message” of songs changed at pretty much the same rate.

So, what are these topics?

Figure 8

The topics are almost the same thing! Top songs have disproportionately been, for the past sixty years, “Yeah, I love my baby”.

There’s also something interesting going on here. A major difference between both topics is that before the 90s, songs might have had a more “direct” approach: you can see that a major word is “gonna”, e.g. “I’m gonna love you”. After the 90s, it seemed a bit more indirect, like asking for permission, hence replacing “gonna” with “wanna”. “Wanna” could also depict a more futuristic, imaginative approach to loving women.

5.   The more “quiet” genres ceased to exist in the Poetic Era

This sort of confirms that we tend to prefer louder music now than before.

Figure 9

The five most “quiet” genres are – Jazz, Swing, Folk, Blues and Disco.

These genres also ceased to exist as popular music in the Poetic Era except Jazz which seemed to survive by one artiste (Norah Jones).

Figure 10

What do these all mean?

In summary:

  • The 90s was an extremely important time in music.
  • The decline of rock bands and the rise of Hip-Hop played a major role in steering music to where it is today.
  • Love is a popular theme across songs for the past six decades but the approach to love might differ across the different eras of music.
  • Yes, modern artistes may be louder but it’s BECAUSE we have content :).
  • Bonus Point: Michael Jackson, despite being most popular in the 80s, is a Poetic! He was ahead of his time!

Fun Stuff and Things to Keep in Mind

  • I took a different (and more fun) approach to showcasing the data for this project. I built a dashboard using HTML, CSS, js and chart.js! The app is not (yet) optimized for mobile so, it’s best to use it on a laptop.

Here’s the link: http://bit.ly/music-dashboard

    • The dashboard has two tabs. The first one, “Artist Dashboard”, shows you the average song features for individual artistes.

Figure 11

    • The second tab, “Comparison Dashboard”, allows you to compare song features for up to three artistes and looks like the screenshot below.

Figure 12

    • You can share the results on Twitter or Facebook using the icons at the top right.
    • Just in case you forget what the features mean, hover over the title and you’ll get a little tool-tip explaining it 🙂
  • The Poetic era (as I like to call it) is an ongoing era so some of these insights may change if we had 2016 to 2018 data (especially with the rise of trap music). However, I don’t expect the effects to be much.
  • It would be interesting to measure how “politically-aware” a song is. I will probably post the outcome of that on Twitter.
  • As usual, I am constrained by data collection methods of the generous GitHub user, Spotify’s algorithm and how Billboard arrives at the Hot 100.

Hope you had as much fun reading this as I had creating this 🙂

A Data Driven Guide to Becoming a Consistent Billionaire

Did You Really Think All Billionaires Were the Same?

Recently, I became a bit obsessed with the one percent of the one percent – Billionaires. I was intrigued when I stumbled on articles telling us who and what billionaires really are. The articles said stuff like: Most entrepreneurs do not have a degree and the average billionaire was in their 30s before starting their business. I felt like this was a bit of a generalization and I’ll explain. Let’s take a look at Bill Gates and Hajime Satomi, the CEO of Sega. Both are billionaires but are they really the same? In the past decade, Bill Gates has been a billionaire every single year while Hajime has dropped off the Forbes’ list three times. Is it fair to put these two individuals in the same box, post nice articles and give nice stats when no one wants to be a Hajime? I think not – especially when, in this decade alone, inconsistent billionaires like Hajime make up over 50% of the total billionaire population. Addressing the differences between billionaires is what this post is about. We are going to highlight interesting facts about the consistent billionaires and ultimately, find out what separates the consistent billionaires from the rest.

Just what do I mean by consistent billionaires? Well, that’s what we’re here for. 🙂

For the Nerds Like Me, Here’s How I did It

  • Data Sources: Most of the data was scraped from 3000 Forbes profiles. Two extra variables were collected from a research paper: The Billionaire Characteristics Database. Billionaires covered are those who are or have been billionaires between 2007 and June, 2017.
  • Data Gathering: Using names of billionaires I created their Forbes profile URLs and collected the data I needed using RSelenium and rvest. I’ll be frank. It was not sexy at all. I did a lot of Excel VLOOKUPS, manual inspections and string manipulation to get a workable data set.
  • Data Cleaning: I created columns from strings using stringr.

The code can be found here.

Just How Many Types of Billionaires Are There?

Here’s what I came up with:

  • The Consistent: These, as the name implies, are individuals who have consistently been billionaires year in and year out. It also includes billionaires that have been away from the list for at most a year (e.g. Mark Zuckerberg in 2008). They should have been billionaires before 2015.
  • The Ghosts: These are billionaires who left the list and have not returned in the past four years. They also should have made their debut before 2015.
  • The Hustlers: This category includes every other billionaire who made their debut before 2015, that is:
    • Those that left more than once and made a comeback each time.
    • Those who, although made it back to the list, spent more than a year away.
    • Those who are yet to come back but have not spent up to 4 years off the list.
  • The Newbies: These are billionaires that made their debut between 2015 and 2017. They are in a group of their own because I believe it would be unfair to put them anywhere else as there isn’t enough data to classify them in any other category. Nonetheless, I think it would be interesting to see what they’re up to.
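The rules above can be written down as a small function. This is a sketch of my reading of the definitions, not the actual script: it assumes each billionaire is represented by the set of years they appeared on the list up to 2017, that "the past four years" means 2014 to 2017, and that the Ghost rule takes priority over the others. The function name and example inputs are hypothetical.

```python
LAST_YEAR = 2017  # last year covered by the data

def classify(years_on_list):
    """Classify a billionaire from the set of years they appeared on the list."""
    years = set(years_on_list)
    debut = min(years)
    if debut >= 2015:
        return "Newbie"

    # Collect the length of every completed absence after the debut year.
    gaps, current = [], 0
    for year in range(debut, LAST_YEAR + 1):
        if year in years:
            if current:
                gaps.append(current)
            current = 0
        else:
            current += 1
    trailing_gap = current  # years off the list up to LAST_YEAR, 0 if still on

    if trailing_gap >= 4:
        return "Ghost"  # assumption: left and absent for the last four years
    if trailing_gap == 0 and (not gaps or gaps == [1]):
        return "Consistent"  # never left, or away once for a single year
    return "Hustler"

# Hypothetical examples.
print(classify(range(2007, 2018)))                     # on the list every year
print(classify([2007] + list(range(2009, 2018))))      # missed one year
print(classify([2007, 2008, 2009]))                    # gone since 2010
print(classify([2007, 2008] + list(range(2011, 2018))))  # away for two years
print(classify([2016, 2017]))                          # recent debut
```

Under these assumptions, the five examples come out as Consistent, Consistent, Ghost, Hustler and Newbie respectively.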

So, let’s get to it!

Did You Know That?

The Consistent billionaires are well-educated.

Close to 55% of the Consistent billionaires have at least one degree.

Billionaire education

In fact, the Consistent billionaires have the most people with a Bachelor’s, PhD, Masters and pretty much every other degree.

The average Consistent billionaire started their businesses at an age seven years older than the average Ghost.

This applies to billionaires who are self-made and started a business. The average Consistent billionaire starts their business in their 30s, which agrees with the articles saying the average billionaire was in their 30s before starting their business.

Age at Start

Does the Ghost billionaire starting his/her business at least two years earlier than everyone else say something about younger entrepreneurs being less likely to sustain their wealth? Probably. However, if you look at the Newbies, they mostly started out young too. The question is: Will the average Newbie end up a Ghost or has the playing field changed in the past few years?  We can answer that in a few years. 🙂

The top three sectors that produce the highest percentage of Consistent billionaires are Telecoms, Fashion and Diversified portfolios.

Consistent Sectors

Looks extremely mainstream, right? But Fashion? Really?

Note: Fashion and Retail here does not mean general retail. It means businesses retailing fashion merchandise, like Zara, H&M etc.

African billionaires are the most likely to be Consistent billionaires

Close to 70% of African billionaires are Consistent – more than any other region in the world. The region that comes closest is North America with 53%.

Consistent Region

In the Newbie Era, however, Asia seems to be dominating every other region and this number is mostly driven by China. In fact, over 50% of Chinese billionaires joined the list during this period.

On the other hand, Middle Eastern billionaires are the most likely to be Ghosts. I know what you’re thinking. Oil prices, right? Probably. However, most Middle Eastern billionaires have diversified portfolios.

There are more billionaires with a PhD than there are dropouts.

This is my favorite.

This applies to all other degrees like MBA, MSc etc. Only professional degrees like Law or Medicine have fewer billionaires than dropouts. However, in the Newbie and Hustler categories, there are even more people with a professional degree than there are dropouts.

Billionaire Degree

11% of Consistent billionaires are female.

Female Billionaires

The only category with a more encouraging female-to-male ratio is the Newbie category with about 16 percent. However, given that the global male-to-female ratio is 50:50, the Newbie category is still 34 percentage points short. The good news is things are getting better. A woman is close to two times more likely to be a billionaire since 2015 than before that.

64% of Consistent billionaires are self-made.

Self Made Billionaires

The only category with a lower percentage is The Ghost. The good news (or bad news – depending on where you hope your wealth would come from) is that the Newbie billionaire has a higher percentage than that. This means that in recent times, more “new” wealth is being generated. Also, it seems being self-made isn’t a peculiar thing seeing as each category has over 60% of their billionaires being self-made.

Cool, Now What?

The billionaires we all know and love are well-educated and frankly, generally boring.

How much does this matter if you want to become a Consistent billionaire?

To answer that, we will do a bit of Machine Learning (bear with me here, it might get a little technical). Using the h2o.ai machine learning package (which I love!), we will train models to predict what category a billionaire will fall into. We will do this for all the categories except The Newbie because, unlike the others, all that distinguishes this group is when they joined the list and not their performance while on it. We will also use only truly independent variables to train our models. For example, a variable that was used to create the categories, like the number of times they left the list, won’t be used. Using variables like that would be like knowing the answer and working backward, right? We will then check which variables were the best at predicting a billionaire’s category to answer our question. The code is also available in the same script shared above.

I will first use the purrr and h2o packages to find the best algorithm between Gradient Boosting Machines, Random Forest, and Deep Learning.

Models

Looks like the accuracy of the GBM algorithm on the test set beats the other machine learning algorithms.

Let’s check what variables GBM considers most important in predicting a billionaire’s category.

Variable importance

We see three variables above 50% relative importance: Country, Sector and the founding year of the company that got them their wealth.

What does this tell us about Consistent billionaires? For one, it says that while the Consistent may be well educated, that’s certainly not what got them there. It’s not shocking that Country and Sector are important variables but “founding_year” is intriguing. It could mean that it may be getting easier or harder to build a sustainable business.

Again, pretty straightforward and boring. Be in an enabling environment at the right time for the sector you play in and BOOM! You make sustainable wealth. At this point, I feel I am obligated to say that 84% of technology billionaires are in North America and Asia. There are currently none from Africa (See sentence above about an enabling environment for your sector) but then again, you can be the pioneer so take my advice with a bag of salt. Good luck!

Things to Keep in Mind

  • The data was gotten from Forbes. This means that I am inherently constrained by their methods, estimates, and errors. For example, the data says there is only one billionaire from Politics. I’d rather diezani than believe that’s true.
  • At the end of the day, I ended up with over 30 variables and I cannot talk about all of them in one post, so here are some visualizations for you to play around and find out for yourself how to become a Consistent billionaire. 😉
  • Want to find out who the Consistent billionaires are? Find out using the full data set here.
  • In my next post, I am going to address what sectors, countries and founding years are the best for becoming a consistent billionaire.
  • I have a LITTLE surprise. 🙂