TidyTuesday – Media Franchises

I finally completed the Data Science in R track on DataCamp, which means I have more time to do, and blog about, data analytics and data science. To start, I chose to participate in the latest TidyTuesday with a small exploratory data analysis. This week's topic is all about media franchise powerhouses. So let's dive in and take a look at the data set.

library(tidyverse)
media_franchises <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-02/media_franchises.csv")
glimpse(media_franchises)

As usual for TidyTuesday, the focus is on the tidyverse, so I load that package before doing anything else.

Observations: 321
Variables: 7
$ franchise        <chr> "A Song of Ice and Fire /  Game of Thrones", "A Song of Ice and Fire / …
$ revenue_category <chr> "Book sales", "Box Office", "Home Video/Entertainment", "TV", "Video Ga…
$ revenue          <dbl> 0.900, 0.001, 0.280, 4.000, 0.132, 0.760, 1.000, 0.500, 0.447, 2.200, 0…
$ year_created     <dbl> 1996, 1996, 1996, 1996, 1996, 1992, 1992, 1992, 1992, 1992, 2009, 2009,…
$ original_media   <chr> "Novel", "Novel", "Novel", "Novel", "Novel", "Animated film", "Animated…
$ creators         <chr> "George R. R. Martin", "George R. R. Martin", "George R. R. Martin", "G…
$ owners           <chr> "Random House WarnerMedia (AT&T)", "Random House WarnerMedia (AT&T)", "…

A glimpse into the data shows a little more than a handful of features and 321 observations. In this rather small data set, every observation represents a unique combination of a franchise and a revenue category, be it book sales or cinema box office revenue. As Disney is currently on a big shopping spree across the industry, I want to examine its position in the market and check the diversity of its revenue streams across franchises.

Disney, the biggest player

I already spotted a minor problem with the data. The franchise owners are mostly split into subsidiaries, so that, for example, Marvel and Star Wars appear as different entities even though both basically belong to Walt Disney. The same goes for Sony and other media houses, like AT&T. I identified some of the biggest players and summarised them under their parent company.

Before creating an overview of the biggest parent companies in the media industry, I want to check whether some of the bigger ones cooperate on franchises. For this, I assign a dummy variable for each company and add them up. Every franchise that is a cooperation should then have a value of two and can easily be filtered.

library(kableExtra)

# Start with one dummy column per parent company, all set to 0
media_franchises_company <- media_franchises %>% 
  mutate(is_WaltDisney = 0,
         is_Sony = 0,
         is_ATT = 0,
         is_Nintendo = 0,
         is_Shueisha = 0)

# Set the dummy to 1 wherever the owners string mentions the company
media_franchises_company[grep("Walt Disney", media_franchises$owners),"is_WaltDisney"] <- 1
media_franchises_company[grep("Sony", media_franchises$owners),"is_Sony"] <- 1
media_franchises_company[grep("AT&T", media_franchises$owners),"is_ATT"] <- 1
media_franchises_company[grep("Nintendo", media_franchises$owners),"is_Nintendo"] <- 1
media_franchises_company[grep("Shueisha", media_franchises$owners),"is_Shueisha"] <- 1

# Columns 8 to 12 are the five dummies; their row sum counts the companies involved
media_franchises_company$cooperation <- rowSums(media_franchises_company[,8:12])
# Show the original seven columns plus the count for rows with two companies
filter(media_franchises_company[,c(1:7,13)], cooperation == 2)

We can see that only the Spider-Man movies are made in cooperation between Disney and Sony. The Marvel Cinematic Universe rows show up as well, because their owner entry mentions Sony for the Spider-Man films. This shouldn't have a big impact on revenue, but I will nevertheless treat this cooperation separately and give it a unique name. Below, you can see an overview of the cooperative Spider-Man rows, straight from the original data.

name_of_franchise | revenue_category | revenue | year_created | original_media | creators | The_current_owners_of_the_source_materials_rights | cooperation
Marvel Cinematic Universe | Box Office | 21.000 | 2008 | Film | Marvel Studios | Walt Disney Studios (The Walt Disney Company) (franchise) Sony (Spider-Man films) | 2
Marvel Cinematic Universe | Home Video/Entertainment | 5.000 | 2008 | Film | Marvel Studios | Walt Disney Studios (The Walt Disney Company) (franchise) Sony (Spider-Man films) | 2
Marvel Cinematic Universe | Merchandise, Licensing & Retail | 5.000 | 2008 | Film | Marvel Studios | Walt Disney Studios (The Walt Disney Company) (franchise) Sony (Spider-Man films) | 2
Spider-Man | Box Office | 6.000 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | Comic or Manga | 1.000 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | Home Video/Entertainment | 1.150 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | Merchandise, Licensing & Retail | 14.000 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | Music | 0.212 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | TV | 0.880 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
Spider-Man | Video Games/Games | 1.308 | 1962 | Comic book | Stan Lee Steve Ditko | Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films) | 2
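
The dummy-variable check can also be written in a more vectorised way. The following is only a minimal sketch, not the code used for this post: it assumes stringr's str_detect() with fixed() patterns, and it runs on a small toy tibble. The first two owner strings are copied from the table above; the third row is a made-up placeholder.

```r
library(dplyr)
library(stringr)

# Toy rows: two owner strings from the table above plus one made-up placeholder
toy <- tibble(
  franchise = c("Spider-Man", "Marvel Cinematic Universe", "Hypothetical franchise"),
  owners = c(
    "Marvel Entertainment (The Walt Disney Company) (franchise) Sony (films)",
    "Walt Disney Studios (The Walt Disney Company) (franchise) Sony (Spider-Man films)",
    "Some other owner"
  )
)

# One literal search string per parent company, named after the dummy column
patterns <- c(is_WaltDisney = "Walt Disney", is_Sony = "Sony",
              is_ATT = "AT&T", is_Nintendo = "Nintendo", is_Shueisha = "Shueisha")

# Add a 0/1 column per company, then count how many companies match each row
flags <- toy
for (nm in names(patterns)) {
  flags[[nm]] <- as.integer(str_detect(flags$owners, fixed(patterns[[nm]])))
}
flags$cooperation <- rowSums(flags[names(patterns)])

# Rows owned by exactly two of the tracked companies
coop <- filter(flags, cooperation == 2)
```

Selecting the dummy columns by name instead of by position (8:12) keeps the sum correct even if columns are added or reordered later.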

Is Disney the biggest player in the market?

The plot below shows that Disney leads with no real competition. However, this chart doesn't reflect the current state; it sums up all franchises across all past years. Including the development over time would be very interesting, but that is nothing the data can provide.

# Map the dummy columns to a single company label; the first matching rule wins
media_franchises_company <- media_franchises_company %>% 
  mutate(company = case_when(
    is_WaltDisney == 1 & is_Sony == 1 ~ "Disney & Sony",
    is_WaltDisney == 1 ~ "Walt Disney",
    is_Sony == 1 ~ "Sony",
    is_ATT == 1 ~ "ATT",
    is_Shueisha == 1 ~ "Shueisha",
    is_Nintendo == 1 ~ "Nintendo",
    TRUE ~ "Other"))

n_other <- media_franchises_company %>% 
  filter(company == "Other") %>% 
  group_by(owners) %>% 
  summarise(revenue = sum(revenue)) %>% 
  arrange(desc(revenue)) %>% 
  nrow()
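
The plotting code for the company comparison is not part of the listing above. A minimal sketch of how such a chart could be built with ggplot2 might look like the following. It assumes a data frame with company and revenue columns, as in media_franchises_company; the toy tibble below is only a stand-in that reuses a few revenue values from the output shown earlier, and the axis label is my own choice.

```r
library(dplyr)
library(ggplot2)

# Stand-in for media_franchises_company with a few values from the output above
toy <- tibble(
  company = c("ATT", "ATT", "ATT", "Disney & Sony", "Disney & Sony"),
  revenue = c(0.900, 0.280, 4.000, 21.000, 5.000)
)

# Total revenue per company
company_revenue <- toy %>%
  group_by(company) %>%
  summarise(revenue = sum(revenue))

# Horizontal bar chart, companies ordered by total revenue
p <- ggplot(company_revenue, aes(x = reorder(company, revenue), y = revenue)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Total revenue")
```

With the full data, replacing toy by media_franchises_company should give one bar per company label, including "Other".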

Disney's revenue diversity

Now, my big question is how well the revenue stream is diversified, or how much Disney depends on one type of income. Therefore, I want to depict every franchise, its revenue and where the money comes from. But first things first: I structured the revenue categories into three main groups and colored them to provide a better overview. For the coloring, I got inspired by the great SMYCK Color Scheme from John Paul Bader, aka Hukl. I plan to turn these colors into a full-fledged theme.

  1. Everything motion picture (Cinema, Home Video, TV) – Red
  2. Secondary revenue (Merch, Music, Video Games) – Green
  3. Written (Books, Comics/Manga) – Yellow
walt_disney <- media_franchises_company %>% 
  filter(company == "Walt Disney" | company == "Disney & Sony") %>% 
  mutate(revenue_category = as.factor(revenue_category))

walt_disney$revenue_category <- fct_recode(walt_disney$revenue_category, 
                                            "Books" = "Book sales",
                                            "Box Office" = "Box Office",
                                            "Comics" = "Comic or Manga",
                                            "Home Video" = "Home Video/Entertainment",
                                            "Merch" = "Merchandise, Licensing & Retail",
                                            "Music" = "Music",
                                            "TV" = "TV",
                                            "Video Games" = "Video Games/Games")

walt_disney$revenue_category <- fct_relevel(walt_disney$revenue_category,
                                            "Box Office","Home Video", "TV",
                                            "Merch", "Music", "Video Games",
                                            "Books", "Comics")

cp <- c("#e66350", "#ba5041", "#783329", "#addb46",
        "#87ab37", "#5b7325", "#fcd649", "#cfaf3c")
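
The three groups are only expressed through the level order and the colors above. To put a number on them, one could map each category to its group and compute the group's share of revenue. This is a rough sketch under my own assumptions: the group labels are hypothetical names, and the data is just the Spider-Man rows from the table earlier in this post, with the original category names.

```r
library(dplyr)

# Spider-Man rows from the table earlier in this post
spider_man <- tibble(
  revenue_category = c("Box Office", "Comic or Manga", "Home Video/Entertainment",
                       "Merchandise, Licensing & Retail", "Music", "TV",
                       "Video Games/Games"),
  revenue = c(6.000, 1.000, 1.150, 14.000, 0.212, 0.880, 1.308)
)

# Assign each category to one of the three groups, then compute revenue shares
group_share <- spider_man %>%
  mutate(revenue_group = case_when(
    revenue_category %in% c("Box Office", "Home Video/Entertainment", "TV") ~ "Motion picture",
    revenue_category %in% c("Merchandise, Licensing & Retail", "Music",
                            "Video Games/Games") ~ "Secondary",
    revenue_category %in% c("Book sales", "Comic or Manga") ~ "Written"
  )) %>%
  group_by(revenue_group) %>%
  summarise(revenue = sum(revenue)) %>%
  mutate(share = revenue / sum(revenue))
```

For these rows, the secondary group comes out at roughly 63 percent of the revenue, most of it merchandise.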

Finally, I can group by franchise and compare the revenue.

wd_franchise_revenue <- walt_disney %>% 
  group_by(franchise) %>% 
  summarise(revenue_franchise = sum(revenue))

walt_disney_plot <- walt_disney %>% 
  left_join(wd_franchise_revenue, by = "franchise")
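
The chart code itself is not listed here. A minimal sketch of a stacked bar chart per franchise, assuming ggplot2 and a data frame shaped like walt_disney_plot, could look like this. The toy tibble is a stand-in built from a few rows of the table shown earlier, cp_subset is a hypothetical name for the three matching colors from the cp palette above, and the labels are my own choice.

```r
library(dplyr)
library(ggplot2)

# Stand-in for walt_disney_plot, built from rows of the table shown earlier
toy_plot <- tibble(
  franchise = c(rep("Marvel Cinematic Universe", 3), rep("Spider-Man", 3)),
  revenue_category = factor(
    c("Box Office", "Home Video", "Merch", "Box Office", "Home Video", "Merch"),
    levels = c("Box Office", "Home Video", "Merch")
  ),
  revenue = c(21.000, 5.000, 5.000, 6.000, 1.150, 14.000)
) %>%
  group_by(franchise) %>%
  mutate(revenue_franchise = sum(revenue)) %>%
  ungroup()

# The three colors from cp that belong to the categories used in this toy data
cp_subset <- c("Box Office" = "#e66350", "Home Video" = "#ba5041", "Merch" = "#addb46")

# One stacked bar per franchise, ordered by total franchise revenue
p <- ggplot(toy_plot,
            aes(x = reorder(franchise, revenue_franchise), y = revenue,
                fill = revenue_category)) +
  geom_col() +
  coord_flip() +
  scale_fill_manual(values = cp_subset) +
  labs(x = NULL, y = "Revenue", fill = "Revenue category")
```

Ordering the bars by revenue_franchise is the reason for joining the franchise totals back onto the rows in the step above.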

Based on this chart, it becomes obvious that Disney is mostly a toy company. The largest part of the revenue from their biggest franchise was generated by merch, aka toys and T-shirts. Some things are notable nevertheless: revenue from the Marvel franchises is mainly generated by motion-picture sales.

I think the biggest problem with this data is its lack of a time dimension. Huge revenue contributors such as Star Wars already generated a lot of money before they joined the Disney family. The same goes for the X-Men, which until recently belonged to Fox. I believe that if we looked at the diversity across time, we would see a steady shift in importance from merch to movies. Sadly, this data only shows the sum of all revenue ever generated.

Thanks for reading!
