Tidy Tuesday Week 1

I plan to take part in Tidy Tuesday every week to practice my tidyverse skills, and I will post the graphs here to track my progress!

I tried to use pipes as much as I could this time around, since I am trying to become more of a functional programmer! If you are curious about Tidy Tuesday, check out the GitHub repo! Let me know if you have any questions or suggestions! Here is week 2 of 2019:

library(tidyverse)
library(gganimate)
library(patchwork)
library(ggridges)
library(lubridate)
library(viridis)
library(magrittr)
library(knitr)
library(kableExtra)

data <- read_csv("../data/Tidytuesday/IMDb_Economist_tv_ratings.csv")
summary(data) %>% kable(caption = "Fig 1: Summary of The Data") %>%
  kable_styling(bootstrap_options =  "hover")
Table 1: Fig 1: Summary of The Data

| titleId | seasonNumber | title | date | av_rating | share | genres |
|---|---|---|---|---|---|---|
| Length:2266 | Min. : 1.000 | Length:2266 | Min. :1990-01-03 | Min. :2.704 | Min. : 0.00 | Length:2266 |
| Class :character | 1st Qu.: 1.000 | Class :character | 1st Qu.:2007-01-22 | 1st Qu.:7.731 | 1st Qu.: 0.10 | Class :character |
| Mode :character | Median : 2.000 | Mode :character | Median :2012-12-07 | Median :8.115 | Median : 0.32 | Mode :character |
| NA | Mean : 3.264 | NA | Mean :2010-11-06 | Mean :8.061 | Mean : 1.28 | NA |
| NA | 3rd Qu.: 4.000 | NA | 3rd Qu.:2016-03-08 | 3rd Qu.:8.490 | 3rd Qu.: 1.09 | NA |
| NA | Max. :44.000 | NA | Max. :2018-10-10 | Max. :9.682 | Max. :55.65 | NA |

head(data) %>% kable(caption = "Fig 2: Extract from Data") %>%
  kable_styling(bootstrap_options =  "hover")
Table 1: Fig 2: Extract from Data

| titleId | seasonNumber | title | date | av_rating | share | genres |
|---|---|---|---|---|---|---|
| tt2879552 | 1 | 11.22.63 | 2016-03-10 | 8.4890 | 0.51 | Drama,Mystery,Sci-Fi |
| tt3148266 | 1 | 12 Monkeys | 2015-02-27 | 8.3407 | 0.46 | Adventure,Drama,Mystery |
| tt3148266 | 2 | 12 Monkeys | 2016-05-30 | 8.8196 | 0.25 | Adventure,Drama,Mystery |
| tt3148266 | 3 | 12 Monkeys | 2017-05-19 | 9.0369 | 0.19 | Adventure,Drama,Mystery |
| tt3148266 | 4 | 12 Monkeys | 2018-06-26 | 9.1363 | 0.38 | Adventure,Drama,Mystery |
| tt1837492 | 1 | 13 Reasons Why | 2017-03-31 | 8.4370 | 2.38 | Drama,Mystery |

colnames(data) <- c("Id", "Season Number", "Title", "Date", "Average Rating", "Share", "Genres") 
# data %>% transmute(Date = as.Date, "%y-%m-%d")
data %<>% mutate(Year = year(Date))
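
To see what the renaming and year-extraction step does in isolation, here is a minimal sketch on a made-up three-row tibble. The `toy` data and its values are hypothetical stand-ins for the real file, and I use `rename()` instead of overwriting `colnames()`, which is just one alternative way to arrive at the same kind of column names.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Hypothetical stand-in for a few rows of the ratings data
toy <- tibble(
  title     = c("Show A", "Show A", "Show B"),
  date      = as.Date(c("2015-02-27", "2016-05-30", "2017-03-31")),
  av_rating = c(8.3, 8.8, 8.4)
)

# Rename columns by name (order-independent), then derive Year from Date
toy_renamed <- toy %>%
  rename(Title = title, Date = date, `Average Rating` = av_rating) %>%
  mutate(Year = year(Date))

print(toy_renamed)
```

Renaming by name rather than by position means the step keeps working if the column order in the CSV ever changes.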

data %>% ggplot + 
    geom_density_ridges(aes(x = `Average Rating`, y = Year, group = Year, fill= Year)) +
    scale_fill_viridis(name = "Year", direction = -1) +
    theme_bw() + 
    guides(fill = "none") +
    coord_flip() + 
    labs(title = "IMDB Rating Distributions Over the Years") 

# Keep only the ten titles with the most rows (one row per season)
data %>%
    filter(Title %in% (data %>%
        count(Title) %>%
        arrange(desc(n)) %>%
        top_n(10, n) %>%
        pull(Title))) %>%
    ggplot +
    geom_point(aes(x = Year, y = `Average Rating`, group = Title, color = Title)) + 
    geom_smooth(aes(x = Year, y = `Average Rating`,color = Title), method = "loess",fill = NA) +
    guides(color = "none") +
    labs(title = "IMDB Ratings of the 10 Longest TV Series over Time: \n{closest_state} ") +  
    transition_states(states = Title, state_length = 2)  + 
    ylab("Average IMDB Rating") + theme_bw()
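
The "ten longest series" filter above boils down to three steps: count the rows per title, keep the titles with the largest counts, and filter the data down to those titles. Here is a minimal sketch of that idea on a hypothetical toy tibble, keeping the top 2 instead of the top 10. It uses `slice_max()`, which I am assuming is available (dplyr 1.0.0 or later) as the newer counterpart of `top_n()`; the `toy` data is made up for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical data: one row per season of each title
toy <- tibble(
  Title = c("A", "A", "A", "B", "B", "C"),
  Year  = c(2001, 2002, 2003, 2010, 2011, 2015)
)

# Count seasons per title and keep the two titles with the most seasons
top_titles <- toy %>%
  count(Title) %>%
  slice_max(n, n = 2) %>%
  pull(Title)

# Keep only the rows that belong to those titles
toy_top <- toy %>% filter(Title %in% top_titles)

print(toy_top)
```

Like `top_n()`, `slice_max()` keeps ties by default, so the result can contain more titles than requested when several share the same count.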
