Lab 05 - Take a sad plot and make it better

Given below are two data visualizations that violate many data visualization best practices. Improve these visualizations using R and the tips for effective visualizations that we introduced in class. For exercises 4 and 6, you should produce one visualization per dataset. Your visualization should be accompanied by a brief paragraph describing the choices you made in your improvement, specifically discussing what you didn’t like in the original plots and why, and how you addressed them in the visualization you created.

In class on 6 October, you will give a brief presentation describing one of your improved visualizations and the reasoning for the choices you made. For this, it’s fine to just step through your markdown explaining the plot and code.

Learning goals

Getting started

Go to the course GitHub organization and locate your repo. Clone it in RStudio and open the R Markdown document. Knit the document to make sure it compiles without errors.

Warm up

Before we introduce the data, let’s warm up with some simple exercises. Update the YAML of your R Markdown file with your information, knit, commit, and push your changes. Make sure to commit with a meaningful commit message. Then, go to your repo on GitHub and confirm that your changes are visible in your Rmd and md files. If anything is missing, commit and push again.

Packages

We’ll use the tidyverse package for much of the data wrangling and visualisation. The data lives in the dsbox package. Either load that package or, if it is not available to you, read the data from the CSV files in the lab repo.

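library(tidyverse) # also attaches readr, which provides read_csv()

# Option 1: if dsbox is installed, load it; the datasets come with the package
library(dsbox)

# Option 2: otherwise, read the copies of the data stored in the lab repo
instructors <- read_csv("data/instructors.csv")
fisheries <- read_csv("data/fisheries.csv")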

Data

The datasets we’ll use are called instructors and fisheries, and they come from the dsbox package. If you can load dsbox, the datasets become available as soon as the package is loaded. Otherwise, read in the data from the CSV files. You can find out more about the datasets by inspecting their documentation, which you can access (with dsbox loaded) by running ?instructors and ?fisheries in the Console or by using the Help menu in RStudio to search for instructors or fisheries. You can also find this information here and here.
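The "package if available, CSV otherwise" choice can also be made in code. The following is a minimal sketch, not part of the lab template: the helper name `load_lab_data()` is hypothetical, and it assumes the CSV copies sit in a `data/` folder named after each dataset, as in the `read_csv()` calls above.

```r
# Hypothetical helper: return a lab dataset from dsbox when the package is
# installed, and fall back to the CSV copy in the repo otherwise.
load_lab_data <- function(name, dir = "data") {
  if (requireNamespace("dsbox", quietly = TRUE)) {
    # Same lookup that dsbox::fisheries performs, but with the name as a string
    getExportedValue("dsbox", name)
  } else {
    # Assumption: the file is called <name>.csv inside `dir`
    readr::read_csv(file.path(dir, paste0(name, ".csv")), show_col_types = FALSE)
  }
}
```

With this in place, `fisheries <- load_lab_data("fisheries")` and `instructors <- load_lab_data("instructors")` would work on either setup.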

Exercises

Fisheries

The Fisheries and Aquaculture Department of the Food and Agriculture Organization of the United Nations collects data on the fisheries production of countries. This Wikipedia page lists the fishery production of countries for 2016. For each country, the tonnage from capture and from aquaculture is listed. Note that countries whose total harvest was less than 100,000 tons are not included in the visualization.

A researcher shared with you the following visualization they created based on these data. 😳

  1. Can you help them improve it? First, brainstorm how you would improve it. It’s ok if some of your improvements are aspirational, i.e. you don’t know how to implement it, but you think it’s a good idea.

Load the data.

fisheries
## # A tibble: 75 x 3
##    country    capture aquaculture
##    <chr>        <dbl>       <dbl>
##  1 Algeria     126259         368
##  2 Angola      240000          NA
##  3 Argentina   931472        2430
##  4 Australia   245935       47087
##  5 Bangladesh 1333866      882091
##  6 Brazil      750283      257783
##  7 Cambodia    384000       26000
##  8 Canada     1080982      154083
##  9 Chile      4330325      698214
## 10 Colombia    121000       60072
## # ... with 65 more rows

  2. Create a new data visualisation for these data that implements the improvements you proposed in the previous exercise (or as many of them as you can).
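
One possible direction, offered only as a starting sketch and not as the expected answer: reshape the data to long format and draw horizontal stacked bars ordered by total harvest. To keep the example self-contained it uses only the ten rows printed above (the object name `fisheries_sample` is made up for this purpose). Treating a missing aquaculture value as zero is an assumption you may want to handle differently.

```r
library(tidyverse)

# First ten rows of `fisheries`, copied from the printed output above.
# In the lab you would use the full 75-row dataset instead.
fisheries_sample <- tribble(
  ~country,     ~capture, ~aquaculture,
  "Algeria",      126259,          368,
  "Angola",       240000,           NA,
  "Argentina",    931472,         2430,
  "Australia",    245935,        47087,
  "Bangladesh",  1333866,       882091,
  "Brazil",       750283,       257783,
  "Cambodia",     384000,        26000,
  "Canada",      1080982,       154083,
  "Chile",       4330325,       698214,
  "Colombia",     121000,        60072
)

fisheries_long <- fisheries_sample %>%
  mutate(
    # Assumption: a missing aquaculture value means no recorded production
    aquaculture = replace_na(aquaculture, 0),
    total = capture + aquaculture
  ) %>%
  # One row per country and production method, which is what ggplot2 expects
  pivot_longer(
    cols = c(capture, aquaculture),
    names_to = "method",
    values_to = "tons"
  )

fisheries_plot <- ggplot(
  fisheries_long,
  aes(x = tons / 1e6, y = fct_reorder(country, total), fill = method)
) +
  geom_col() +
  labs(
    x = "Production (millions of tons)",
    y = NULL,
    fill = "Method",
    title = "Fishery production by country, 2016",
    subtitle = "First ten countries in the dataset, ordered by total harvest"
  ) +
  theme_minimal()

fisheries_plot
```

Ordering the bars by total and putting country names on the vertical axis keeps the labels readable. With all 75 countries you would probably also want to show only the largest producers or switch to a log scale.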

🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards. Then review the md document on GitHub to make sure you’re happy with the final state of your work.

Wrapping up

Go back through your write-up to make sure you’re following the coding style guidelines we discussed in class. Make any edits as needed.

Also, make sure all of your R chunks are properly labelled, and your figures are reasonably sized.
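
As a rough illustration of figure sizing, default figure dimensions can be set once for the whole document with knitr, typically in a setup chunk near the top of the R Markdown file. The numbers below are illustrative assumptions, not course requirements. Chunk labels go in each chunk header, for example `{r fisheries-plot}` (a made-up label).

```r
# Set document-wide defaults for every figure produced by later chunks.
# The values are placeholders; adjust them to suit your plots.
knitr::opts_chunk$set(
  fig.width = 7,     # figure width in inches
  fig.height = 4,    # figure height in inches
  out.width = "80%"  # displayed width relative to the page
)
```

Individual chunks can still override these defaults in their own headers when a particular plot needs more room.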

Once the last team member for the week pushes their final changes, others should pull the changes and knit the R Markdown document to confirm that they can reproduce the report.

More ugly charts

Want to see more ugly charts?

Rubric

25 points total.

* 5 questions @ 3 points for correct and complete answers
* 5 points for GitHub commit history
* 5 points for coding style: R chunks are properly labelled and figures are reasonably sized