The Rise of Plastics in Our Oceans: A Guide to R Shiny
Chapter 1: Introduction to Ocean Plastics and R Shiny
In this article series, we will delve into the growing issue of plastics in our oceans while learning to create a web application using R Shiny. A recent publication in the journal Nature detailed a study by marine scientists who examined data from Continuous Plankton Recorders (CPRs) dating back to the 1930s. These devices, which are towed by ships, are designed to collect plankton for research, but since 1957 they have also inadvertently captured plastic debris.
That year, a CPR was found to contain a man-made plastic item for the first time, opening a new avenue of research. The scientists compiled a comprehensive dataset documenting instances of plastics found in CPRs across the North Atlantic, Arctic, and Northern European seas, from that first discovery in 1957 through 2016. The dataset contains 208 records of various plastic types, including fishing nets and plastic bags, and is now publicly accessible.
As I explored this dataset, I recognized its potential as a teaching tool for developing a data exploration app using R Shiny. Its manageable size and diverse data types—dates, descriptive text, and geographical coordinates—make it an excellent resource for learning.
To fully engage with this series, you will need some foundational knowledge of data manipulation in R and familiarity with RStudio. Prior experience with R Markdown, Shiny, and ggplot2 is beneficial but not mandatory. Before you begin, ensure you have the following packages installed: shiny, rmarkdown, ggplot2, tidyverse, and leaflet. The data preparation script in this article also loads openxlsx.
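One way to check for these packages before starting is sketched below. The package list is taken from the paragraph above, plus openxlsx, which the data preparation script later in this article loads. The CRAN mirror URL is an assumption, and the install step is guarded so that it only runs in an interactive session.

```r
# Packages named in this article (openxlsx is used by the data preparation script)
required_pkgs <- c("shiny", "rmarkdown", "ggplot2", "tidyverse", "leaflet", "openxlsx")

# requireNamespace() returns TRUE when a package can be loaded, FALSE otherwise
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Only install when running interactively, to avoid surprise installs in scripts
  if (interactive()) {
    install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
  }
} else {
  message("All required packages are installed.")
}
```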
In this series, you will learn to:
- Design a user-friendly web dashboard in R Markdown and prepare your data.
- Understand the reactivity concepts in Shiny apps and how inputs and outputs interact.
- Create interactive charts that respond to user input.
- Visualize coordinate data on maps.
- (Advanced) Generate animated time series graphics.
- Publish your app for public access.
To view the final product of this series, visit [here](#). The complete code will also be available on GitHub [here](#). Below is a glimpse of an animated timeline showcasing all recorded instances of plastics discovered in CPRs since 1957.
Chapter 2: Understanding Shiny
Shiny is an open-source R package that offers a sophisticated framework for building web applications using R. It is particularly valuable for analysts who receive frequent requests for statistics based on confidential data. Instead of manually performing analyses for each request, an analyst can develop a self-service app where users can select filters and view results instantly.
Using Shiny, R programmers can facilitate data exploration via the web, allowing developers to control what data is accessible and in what format. Furthermore, Shiny simplifies the process of publishing reactive analytics without the need for extensive knowledge of JavaScript.
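To make the self-service idea concrete, here is a minimal sketch of a Shiny app, not the dashboard built in this series. The `toy` data frame and the input and output names are hypothetical stand-ins for a real dataset. The user picks a filter and the table updates without any JavaScript being written.

```r
library(shiny)

# Hypothetical toy data standing in for a real dataset
toy <- data.frame(
  region = c("North Sea", "North Sea", "Arctic", "Arctic"),
  year   = c(1990, 2005, 1998, 2012)
)

# UI: one filter input and one table output
ui <- fluidPage(
  selectInput("region", "Region", choices = unique(toy$region)),
  tableOutput("records")
)

# Server: the table re-renders whenever input$region changes
server <- function(input, output, session) {
  output$records <- renderTable({
    toy[toy$region == input$region, ]
  })
}

app <- shinyApp(ui, server)
# In an interactive session, runApp(app) launches the app in a browser
```

Only the rows the developer chooses to expose through `renderTable()` ever reach the user, which is how access to the underlying data stays under the developer's control.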
Designing Your Dashboard
The design of your dashboard should be aligned with its intended use. If the app is aimed at a broad audience, it's crucial to adhere to good UX design principles and gather user feedback. However, since this project is more exploratory, we can tailor the design based on our preferences.
To kick off our design process, we should analyze the dataset to identify what information might engage users. Here’s a brief overview of the CPR data we will work with:
The 'Observation' column reveals the type of plastic detected, allowing us to categorize the data (similar to the analysis in the Nature article). The 'Year of tow' column offers a timeline perspective, while coordinates help us visualize the locations of plastic findings. Additionally, maritime region names can be useful for filtering purposes.
We plan to incorporate the following features in our app:
- Filtering options based on plastic type and maritime region.
- Year range filters for specific analyses.
- Basic statistical insights on plastic incidences in CPRs, guided by the Nature article's analysis.
- Geographic visualization of incidents over time.
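The filtering features listed above could be backed by a small helper like the one sketched here. This is an illustration in base R under stated assumptions: the function name `filter_incidents`, the column names `type`, `region`, and `year`, and the toy records are all hypothetical and do not come from the CPR dataset.

```r
# Keep rows matching the chosen plastic types, regions, and year range.
# A NULL argument means "do not filter on this field".
filter_incidents <- function(df, types = NULL, regions = NULL, year_range = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(types)) keep <- keep & df$type %in% types
  if (!is.null(regions)) keep <- keep & df$region %in% regions
  if (!is.null(year_range)) {
    keep <- keep & df$year >= year_range[1] & df$year <= year_range[2]
  }
  df[keep, , drop = FALSE]
}

# Hypothetical records, for illustration only
toy <- data.frame(
  type   = c("Netting", "Bag", "Rope", "Netting"),
  region = c("North Sea", "Irish Sea", "North Sea", "Norwegian Sea"),
  year   = c(1965, 1988, 2003, 2014),
  stringsAsFactors = FALSE
)

# Netting incidents between 2000 and 2016
res <- filter_incidents(toy, types = "Netting", year_range = c(2000, 2016))
print(res)
```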
Our dashboard will feature a GLOBAL SIDEBAR that remains visible throughout the app, providing context and filtering options. There will be three separate pages focusing on different data aspects:
- STATS: Showcasing descriptive statistics and filtering options.
- LOCATIONS: Mapping the incident sites.
- TIME: Visualizing the temporal distribution of incidents.
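The layout described above can be roughed out as follows. This sketch uses plain Shiny layout functions as a stand-in. The series itself builds the dashboard in R Markdown, so treat this only as a picture of the structure. The input IDs, labels, placeholder text, and the "All" choices are assumptions, and the slider bounds simply reuse the 1957 to 2016 span of the dataset.

```r
library(shiny)

ui <- fluidPage(
  titlePanel("Plastics in CPRs"),
  sidebarLayout(
    # Global sidebar: stays visible whichever page is selected
    sidebarPanel(
      selectInput("type", "Plastic type", choices = c("All")),
      selectInput("region", "Maritime region", choices = c("All")),
      sliderInput("years", "Year of tow",
                  min = 1957, max = 2016, value = c(1957, 2016), sep = "")
    ),
    # Three pages, one per aspect of the data
    mainPanel(
      tabsetPanel(
        tabPanel("STATS", p("Descriptive statistics go here.")),
        tabPanel("LOCATIONS", p("Map of incident sites goes here.")),
        tabPanel("TIME", p("Timeline of incidents goes here."))
      )
    )
  )
)

# Empty server: this sketch only demonstrates the layout
server <- function(input, output, session) {}

app <- shinyApp(ui, server)
```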
Preparing the Data for the App
With our dashboard design outlined, the next step is to prepare the dataset for application use. This involves two key actions.
First, we need to enrich the dataset by adding columns necessary for our planned analyses. Although the dataset is fairly complete, we need to categorize the 'Observation' data and streamline the column names for easier manipulation.
To get started, create a new RStudio project named cpr_data and a subfolder called data for the researchers' original xlsx file. We will write a script to add the 'type' column and clean up the column names. Below is a sample script to prepare the data:
```r
# Prepare data for CPR app

# Load libraries
library(dplyr)
library(openxlsx)

# Load original data file and simplify column names
data <- openxlsx::read.xlsx("data/Supplementary_Data_1.xlsx", sheet = 1, startRow = 2)
colnames(data) <- gsub("[.]", "", colnames(data)) %>% tolower()
colnames(data)[grepl("region", colnames(data))] <- "region"

# Classify incidents by key terms
data <- data %>%
  dplyr::mutate(
    type = dplyr::case_when(
      grepl("net", observation, ignore.case = TRUE) ~ "Netting",
      grepl("line|twine|fishing", observation, ignore.case = TRUE) ~ "Line",
      grepl("rope", observation, ignore.case = TRUE) ~ "Rope",
      grepl("bag|plastic", observation, ignore.case = TRUE) ~ "Bag",
      grepl("monofilament", observation, ignore.case = TRUE) ~ "Monofilament",
      grepl("string|cord|tape|binding|fibre", observation, ignore.case = TRUE) ~ "String",
      TRUE ~ "Unclassified"
    )
  )

# Save as an RDS file
saveRDS(data, "data/data.RDS")
```
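In this script, grepl() identifies specific terms within the 'Observation' column, and dplyr::case_when() assigns the corresponding type. Because case_when() checks its conditions in order and uses the first one that matches, the order of the rules matters when an observation mentions more than one term. If no terms match, the type defaults to "Unclassified." We also standardize column names to simple lowercase strings for easier coding.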
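The two steps of the script can be tried on a few made-up values without the xlsx file. In this sketch the raw column names and the observation strings are hypothetical, chosen only to show how the name cleanup and the first-match behaviour of case_when() play out. The rules inside `classify_observation` (a helper name introduced here) are copied from the script above.

```r
library(dplyr)

# Column names as they might look after read.xlsx(), which by default
# replaces spaces in headers with dots (these example names are assumptions)
raw_names <- c("Observation", "Year.of.tow", "Region.name")
clean_names <- tolower(gsub("[.]", "", raw_names))
clean_names[grepl("region", clean_names)] <- "region"
print(clean_names)

# Same rules, in the same order, as the preparation script
classify_observation <- function(observation) {
  dplyr::case_when(
    grepl("net", observation, ignore.case = TRUE) ~ "Netting",
    grepl("line|twine|fishing", observation, ignore.case = TRUE) ~ "Line",
    grepl("rope", observation, ignore.case = TRUE) ~ "Rope",
    grepl("bag|plastic", observation, ignore.case = TRUE) ~ "Bag",
    grepl("monofilament", observation, ignore.case = TRUE) ~ "Monofilament",
    grepl("string|cord|tape|binding|fibre", observation, ignore.case = TRUE) ~ "String",
    TRUE ~ "Unclassified"
  )
}

# Hypothetical observation strings, for illustration only
examples <- c("Green fishing net", "Blue rope fragment", "Plastic bag",
              "Monofilament line", "Unknown item")
types <- classify_observation(examples)
print(data.frame(observation = examples, type = types))
```

Note that "Green fishing net" is typed as "Netting" even though it also contains "fishing", and "Monofilament line" is typed as "Line" because the "line" rule is checked before the "monofilament" rule.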
Second, we need to save the transformed dataset, which the last line of the script does. Once our app is complete, this data file will be bundled with it for user access. Given the small size of our dataset, a straightforward file format is sufficient, so we save it as an R object in an RDS file.
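As a quick illustration of why RDS is convenient, the sketch below saves a small data frame and reads it back with column types intact. The `prepared` data frame is a hypothetical stand-in for the real dataset, and a temporary directory is used so that nothing in the project is overwritten.

```r
# Hypothetical stand-in for the prepared CPR data
prepared <- data.frame(
  observation = c("Fishing net", "Plastic bag"),
  type        = c("Netting", "Bag"),
  stringsAsFactors = FALSE
)

# Write to a temporary location instead of data/data.RDS
rds_path <- file.path(tempdir(), "data.RDS")
saveRDS(prepared, rds_path)

# An app would load the file the same way, typically once at startup
restored <- readRDS(rds_path)

# Compare the restored object with the original
print(all.equal(prepared, restored))
```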
Next Time…
Now that we have our design and data prepared, the next installment will focus on building the basic structure of the dashboard. We will also explore managing user inputs, reactive variables, and creating foundational descriptive plots using ggplot2 that respond dynamically to user selections.
Exercises
Here are some exercises to reinforce your understanding of this article:
- What is R Shiny, and how can it be utilized with this dataset?
- How would you approach designing this dashboard for a large, diverse user base?
- What considerations are important when using a local dataset in an R Shiny app?
- After reading the Nature article that inspired this dataset, what additional design ideas might you propose for the dashboard?
Originally, I was a Pure Mathematician who transitioned to Psychometrics and Data Science. I am dedicated to applying the rigor of these disciplines to complex human questions. I am also a coding enthusiast and a fan of Japanese RPGs. You can connect with me on LinkedIn or Twitter.