# Environmental Data Analysis and Visualization

On Deliberate Visualization

## Warm-up activity

• Start up R Studio and open a new script. Load the palmerpenguins packages (install it if needed)

• Using ggplot2, create a graphic that shows the relationship between two variables: body mass and bill depth

• Modify the aesthetic mapping so that the geometry is colored based on species

• With the + operator, add a layer consisting of the labs function, and give it arguments for x, y, color, and title

## Sensor of the day

Mobile phones!

epSos.de, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via Wikimedia Commons

## Sensor of the day

https://www.kaggle.com/datasets

## Order of operations

ggplot(data=penguins,aes(body_mass_g,bill_depth_mm,color=species))+
geom_point()

## Order of operations

ggplot(data=penguins)+
geom_point(aes(body_mass_g,bill_depth_mm,color=species))

## Order of operations

ggplot()+
geom_point(data=penguins,aes(body_mass_g,bill_depth_mm,color=species))

## When does this matter?

• Where you add your aesthetic mapping will determine what geometries they connect to

• Where you add your layers will determine their order of drawing

## When does this matter?

ggplot(data=gm2007,aes(x = continent, y = lifeExp, color = continent)) +
geom_boxplot() +
geom_jitter()

## When does this matter?

ggplot(data=gm2007,aes(x = continent, y = lifeExp)) +
geom_jitter(aes(color = continent)) +
geom_boxplot()

## How to make a good visualization

• Choose the right chart for the data

• Maximize the data-to-ink ratio

• Make deliberate design decisions

## Scales: Being deliberate with color

What can color show effectively?

## Scales: Being deliberate with color

ggplot(penguins,aes(body_mass_g,bill_depth_mm,color=species))+
geom_point() +
labs(x="Body Mass (g)",y="Bill Depth (mm)",title="Penguin bill size by body mass") +
scale_color_manual(values=c("Purple","Orange","Green"))

## Scales: Being deliberate with color

library(ggthemes)

cvdCols <- c("#000000", "#E69F00", "#56B4E9")

ggplot(penguins,aes(body_mass_g,bill_depth_mm,color=species))+
geom_point() +
labs(x="Body Mass (g)",y="Bill Depth (mm)",title="Penguin bill size by body mass") +
scale_color_manual(values=cvdCols)

## Scales: Being deliberate with color

library(ggthemes)

ggplot(penguins,aes(body_mass_g,bill_depth_mm,color=species))+
geom_point() +
labs(x="Body Mass (g)",y="Bill Depth (mm)",title="Penguin bill size by body mass") +
scale_color_colorblind()

## Scales: Being deliberate with size

What sort of information does size suggest?

## Scales: Being deliberate with size

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp))+
geom_point() 

## Scales: Being deliberate with size

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() 

## Scales: Being deliberate with axes

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() +
scale_x_log10() 

## Activity: Deliberate penguins

• For this activity, use the penguins dataset to genreate a graphic. Use at least 3 variables in your plot.

• Try using some of the scale options located in the ggplot2 cheatsheet.

• What sorts of scales might help you clarify or emphasize different variables?

https://eagereyes.org/blog/2013/banking-45-degrees

https://eagereyes.org/blog/2013/baselines

https://infolific.com/technology/internet/seo-lie-factor/

## Data-ink and themes

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() +
scale_x_log10() +
theme_classic()

## Themes and data-ink

While each graphical element can be modified individually, themes provide a way to modify the overall look of the “non-data ink”

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() +
scale_x_log10()

## Themes and data-ink

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() +
scale_x_log10()+
theme_bw()

## Themes and data-ink

ggplot(gmAsia2007,aes(x=gdpPercap,y=lifeExp,size=pop))+
geom_point() +
scale_x_log10()+
theme_classic()

## The big picture

• Visualization is foremost about making data more understandable

• Guidelines like maximizing data-ink and being deliberate about design help us make decisions that will facilitate this goal

• The grammar of graphics helps us to make these decisions in an explicit way by connecting elements

• The ggplot2 package provides a way for us to put that grammar to work inside of the data environment we’re creating in R

## Coursekeeping

• Visualization critiques begin next week

• Check the list on Canvas to see when your critique is due
• Coding assignment #1 is due on Thursday

## Next week

• Introducing data analysis

• Finding a statistic for assessing your data

• Visualizing stats