Environmental Data Analysis & Visualization

Does anybody really know what time it is?

Warm-up exercise

Create a new file system and Quarto document for this lecture. Download the walkability.csv dataset. Using pipes, create a dataset with the following variables:

• Core-Based Statistical Area (CBSA) name (excluding the state)

• Population (only above 5 million)

• National Walkability Index

Show the distribution of walkability index scores for all CBSAs in this new dataset

Warm-up exercise

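library(tidyverse)

#read in the walkability data
walkData<-read_csv("data/walkability.csv")

walkData %>%
select(CBSA_Name,CBSA_POP,NatWalkInd) %>%
#split "City, ST" into two columns so the state can be left out
separate(CBSA_Name,into=c("cities","states"),sep=",",extra="drop") %>%
filter(CBSA_POP>5000000) %>%
ggplot(aes(x=reorder(cities,NatWalkInd,median),y=NatWalkInd)) +
geom_boxplot() +
coord_flip() +
#labs() follows the aesthetics, so x is still the CBSA after the flip
labs(x="CBSA",y="Walkability Index")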

New York Times

Visualization critique

https://www.toptal.com/designers/data-visualization/data-visualization-best-practices

Visualization critique

https://experience.arcgis.com/experience/be9006c30a2148f595693066441fb8eb/page/Map/

Dataset of the day

Stock market data

Yahoo! Finance

When does “when” matter?

NASA Scientific Visualization Studio (https://svs.gsfc.nasa.gov/5137/)

Time: what is it?

R can store time as character values.

• "2023-10-19"

• "10/19/2023"

• "17:05:00"
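A minimal sketch of why character storage can mislead: R sorts character values as text, so dates written month-first do not come out in chronological order. The three dates here are made up for illustration.

```r
# Hypothetical dates stored as month/day/year character strings
charDates <- c("10/19/2023", "2/1/2023", "12/25/2022")

# sort() compares the strings as text, not as dates
sortedAsText <- sort(charDates)

# Converting to Date objects first gives chronological order
sortedAsDates <- sort(as.Date(charDates, format = "%m/%d/%Y"))

print(sortedAsText)
print(sortedAsDates)
```

The text sort puts October 2023 first because "1" comes before "2", while the Date sort puts December 2022 first.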

Time: what is it?

It can also store some kinds of time data as numbers:

• 2023

• 10

• 19
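When the year, month, and day are stored as separate numbers, one way to combine them is make_date() from the lubridate package (introduced later in this lecture). This is a small sketch; the variable names are placeholders.

```r
library(lubridate)

# Year, month, and day stored as separate numbers
yr <- 2023
mo <- 10
dy <- 19

# make_date() combines the numeric pieces into a single Date object
combined <- make_date(year = yr, month = mo, day = dy)

print(combined)
class(combined)
```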

Time: what is it?

Time as data is a strange beast.

Time by itself isn’t something that varies in a meaningful way.

Time has a defined order, but you can’t really count time, nor can you really treat it like a number.

Time: what is it?

R can also recognize time as particular objects like date and date-time objects. For example:

#today() and now() come from the lubridate package
library(lubridate)
#give today's date as a date object
today()
[1] "2023-10-20"
#give the precise time as a date-time object
now()
[1] "2023-10-20 12:51:43 EDT"

Handling time data

The lubridate package lets us convert other values, such as character strings, into date and date-time objects.

library(lubridate)
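As a rough illustration, lubridate's parsing functions are named for the order of the components in the string. The sketch below applies them to the character values shown earlier; the object names are placeholders.

```r
library(lubridate)

# Each parser is named for the order of the components in the string
d1 <- ymd("2023-10-19")    # year, month, day
d2 <- mdy("10/19/2023")    # month, day, year
t1 <- hms("17:05:00")      # hours, minutes, seconds (a Period, not a Date)

# Both date strings describe the same day
d1 == d2

# ymd_hms() handles a date and a time together (UTC unless tz is given)
dt <- ymd_hms("2023-10-19 17:05:00")
dt
```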

Handling time data

#storing the date as a character value
todaysDate<-"2023-10-19"

todaysDate
[1] "2023-10-19"
typeof(todaysDate)
[1] "character"

Handling time data

# converting the date to a date object
ymd(todaysDate)
[1] "2023-10-19"
typeof(ymd(todaysDate))
[1] "double"
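The "double" result can be surprising. A short sketch of what is going on: typeof() reports how the value is stored, while class() reports how R treats it. A Date is stored as the number of days since 1970-01-01.

```r
library(lubridate)

d <- ymd("2023-10-19")

typeof(d)   # how the value is stored internally
class(d)    # how R treats the value

# The stored number is the count of days since 1970-01-01
daysSinceEpoch <- as.numeric(d)

# Adding that count back to the origin recovers the same date
as.Date("1970-01-01") + daysSinceEpoch == d
```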

Visualizing time data

NASDAQ Composite data

nasdaq<-read_csv("data/nasdaq.csv")
nasdaq
# A tibble: 65 × 7
Date        Open   High    Low  Close Adj Close     Volume
<chr>      <dbl>  <dbl>  <dbl>  <dbl>       <dbl>      <dbl>
1 7/18/2023 14212. 14397. 14176. 14354.      14354. 4824070000
2 7/19/2023 14399. 14447. 14317. 14358.      14358. 5112420000
3 7/20/2023 14273. 14310. 14031. 14063.      14063. 5128020000
4 7/21/2023 14148. 14179. 14020. 14033.      14033. 5254180000
5 7/24/2023 14082. 14110. 13997. 14059.      14059. 4083070000
6 7/25/2023 14093. 14202. 14093. 14145.      14145. 3812470000
7 7/26/2023 14124. 14187. 14042. 14127.      14127. 4322000000
8 7/27/2023 14319. 14360. 14007. 14050.      14050. 5115840000
9 7/28/2023 14200. 14344. 14188. 14317.      14317. 4453520000
10 7/31/2023 14338. 14371. 14293. 14346.      14346. 4934440000
# ℹ 55 more rows
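The Date column is read in as character (<chr>). One possible way to convert it is with mdy() inside mutate(). This sketch uses a small stand-in tibble built from the first three rows of the printout above (Open values rounded as printed) so that it runs without the CSV file; nasdaqSample and nasdaqDates are placeholder names.

```r
library(tibble)
library(dplyr)
library(lubridate)

# First three rows of Date and Open, copied from the printout (rounded)
nasdaqSample <- tibble(
  Date = c("7/18/2023", "7/19/2023", "7/20/2023"),
  Open = c(14212, 14399, 14273)
)

# mdy() matches the month/day/year order used in the file
nasdaqDates <- nasdaqSample %>%
  mutate(Date = mdy(Date))

nasdaqDates
```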

Visualizing time data

ggplot(nasdaq,aes(x=Date,y=Open)) +
geom_line(color="darkblue") +
labs(x="Date",y="Points at Opening",title="NASDAQ Composite, July-October 2023") +
theme_bw()

Visualizing time data

ggplot(nasdaq,aes(x=mdy(Date),y=Open)) +
geom_line(color="darkblue") +
labs(x="Date",y="Points at Opening",title="NASDAQ Composite, July-October 2023") +
theme_bw()
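Once the x variable is a date object, ggplot2 places it on a date scale, and the tick spacing and label format can be adjusted with scale_x_date(). The sketch below is one possible setup, using a five-row stand-in built from the printout earlier (Open values rounded); the break spacing and label format are arbitrary choices.

```r
library(ggplot2)
library(lubridate)

# Stand-in data: first five rows of Date and Open from the printout (rounded)
nasdaqSample <- data.frame(
  Date = mdy(c("7/18/2023", "7/19/2023", "7/20/2023", "7/21/2023", "7/24/2023")),
  Open = c(14212, 14399, 14273, 14148, 14082)
)

p <- ggplot(nasdaqSample, aes(x = Date, y = Open)) +
  geom_line(color = "darkblue") +
  # date_breaks sets the tick spacing; date_labels sets the label format
  scale_x_date(date_breaks = "1 day", date_labels = "%b %d") +
  theme_bw()

p
```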

Activity: Visualizing time data

Load the openintro package to access the nycflights data

Create a pipe that

• subsets the data to Delta Airlines (DL) flights only

• combines the year, month, and day into a single date column

• converts the result to a table of counts using the count function

Plot this new data as a line graph using geom_line.

Activity: Visualizing time data

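library(tidyverse)
library(openintro)

nycflights2<-nycflights

nycflights2 |>
#keep Delta flights only
filter(carrier=="DL") |>
#paste year, month, and day into one character column
unite(col="Date",year,month,day,sep="-",na.rm = TRUE ) |>
#count the flights on each date
count(Date)|>
#convert the pasted date to a date object for plotting
ggplot(aes(x=ymd(Date),y=n)) +
geom_line(color="darkblue") 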
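An alternative sketch, not the solution above: make_date() from lubridate builds the date object directly from the numeric columns, which skips the paste-then-parse step. The small flightsSample tibble is a hypothetical stand-in for a few nycflights rows so that the example runs on its own.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for a few nycflights rows
flightsSample <- tibble::tibble(
  year = c(2013, 2013, 2013, 2013),
  month = c(1, 1, 1, 2),
  day = c(1, 1, 2, 14),
  carrier = c("DL", "DL", "DL", "AA")
)

dailyCounts <- flightsSample |>
  filter(carrier == "DL") |>
  # build a Date column straight from the numeric pieces
  mutate(Date = make_date(year, month, day)) |>
  count(Date)

dailyCounts
```

Because Date is already a date object here, it could be mapped to x in ggplot without calling ymd().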

Next time

• Data storytelling

• Introducing sf

• Making and plotting vector data