```
#Generate 10000 random values
<-rnorm(10000,0,1)
randomValues1:10] randomValues[
```

```
[1] 0.3650247 0.7832691 0.3042406 -0.3987446 -1.4976881 1.1045793
[7] 1.8300398 -1.7650171 0.4452102 1.7652971
```

The **terra** package is used to deal with spatial data, but specifically with rasters. A major advantage of this package is its ability to work with large datasets. Rasters can include a lot of data; for example, a single coverage of a 5 km^{2} area at a 1m resolution contains 25 million grid cells. The **terra** package makes handling data at these volumes more manageable.

A raster is closely related to another object called a matrix, which is a rectangular array of numbers. They are used in mathematics for mapping relationships between linear systems, and serve a number of functions in computer science.

For our purposes, we can think think of them as a geographic coordinate space, where the column and row numbers are equivalent to x position and y position, and every value in that space is a measurement of some variable (e.g., temperature, elevation, etc.)

To show how this works, we’re going to dip back into Base R for a moment. First, let’s use `rnorm`

to generate a set of 10,000 random values, normally distributed around a mean of 0 and with a standard deviation of 1.

```
#Generate 10000 random values
<-rnorm(10000,0,1)
randomValues1:10] randomValues[
```

```
[1] 0.3650247 0.7832691 0.3042406 -0.3987446 -1.4976881 1.1045793
[7] 1.8300398 -1.7650171 0.4452102 1.7652971
```

Here, we’re using square brackets to look at the first ten of these random values. Now let’s say we want to take this and turn it into a matrix with 100 rows and 100 columns. We can use our **randomValues** as an argument in the `matrix`

function, along with arguments for the number of rows (**nrow**) and columns (**ncol**):

```
#Generate 10000 random values in a 10x10 matrix
<-matrix(randomValues,nrow=100,ncol=100)
randomMatrix#Show first ten rows of first three columns
1:10,1:3] randomMatrix[
```

```
[,1] [,2] [,3]
[1,] 0.3650247 0.52278568 1.2027860
[2,] 0.7832691 -0.75081240 0.5711899
[3,] 0.3042406 -0.04451878 1.8237723
[4,] -0.3987446 -0.82064860 -0.9432617
[5,] -1.4976881 1.83387313 1.2915272
[6,] 1.1045793 -1.07696849 -0.1493880
[7,] 1.8300398 -0.22584977 1.4808539
[8,] -1.7650171 -0.56163478 -0.8248841
[9,] 0.4452102 1.29103069 0.8245471
[10,] 1.7652971 1.27003436 1.5546683
```

This gives us a sense of what the matrix looks like: the far left shows row numbers, while the top has column numbers, both in square brackets. The numbers in between are the random values we generated, but now organized in a 100 x 100 matrix. We can see how many rows and columns are in the matrix using the `dim`

function:

`dim(randomMatrix)`

`[1] 100 100`

OK, now that we have data, we want to turn it into a raster.

```
#Turn matrix into SpatRaster object
<-rast(randomMatrix)
randomRaster randomRaster
```

```
class : SpatRaster
dimensions : 100, 100, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 100, 0, 100 (xmin, xmax, ymin, ymax)
coord. ref. :
source(s) : memory
name : lyr.1
min value : -3.818598
max value : 3.684048
```

This gives us some information about the spatRaster object, including:

dimensions (100 rows, 100 columns, 1 layer)

resolution (the size of a single cell, here 1x1)

extent (like a bounding box, giving the maximum and minimum x and y values)

coord. ref (CRS, not yet defined here)

name of the variable (lyr.1 by default since we didn’t specify), and the minimum and maximum values.

One of these properties worth mentioning is resolution, which is the size of a grid cell. Grids with smaller cells are better resolved, but this comes at a computational cost of recording, storing, and manipulating much more data. Anyone dealing with raster data must therefore make choices about what resolution is necessary for their purposes, and anyone later using that data must account for the resolution of the data.

If we want to visualize these using **ggplot2**, we need an appropriate geom. The `geom_spatraster`

function comes from **tidyterra**, which is built to simplify interactionse between **terra** and **tidyverse**:

```
ggplot()+
geom_spatraster(data=randomRaster)
```

The syntax is very similar to what we saw with `geom_s`

f, but this visualization shows what raster data looks like: gridded cells where the x and y position of each cell is determined by its column and row number, and its color is based on the value assigned at that position. Of course, being randomly generated data, it doesn’t show a pattern.

Right now, our random data are distributed in an abstract 100x100 coordinate space. If we want to make our data useful for understanding the world, we need to use coordinates based on a reference system. Like `sf`

objects, `spatRaster`

objects also need a CRS to do this.

The `crs`

function lets us do this; however, it requires a character value rather than a number. We can still use the EPSG codes we learned about last week, but this just needs to be preceded by *epsg:* and put into quotation marks, like so:

```
crs(randomRaster) <- "epsg:4326"
randomRaster
```

```
class : SpatRaster
dimensions : 100, 100, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 100, 0, 100 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source(s) : memory
name : lyr.1
min value : -3.818598
max value : 3.684048
```

Now when we plot the data, the coordinates (position in the matrix) are given as degrees longitude and latitude:

```
ggplot()+
geom_spatraster(data=randomRaster)
```

Try it yourself!

Try and run through the example above again, but when you create the randomValues vector, use the sort function to sort the values from least to greatest. What do you think will happen?

One more thing to be aware of is *multirasters*: these are raster datasets with multiple layers. These can be useful for storing different kinds of data within the same spatRaster object, or storing raster data that’s been collected over time. To do this, we’ll generate three random rasters, and then combine them just as we would a vector:

```
#Generate 100 random values in a 10x10 matrix
<-matrix(runif(100,0,1),nrow=10,ncol=10)
randomValues1<-matrix(runif(100,0,1),nrow=10,ncol=10)
randomValues2<-matrix(runif(100,0,1),nrow=10,ncol=10)
randomValues3
<-rast(randomValues1)
raster1<-rast(randomValues2)
raster2<-rast(randomValues3)
raster3
<-c(raster1,raster2,raster3) multiRaster
```

We can see how many layers are in this dataset using the `nlyr`

function:

`nlyr(multiRaster)`

`[1] 3`

When we look at this data, though, each raster in the set has the same default name:

` multiRaster`

```
class : SpatRaster
dimensions : 10, 10, 3 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 10, 0, 10 (xmin, xmax, ymin, ymax)
coord. ref. :
source(s) : memory
names : lyr.1, lyr.1, lyr.1
min values : 0.002778357, 0.00704341, 0.001061728
max values : 0.993688135, 0.99614789, 0.997564374
```

We can access layer names using the names function, and assign a vector of layer names as character values:

```
names(multiRaster)<-c("raster1","raster2","raster3")
multiRaster
```

```
class : SpatRaster
dimensions : 10, 10, 3 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 10, 0, 10 (xmin, xmax, ymin, ymax)
coord. ref. :
source(s) : memory
names : raster1, raster2, raster3
min values : 0.002778357, 0.00704341, 0.001061728
max values : 0.993688135, 0.99614789, 0.997564374
```

Visualizing all the layers here requires us to use the `facet_wrap`

function, and the argument we use is `~lyr`

. This is accessing the **lyr** (layer) property of the **spatRaster** object.

```
ggplot() +
geom_spatraster(data = multiRaster) +
facet_wrap(~lyr)
```

If you want to extract a single raster from this set, you can use the `$`

operator:

```
<-multiRaster$raster1
firstRaster
ggplot() +
geom_spatraster(data = firstRaster)
```

Or you can you use square brackets:

```
<-multiRaster['raster2']
secondRaster
ggplot() +
geom_spatraster(data = secondRaster)
```

While building rasters from scratch helps us understand the object’s properties, most uses in data science will involve loading in raster data from a file. Raster data can be read into R from a number of different file formats (check here for a full list), but generally speaking most common image formats can be read as rasters (e.g., .jpg, .bmp, .img). One of the most common formats used to store raster data is a GeoTIFF (.tif) file.

```
<-rast("data/turkanaDEM.tif")
turkanaDEM turkanaDEM
```

```
class : SpatRaster
dimensions : 720, 720, 1 (nrow, ncol, nlyr)
resolution : 0.004166667, 0.004166667 (x, y)
extent : 35, 38, 2, 5 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : turkanaDEM.tif
name : turkanaDEM
```

Like the data we created, we can see the properties of this dataset like its extent and resolution. Plotting is also the same as above using `geom_spatraster`

:

```
ggplot() +
geom_spatraster(data=turkanaDEM)
```

Here, we can see the elevation values mapped out, with a few peaks extending over 2000 meters a.s.l.

Multilayer rasters can also be stored as GeoTIFFs, and satellite images are often stored this way and include multiple layers for different color bands. Another common standard used in the earth sciences for multilayer rasters is a NetCDF (.nc) file. Here, we’ll load in a NetCDF of monthly rainfall values for the Lake Turkana area.

```
<-rast("data/turkanaRain.nc")
turkanaRain turkanaRain
```

```
class : SpatRaster
dimensions : 72, 72, 756 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : 35, 38, 2, 5 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : turkanaRain.nc
names : turka~ain_1, turka~ain_2, turka~ain_3, turka~ain_4, turka~ain_5, turka~ain_6, ...
```

And we can check the number of layers:

`nlyr(turkanaRain)`

`[1] 756`

Multilayer rasters can be subset using the `select`

function (thanks to **tidyterra**!):

```
<-select(turkanaRain,c('turkanaRain_1':'turkanaRain_4'))
turkanaRain4 turkanaRain4
```

```
class : SpatRaster
dimensions : 72, 72, 4 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : 35, 38, 2, 5 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source : turkanaRain.nc
names : turkanaRain_1, turkanaRain_2, turkanaRain_3, turkanaRain_4
```

Raster data operates similarly to the 2D bin objects we’ve seen before, so all of the `scale_fill_*`

functions apply here. For example, scale_fill_gradient will plot between low and high values:

```
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_gradient(low="blue",high="red")
```

Or we can use `scale_fill_gradient2`

to use divergent colors, here using a midpoint of 1000m:

```
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_gradient2(low="red",mid="yellow",high="darkgreen",midpoint=1000)
```

By the way, if we want to generate summary statistic about the cell values in our raster, we can access these using the **values** function:

```
<-median(values(turkanaDEM))
medElev
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_gradient2(low="red",mid="yellow",high="darkgreen",midpoint=medElev)
```

```
<-median(values(turkanaDEM))
medElev
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_viridis_c(option="magma")
```

With elevation data, though, we usually want to apply colors to particular elevation levels. Remembering back to Week 7, we can do this with our friend `rescale`

from the **scales** package:

`library(scales)`

```
Attaching package: 'scales'
```

```
The following object is masked from 'package:terra':
rescale
```

```
The following object is masked from 'package:purrr':
discard
```

```
The following object is masked from 'package:readr':
col_factor
```

```
<-min(values(turkanaDEM))
mn<-max(values(turkanaDEM))
mx<-rescale(c(mn,500,1000,1500,2000,mx),to=c(0,1)) rescaledElev
```

Here what we’ve done is first create variables storing the minimum and maximum values from the raster. Then we used `rescale`

to create a vector of values between that minimum and maximum, but scaled between 0 and 1. Now we can use these to control the plot with `scale_fill_gradientn`

:

```
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_gradientn(colors=c("darkgreen","yellow","brown","tan","white"),values=rescaledElev)
```

However, we also know that there’s a lake in there somewhere, so we might want to indicate those colors with blue. The surface elevation of Lake Turkana is 360 meters, so we can set a hard boundary by adding two values after the minimum: 360 and 361

`<-rescale(c(mn,360,361,500,1000,1500,2000,mx),to=c(0,1)) rescaledElev`

Now when we run the ggplot, we can add two new colors: dark blue and blue, so our values will scale up to blue shades only through to 361

```
ggplot() +
geom_spatraster(data=turkanaDEM) +
scale_fill_gradientn(colors=c("darkblue","blue","darkgreen","yellow","brown","tan","white"),values=rescaledElev) +
theme_minimal() +
labs(fill="Elevation (masl)")
```

The same principles work for multilayer rasters. We can apply these individually:

```
ggplot() +
geom_spatraster(data=turkanaRain$turkanaRain_4) +
scale_fill_distiller(palette = "RdBu",direction=1)
```

Or we can plot values across multiple rasters using `facet_wrap`

:

```
ggplot() +
geom_spatraster(data=turkanaRain4) +
facet_wrap(~lyr) +
scale_fill_distiller(palette = "RdBu",direction=1,na.value="black")
```

Note that bit at the end where it reads `na.values="black"`

. This can be used in any scale_color or scale_fill function to account data points that have NA values. It’s important to make sure that your NA values are colored using a value that does not appear in your scale. However, here we have no NA values, so there are no black cells.

Raster algebra is the task of modifying values . You could think of this as akin to the mutate function. For example, let’s say we wanted to convert our elevation data, currently in meters above sea level, to feet. The conversion from meters to feet is:

\[ m \times 3.28 \]

To apply this across our raster, we simply multiply it by 3.28.

```
#convert m to feet
<-turkanaDEM*3.28 turkanaDEM_ft
```

We can also use raster algebra on more than one raster. For example, let’s say we wanted to sum the rainfall for the first four months in the rainfall data. We can use our turkanaRain4 data from earlier and sum the values in the four layers:

```
#convert m to feet
<-sum(turkanaRain4)
turkanaRain_JanApr turkanaRain_JanApr
```

```
class : SpatRaster
dimensions : 72, 72, 1 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : 35, 38, 2, 5 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
source(s) : memory
name : sum
min value : 97.1
max value : 414.9
```

As you can see, this creates a new raster based on the sum of the four layers in the dataset.

Try it yourself!

Try seeing if you can take the mean of the first twelve layers of the rainfall data and convert it into inches.