Beyond subsetting, other functions from dplyr work to restructure the data by changing the arrangement of rows and columns, or by modifying their contents.
Mutate
Mutate is a useful function that creates new columns by calculating values from existing columns. For example, we can add a column of DBH values in inches this way:
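As a minimal sketch of such a call, assume a small toy table named treeData with a diameter column called `DBH (cm)`. Both the table contents and the exact column name are assumptions made for illustration. Because one inch equals 2.54 cm, centimeters are divided by 2.54 to get inches.

```r
library(dplyr)

# Hypothetical toy table; the column name `DBH (cm)` is assumed from the text
treeData <- tibble(
  DbaseID = c(1, 2, 3),
  `DBH (cm)` = c(25.4, 50.8, 10.16)
)

# 1 inch = 2.54 cm, so divide centimeters by 2.54 to get inches
treeData <- treeData %>%
  mutate(DBHin = `DBH (cm)` / 2.54)

treeData
```

Column names that contain spaces or parentheses need backticks inside mutate, which is why `DBH (cm)` is quoted that way here.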
Here we’ve created a new variable called DBHin by dividing all the values in DBH (cm) by 2.54, since one inch equals 2.54 cm. We can also create new values by combining values across multiple columns. For example, if we wanted to get crown depth, which is the difference between the total tree height and the crown height, we can subtract one from the other using a mutate function:
The important thing to remember about these kinds of operations is that they happen row-wise: each value in the new column is calculated from the values in the same row of the columns used. For example, the first tree height value is 2 meters, while the first crown height value is 0.5 meters. When the latter is subtracted from the former, we get a value of 1.5 meters, and this becomes the first value in our new column.
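The row-wise behavior can be sketched with a small toy table. The column names TreeHt and CrownHt are assumptions made for illustration, and only the first row uses the values quoted above; the other rows are made up.

```r
library(dplyr)

# Hypothetical toy table; column names TreeHt and CrownHt are assumed
heights <- tibble(
  TreeHt  = c(2, 5.5, 12),  # total tree height (m); first value from the text
  CrownHt = c(0.5, 2, 4)    # crown height (m); first value from the text
)

# Subtraction is applied row by row: row 1 uses 2 and 0.5, row 2 uses 5.5 and 2, ...
heights <- heights %>%
  mutate(CrownDepth = TreeHt - CrownHt)

heights$CrownDepth
```

The first element of CrownDepth is 2 - 0.5 = 1.5, matching the example in the text.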
Bind
Sometimes you may want to combine two datasets into a single table. Let’s say we had two tables of different maple trees:
We can put these two together using the bind_rows function:
mapleTrees <- bind_rows(sugarMaples, redMaples)
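To make the call concrete, here is a small self-contained sketch. The two tables are toy stand-ins for sugarMaples and redMaples: the column selection and the red maple ID values are hypothetical and are not taken from the real dataset.

```r
library(dplyr)

# Toy stand-ins for the two maple tables (red maple IDs are hypothetical)
sugarMaples <- tibble(
  DbaseID = c(3936, 4014),
  CommonName = "Sugar maple"
)
redMaples <- tibble(
  DbaseID = c(101, 102, 103),
  CommonName = "Red maple"
)

# Stack the rows of the second table underneath the first
mapleTrees <- bind_rows(sugarMaples, redMaples)

nrow(mapleTrees)
```

The combined table has as many rows as the two inputs together, and columns are matched by name.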
The bind_cols function works similarly, but instead binds new columns to an existing table. For example, let’s say we had our maple tree data in two pieces: information about streets and information about species names:
# A tibble: 246 × 7
   DbaseID `Park/Street` OnStreet FromStreet ToStreet ScientificName CommonName
     <dbl> <chr>         <chr>    <chr>      <chr>    <chr>          <chr>
 1    3936 Street        -1       -1         -1       Acer saccharum Sugar maple
 2    4014 Street        -1       -1         -1       Acer saccharum Sugar maple
 3    4101 Street        -1       -1         -1       Acer saccharum Sugar maple
 4    4150 Street        -1       -1         -1       Acer saccharum Sugar maple
 5    4206 Street        -1       -1         -1       Acer saccharum Sugar maple
 6    4268 Street        -1       -1         -1       Acer saccharum Sugar maple
 7    4295 Street        -1       -1         -1       Acer saccharum Sugar maple
 8    4348 Street        -1       -1         -1       Acer saccharum Sugar maple
 9    4353 Street        -1       -1         -1       Acer saccharum Sugar maple
10    4371 Street        -1       -1         -1       Acer saccharum Sugar maple
# ℹ 236 more rows
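A combined table like the one above could be produced along these lines. This is only a sketch: the table names mapleStreets and mapleSpecies are hypothetical, and each toy table holds just the first three rows and a subset of the columns shown in the output.

```r
library(dplyr)

# Hypothetical piece 1: street information
mapleStreets <- tibble(
  DbaseID = c(3936, 4014, 4101),
  `Park/Street` = "Street",
  OnStreet = "-1"
)

# Hypothetical piece 2: species names, in the same row order as piece 1
mapleSpecies <- tibble(
  ScientificName = rep("Acer saccharum", 3),
  CommonName = rep("Sugar maple", 3)
)

# Attach the species columns to the right of the street columns
maples <- bind_cols(mapleStreets, mapleSpecies)

names(maples)
```

Note that bind_cols matches rows by position, not by an ID column, so both pieces need the same number of rows in the same order.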
Try it yourself!
A distinguishing factor between the tidyverse bind_rows and its Base R equivalent (rbind) is that rbind only works with tables that have the same set of columns. bind_rows instead matches columns by name and fills any column that is missing from one of the tables with NA values. Try it by combining the tree dataset with these datasets from the modeldata package: