2  Transforming data

Beyond subsetting, other dplyr functions restructure the data by changing the arrangement of rows and columns.

2.0.1 Mutate

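The mutate function creates new columns by calculating values from existing columns. For example, we can add a column of DBH (diameter at breast height) values in inches by dividing the centimeter values by 2.54. By default the new column is added at the end of the table, so the printed preview reports one more column than before but does not show DBHin among the first few.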

treeData2<-mutate(treeData,DBHin=`DBH (cm)` / 2.54)
treeData2
# A tibble: 14,487 × 42
   DbaseID Region City   Source TreeID Zone  `Park/Street` SpCode ScientificName
     <dbl> <chr>  <chr>  <chr>   <dbl> <chr> <chr>         <chr>  <chr>         
 1       1 InlVal Modes… Motow…      1 Nurs… Nursery       ACSA1  Acer sacchari…
 2       2 InlVal Modes… Motow…      2 Nurs… Nursery       BEPE   Betula pendula
 3       3 InlVal Modes… Motow…      3 Nurs… Nursery       CESI4  Celtis sinens…
 4       4 InlVal Modes… Motow…      4 Nurs… Nursery       CICA   Cinnamomum ca…
 5       5 InlVal Modes… Motow…      5 Nurs… Nursery       FRAN_R Fraxinus angu…
 6       6 InlVal Modes… Motow…      6 Nurs… Nursery       FREX_H Fraxinus exce…
 7       7 InlVal Modes… Motow…      7 Nurs… Nursery       FRHO   Fraxinus holo…
 8       8 InlVal Modes… Motow…      8 Nurs… Nursery       FRPE_M Fraxinus penn…
 9       9 InlVal Modes… Motow…      9 Nurs… Nursery       FRVE_G Fraxinus velu…
10      10 InlVal Modes… Motow…     10 Nurs… Nursery       GIBI   Ginkgo biloba 
# ℹ 14,477 more rows
# ℹ 33 more variables: CommonName <chr>, TreeType <chr>, address <chr>,
#   street <chr>, side <chr>, cell <dbl>, OnStreet <chr>, FromStreet <chr>,
#   ToStreet <chr>, Age <dbl>, `DBH (cm)` <dbl>, `TreeHt (m)` <dbl>,
#   CrnBase <dbl>, `CrnHt (m)` <dbl>, `CdiaPar (m)` <dbl>,
#   `CDiaPerp (m)` <dbl>, `AvgCdia (m)` <dbl>, `Leaf (m2)` <dbl>,
#   Setback <dbl>, TreeOr <dbl>, CarShade <dbl>, LandUse <dbl>, Shape <dbl>, …
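Several columns can be created in a single mutate call, and a later expression can use a column created earlier in the same call. The sketch below illustrates this on a small made-up table; toyTrees and its values are hypothetical, and only the column names mimic the tree dataset.

```r
library(dplyr)

# Hypothetical toy data: the values are made up for illustration
toyTrees <- tibble(
  TreeID = 1:3,
  `DBH (cm)` = c(2.54, 25.4, 50.8),
  `TreeHt (m)` = c(1, 10, 20)
)

toyTrees2 <- mutate(
  toyTrees,
  DBHin = `DBH (cm)` / 2.54,          # centimeters to inches
  TreeHtFt = `TreeHt (m)` * 3.28084,  # meters to feet
  DBHft = DBHin / 12                  # uses the DBHin column created just above
)
toyTrees2
```

Column names that contain spaces or parentheses, such as `DBH (cm)`, must be wrapped in backticks, as in the original example.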

2.0.2 Bind

Sometimes you may want to combine two datasets into a single table. Let’s say we had two tables of different maple trees:

sugarMaples<-filter(treeData,CommonName=="Sugar maple")
redMaples<-filter(treeData,CommonName=="Red Maple")

We can put these two together using the bind_rows function:

mapleTrees<-bind_rows(sugarMaples,redMaples)
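After binding, it is often useful to know which table each row came from. A minimal sketch, assuming two small made-up tables (toySugar and toyRed are hypothetical stand-ins for the filtered maple tables): naming the inputs and supplying the .id argument adds a column that records the source of each row.

```r
library(dplyr)

# Hypothetical toy tables standing in for sugarMaples and redMaples
toySugar <- tibble(TreeID = c(1, 2), CommonName = "Sugar maple")
toyRed   <- tibble(TreeID = c(3, 4, 5), CommonName = "Red maple")

# Naming the inputs and setting .id adds a column recording each row's source table
toyMaples <- bind_rows(sugar = toySugar, red = toyRed, .id = "sourceTable")

nrow(toyMaples)                # the row counts of the inputs add up
count(toyMaples, sourceTable)  # number of rows that came from each table
```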

The bind_cols function works similarly, but instead binds new columns to an existing table. For example, let’s say we had our maple tree data in two pieces: information about streets and information about species names:

mapleNames<-select(mapleTrees,ends_with("Name",ignore.case = TRUE))
mapleStreets<-select(mapleTrees,DbaseID,contains("Street",ignore.case = FALSE))

We can recombine these using the bind_cols function. It matches rows by position, so both tables must have the same number of rows in the same order:

mapleData<-bind_cols(mapleStreets,mapleNames)
mapleData
# A tibble: 246 × 7
   DbaseID `Park/Street` OnStreet FromStreet ToStreet ScientificName CommonName 
     <dbl> <chr>         <chr>    <chr>      <chr>    <chr>          <chr>      
 1    3936 Street        -1       -1         -1       Acer saccharum Sugar maple
 2    4014 Street        -1       -1         -1       Acer saccharum Sugar maple
 3    4101 Street        -1       -1         -1       Acer saccharum Sugar maple
 4    4150 Street        -1       -1         -1       Acer saccharum Sugar maple
 5    4206 Street        -1       -1         -1       Acer saccharum Sugar maple
 6    4268 Street        -1       -1         -1       Acer saccharum Sugar maple
 7    4295 Street        -1       -1         -1       Acer saccharum Sugar maple
 8    4348 Street        -1       -1         -1       Acer saccharum Sugar maple
 9    4353 Street        -1       -1         -1       Acer saccharum Sugar maple
10    4371 Street        -1       -1         -1       Acer saccharum Sugar maple
# ℹ 236 more rows

Try it yourself!

A key difference between the tidyverse bind_rows and its Base R equivalent (rbind) is that rbind only works on tables whose columns match (the same number of columns, with the same names). bind_rows instead matches columns by name and fills any column missing from one of the tables with NA values. Note that bind_cols is stricter: the tables must have the same number of rows. Try bind_rows by combining the tree dataset with each of these datasets from the modeldata package:

  • crickets

  • penguins

  • Sacramento
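Before trying the real datasets, the difference between rbind and bind_rows can be sketched on two small made-up tables. Here tableA and tableB are hypothetical, and tryCatch is used only so the script keeps running after an error.

```r
library(dplyr)

# Hypothetical toy tables with only partly overlapping columns
tableA <- tibble(id = 1:2, height = c(3.1, 4.2))
tableB <- tibble(id = 3:4, species = c("Acer rubrum", "Acer saccharum"))

# Base R: rbind stops with an error because the column names do not match
baseResult <- tryCatch(
  rbind(as.data.frame(tableA), as.data.frame(tableB)),
  error = function(e) "rbind failed"
)
baseResult

# dplyr: bind_rows matches columns by name and fills the gaps with NA
combined <- bind_rows(tableA, tableB)
combined

# bind_cols has no such fallback: the inputs must have the same number of rows
colResult <- tryCatch(
  bind_cols(tableA, tibble(extra = 1:3)),
  error = function(e) "bind_cols failed"
)
colResult
```

In the combined table, height is NA for the rows that came from tableB and species is NA for the rows that came from tableA.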