How to Recode a Column with dplyr in R?

Sometimes, when working with a dataframe, you may want the values of a column in a different form. In other words, you may want to change, or recode, the values of that column.

R offers many ways to recode a column. Here we will see a simple example of recoding a column with two values using dplyr, one of the packages in the tidyverse.

dplyr has a function, recode(), that lets you change a column's values.

Let us first load the dplyr library.

library(dplyr)

Let us make a simple data frame to use with the recode function.

name <- c("John", "Clara", "Smith")
sex <- c(1,2,1)
age <- c(30,32,54)

We will create a new dataframe using the above variables as columns.

# create a new dataframe from scratch
df <- data.frame(name,sex,age)
df

name   sex   age
<fctr> <dbl> <dbl>
John   1     30
Clara  2     32
Smith  1     54

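Note that in the dataframe above, the column sex has the values 1 and 2. We will use the dplyr functions mutate and recode to change the values 1 and 2 to “Male” and “Female”. Because the values being replaced are numbers, they are wrapped in backticks when used as argument names.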

df %>% mutate(sex=recode(sex, 
                         `1`="Male",
                         `2`="Female"))

name   sex    age
<fctr> <chr>  <dbl>
John   Male   30
Clara  Female 32
Smith  Male   54
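
The pipeline above only prints the recoded dataframe; df itself is not modified. Here is a minimal sketch of saving the result and handling values that have no replacement. The extra row ("Alex", code 3) and the name df_recoded are made up for illustration; .default is the recode() argument that supplies a value for anything not matched.

```r
library(dplyr)

# same toy data as above, plus a hypothetical row with an unexpected code (3)
df <- data.frame(name = c("John", "Clara", "Smith", "Alex"),
                 sex  = c(1, 2, 1, 3),
                 age  = c(30, 32, 54, 41))

# assign the result to keep it; mutate() returns a new dataframe
df_recoded <- df %>%
  mutate(sex = recode(sex,
                      `1` = "Male",
                      `2` = "Female",
                      .default = "Unknown"))  # used for any value not listed

df_recoded$sex
# "Male" "Female" "Male" "Unknown"
```

Without .default, a value with no replacement here would become NA, since the replacements (character) are a different type from the original values (numeric).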

recode() also works on factor variables. It preserves the existing order of levels while changing their values. dplyr also has the function recode_factor(), which reorders the levels to match the order of the replacements. If you are interested in more complex factor level operations, the forcats package is the best bet.
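
The difference between the two functions is easiest to see on a small factor. The sketch below uses a made-up factor with levels "f" and "m" (not the dataframe from earlier) and compares the level order each function returns.

```r
library(dplyr)

# hypothetical factor; levels are sorted alphabetically: "f", "m"
sex_f <- factor(c("m", "f", "m"))

# recode() relabels the levels but keeps their existing order
kept <- recode(sex_f, m = "Male", f = "Female")
levels(kept)
# "Female" "Male"

# recode_factor() orders the levels by the order of the replacements
reordered <- recode_factor(sex_f, m = "Male", f = "Female")
levels(reordered)
# "Male" "Female"
```

The values themselves are the same in both results; only the level order differs, which matters for things like plot ordering and model reference levels.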