Sometimes you want to change the order of columns in a dataframe. dplyr’s relocate() function makes it easy to move one or more columns to new positions, using the same syntax as the select() function. In this post we will see 7 tips to change column order or column position using dplyr’s relocate().
Let us get started by loading the packages needed.
library(tidyverse)
# check the dplyr version, as relocate() is a recent function
packageVersion("dplyr")
The basic usage of dplyr’s relocate() function is shown below. We provide the data and the variable(s) we want to relocate. The variable(s) can be a single column name, or multiple columns selected with select helper functions like starts_with() and where(). The remaining two arguments, .before and .after, are optional and specify the column that the relocated column(s) should be placed before or after. If neither is given, the columns are moved to the front.
# relocate usage
relocate(.data, ..., .before = NULL, .after = NULL)
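To make the three forms of the call concrete, here is a minimal sketch on a tiny made-up tibble. The object toy and its columns x, y and z are hypothetical and exist only for this illustration; we look at names() to see just the resulting column order.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- tibble(x = 1:2, y = c("a", "b"), z = c(TRUE, FALSE))

# Neither .before nor .after: z moves to the front
names(relocate(toy, z))
## [1] "z" "x" "y"

# .after: x is placed right after z
names(relocate(toy, x, .after = z))
## [1] "y" "z" "x"

# .before: z is placed right before y
names(relocate(toy, z, .before = y))
## [1] "x" "z" "y"
```

In every case only the column order changes; the rows and the values stay the same.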
We will use a subset of the starwars data, which ships with dplyr and is available once the tidyverse packages are loaded.
df <- starwars %>% select(1:5)
df %>% head()
## # A tibble: 6 × 5
##   name           height  mass hair_color  skin_color
##   <chr>           <int> <dbl> <chr>       <chr>
## 1 Luke Skywalker    172    77 blond       fair
## 2 C-3PO             167    75 <NA>        gold
## 3 R2-D2              96    32 <NA>        white, blue
## 4 Darth Vader       202   136 none        white
## 5 Leia Organa       150    49 brown       light
## 6 Owen Lars         178   120 brown, grey light
Move a column to the first position
To move a column to the front of the dataframe, we call relocate() with the name of the column that we want to move.
In the example below, we move the hair_color column to be the first column in the dataframe using relocate().
df %>% relocate(hair_color)
## # A tibble: 87 × 5
##    hair_color    name               height  mass skin_color
##    <chr>         <chr>               <int> <dbl> <chr>
##  1 blond         Luke Skywalker        172    77 fair
##  2 <NA>          C-3PO                 167    75 gold
##  3 <NA>          R2-D2                  96    32 white, blue
##  4 none          Darth Vader           202   136 white
##  5 brown         Leia Organa           150    49 light
##  6 brown, grey   Owen Lars             178   120 light
##  7 brown         Beru Whitesun lars    165    75 light
##  8 <NA>          R5-D4                  97    32 white, red
##  9 black         Biggs Darklighter     183    84 light
## 10 auburn, white Obi-Wan Kenobi        182    77 fair
## # … with 77 more rows
Before relocate() was available, we would use the select() function in combination with the everything() function. For example, here we move the hair_color column to the front by specifying it as the first argument to select() and selecting all the remaining columns with everything() as the second argument.
df %>% select(hair_color, everything())
## # A tibble: 87 × 5
##    hair_color    name               height  mass skin_color
##    <chr>         <chr>               <int> <dbl> <chr>
##  1 blond         Luke Skywalker        172    77 fair
##  2 <NA>          C-3PO                 167    75 gold
##  3 <NA>          R2-D2                  96    32 white, blue
##  4 none          Darth Vader           202   136 white
##  5 brown         Leia Organa           150    49 light
##  6 brown, grey   Owen Lars             178   120 light
##  7 brown         Beru Whitesun lars    165    75 light
##  8 <NA>          R5-D4                  97    32 white, red
##  9 black         Biggs Darklighter     183    84 light
## 10 auburn, white Obi-Wan Kenobi        182    77 fair
## # … with 77 more rows
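If we want to convince ourselves that the two approaches agree, a quick check like the following can help. This is only a sketch: the object names with_relocate and with_select are made up for this example, and df is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

df <- starwars %>% select(1:5)

# Same move, expressed in two ways
with_relocate <- df %>% relocate(hair_color)
with_select   <- df %>% select(hair_color, everything())

# The column order is the same in both results
identical(names(with_relocate), names(with_select))
## [1] TRUE

# Compare the full contents as well
all.equal(with_relocate, with_select)
```

The relocate() version states the intent (move this column) more directly, while the select() version relies on everything() to fill in the remaining columns.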
Move column A to after column B
To move a column A to the position right after another column B, we use the .after argument as follows. Here, we move the hair_color variable to the position right after the “height” column.
df %>% relocate(hair_color, .after = height)
## # A tibble: 87 × 5
##    name               height hair_color     mass skin_color
##    <chr>               <int> <chr>         <dbl> <chr>
##  1 Luke Skywalker        172 blond            77 fair
##  2 C-3PO                 167 <NA>             75 gold
##  3 R2-D2                  96 <NA>             32 white, blue
##  4 Darth Vader           202 none            136 white
##  5 Leia Organa           150 brown            49 light
##  6 Owen Lars             178 brown, grey     120 light
##  7 Beru Whitesun lars    165 brown            75 light
##  8 R5-D4                  97 <NA>             32 white, red
##  9 Biggs Darklighter     183 black            84 light
## 10 Obi-Wan Kenobi        182 auburn, white    77 fair
## # … with 77 more rows
Move column A to before column B
Similarly, using the .before argument we can move a column A to the position right before another column B. In the example below, we move the hair_color variable to the position right before the “height” column.
df %>% relocate(hair_color, .before = height)
## # A tibble: 87 × 5
##    name               hair_color    height  mass skin_color
##    <chr>              <chr>          <int> <dbl> <chr>
##  1 Luke Skywalker     blond            172    77 fair
##  2 C-3PO              <NA>             167    75 gold
##  3 R2-D2              <NA>              96    32 white, blue
##  4 Darth Vader        none             202   136 white
##  5 Leia Organa        brown            150    49 light
##  6 Owen Lars          brown, grey      178   120 light
##  7 Beru Whitesun lars brown            165    75 light
##  8 R5-D4              <NA>              97    32 white, red
##  9 Biggs Darklighter  black            183    84 light
## 10 Obi-Wan Kenobi     auburn, white    182    77 fair
## # … with 77 more rows
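One thing to keep in mind is that .before and .after are alternatives: relocate() signals an error if we supply both in the same call. The sketch below captures that error with tryCatch() instead of stopping the script. The variable msg is just a name chosen for this example, and the exact wording of the message may differ between dplyr versions, so we do not show it here.

```r
library(dplyr)

df <- starwars %>% select(1:5)

# Supplying both .before and .after is ambiguous, so relocate() errors.
# tryCatch() turns the error into its message text.
msg <- tryCatch(
  {
    df %>% relocate(hair_color, .before = height, .after = mass)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Print whatever message dplyr produced
msg
```

If we need a column in a specific slot, we pick whichever of the two arguments describes the target position most naturally.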
Move a column to the last position
We can move a column to the last position in the dataframe by using the .after argument and specifying the current last column with the last_col() function.
df %>% relocate(mass, .after = last_col())
## # A tibble: 87 × 5
##    name               height hair_color    skin_color   mass
##    <chr>               <int> <chr>         <chr>       <dbl>
##  1 Luke Skywalker        172 blond         fair           77
##  2 C-3PO                 167 <NA>          gold           75
##  3 R2-D2                  96 <NA>          white, blue    32
##  4 Darth Vader           202 none          white         136
##  5 Leia Organa           150 brown         light          49
##  6 Owen Lars             178 brown, grey   light         120
##  7 Beru Whitesun lars    165 brown         light          75
##  8 R5-D4                  97 <NA>          white, red     32
##  9 Biggs Darklighter     183 black         light          84
## 10 Obi-Wan Kenobi        182 auburn, white fair           77
## # … with 77 more rows
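The advantage of last_col() is that we do not need to know the name of the last column. As a small sketch, for our five-column df the call below that names skin_color explicitly gives the same column order; we compare only names() to keep the output short.

```r
library(dplyr)

df <- starwars %>% select(1:5)

# last_col() picks whichever column is currently last
names(df %>% relocate(mass, .after = last_col()))
## [1] "name"       "height"     "hair_color" "skin_color" "mass"

# Naming the last column explicitly gives the same order for this df
names(df %>% relocate(mass, .after = skin_color))
## [1] "name"       "height"     "hair_color" "skin_color" "mass"
```

The last_col() version keeps working if the dataframe later gains or loses columns, while the explicit version depends on skin_color still being the last column.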
Move all numerical columns to the front
So far, we have only moved a single column of interest to a specific location. With relocate() we can also change the positions of multiple columns at once. The basic idea is to use one of the select helper functions to specify the columns to move.
For example, to move all the numerical columns to the front, we select all numerical columns using where(is.numeric) and provide them as the argument to the relocate() function.
df %>% relocate(where(is.numeric))
## # A tibble: 87 × 5
##    height  mass name               hair_color    skin_color
##     <int> <dbl> <chr>              <chr>         <chr>
##  1    172    77 Luke Skywalker     blond         fair
##  2    167    75 C-3PO              <NA>          gold
##  3     96    32 R2-D2              <NA>          white, blue
##  4    202   136 Darth Vader        none          white
##  5    150    49 Leia Organa        brown         light
##  6    178   120 Owen Lars          brown, grey   light
##  7    165    75 Beru Whitesun lars brown         light
##  8     97    32 R5-D4              <NA>          white, red
##  9    183    84 Biggs Darklighter  black         light
## 10    182    77 Obi-Wan Kenobi     auburn, white fair
## # … with 77 more rows
Move all numerical columns to the back
Similarly, we can move all the numerical columns to be the last columns in the dataframe by using the last_col() function for the .after argument.
df %>% relocate(where(is.numeric), .after = last_col())
## # A tibble: 87 × 5
##    name               hair_color    skin_color  height  mass
##    <chr>              <chr>         <chr>        <int> <dbl>
##  1 Luke Skywalker     blond         fair           172    77
##  2 C-3PO              <NA>          gold           167    75
##  3 R2-D2              <NA>          white, blue     96    32
##  4 Darth Vader        none          white          202   136
##  5 Leia Organa        brown         light          150    49
##  6 Owen Lars          brown, grey   light          178   120
##  7 Beru Whitesun lars brown         light          165    75
##  8 R5-D4              <NA>          white, red      97    32
##  9 Biggs Darklighter  black         light          183    84
## 10 Obi-Wan Kenobi     auburn, white fair           182    77
## # … with 77 more rows
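The same pattern should work with any predicate that where() accepts, and we can also list several columns by name. The sketch below is our own variation on the examples above, using where(is.character) and a pair of named columns; again we print only names() to show the resulting order.

```r
library(dplyr)

df <- starwars %>% select(1:5)

# Move all character columns to the back; the numeric columns end up in front
names(df %>% relocate(where(is.character), .after = last_col()))
## [1] "height"     "mass"       "name"       "hair_color" "skin_color"

# Move two columns, listed by name, to just before height
names(df %>% relocate(hair_color, skin_color, .before = height))
## [1] "name"       "hair_color" "skin_color" "height"     "mass"
```

Note that the moved columns keep the order in which they were selected.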
In the same way, we can use other select helper functions like starts_with() and ends_with() to relocate specific columns.
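As a minimal sketch of that idea, here are two calls on our df that select columns by name pattern. The patterns "color" and "h" are simply chosen to match the column names in this subset of starwars.

```r
library(dplyr)

df <- starwars %>% select(1:5)

# Columns whose names end with "color" move to the front
names(df %>% relocate(ends_with("color")))
## [1] "hair_color" "skin_color" "name"       "height"     "mass"

# Columns whose names start with "h" move to the back
names(df %>% relocate(starts_with("h"), .after = last_col()))
## [1] "name"       "mass"       "skin_color" "height"     "hair_color"
```

Because these helpers match on names, the same call keeps working when new columns that fit the pattern are added to the dataframe.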