Python and R Tips

Learn Data Science with Python and R

9 Ways To Create New Variables with tidyverse

February 17, 2019 by cmdlinetips

Add New Variables With tidyverse
When one wants to create a new variable in R using tidyverse, dplyr’s mutate verb is probably the first that comes to mind: it lets you create a new column on the fly. For many people it is the go-to command whenever a new variable is needed.

However, dplyr’s mutate is not the only way to create a new variable. Tidyverse has a host of commands that are useful for creating new variables in different scenarios.

In this post, we will see examples of 9 ways to create new variables with tidyverse.

Let us load the tidyverse packages and the gapminder package. We will use the gapminder data frame from the gapminder package.

library(tidyverse)
library(gapminder)

Let us subset the gapminder data frame so that we have just three columns/variables and just 4 rows of data.

gapminder <- gapminder %>%
  select(country,year,pop) %>%
  head(n=4)

1. mutate

With the easy-to-use mutate verb, one can create a new variable, in this example pop_in_mill (population in millions) from “pop”, as follows. You can see that the resulting data frame has the new column “pop_in_mill”.

gapminder %>%
  mutate(pop_in_mill= pop/1e06)

country year pop pop_in_mill
<fctr> <int> <dbl> <dbl>
Afghanistan	1952	8425333	8.425333	
Afghanistan	1957	9240934	9.240934	
Afghanistan	1962	10267083	10.267083	
Afghanistan	1967	11537966	11.537966	
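
A single mutate call can also create several variables at once, and a later one may refer to one created earlier in the same call. Here is a minimal sketch of that, assuming a small hand-built data frame df that mirrors our four rows; the column pop_in_bill is a hypothetical name used only for illustration.

```r
library(dplyr)
library(tibble)

# Hand-built stand-in for the four-row gapminder subset used in this post
df <- tibble(
  country = rep("Afghanistan", 4),
  year = c(1952L, 1957L, 1962L, 1967L),
  pop = c(8425333, 9240934, 10267083, 11537966)
)

# pop_in_bill (hypothetical column) is computed from pop_in_mill,
# which is created earlier in the same mutate() call
res_mutate <- df %>%
  mutate(pop_in_mill = pop / 1e6,
         pop_in_bill = pop_in_mill / 1e3)

res_mutate
```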

2. transmute

Sometimes one may want to create a new variable but is not interested in the original variables present in the data frame. In those cases, the relatively little-known dplyr verb transmute is very useful. In this example, we create a new variable “pop_in_mill” with transmute. Note that the resulting data frame contains only the new variable, nothing else.

gapminder %>% 
  transmute(pop_in_mill=pop/1e06)

pop_in_mill
<dbl>
8.425333				
9.240934				
10.267083				
11.537966	
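
If one wants to keep a few of the original columns alongside the new variable, they can be named inside transmute as well. The sketch below assumes a small hand-built data frame df that mirrors our four rows.

```r
library(dplyr)
library(tibble)

# Hand-built stand-in for the four-row gapminder subset used in this post
df <- tibble(
  country = rep("Afghanistan", 4),
  year = c(1952L, 1957L, 1962L, 1967L),
  pop = c(8425333, 9240934, 10267083, 11537966)
)

# Naming an existing column inside transmute() keeps it in the result;
# pop is not named, so it is dropped
res_transmute <- df %>%
  transmute(country, year, pop_in_mill = pop / 1e6)

names(res_transmute)
```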

3. mutate_at

dplyr also has the mutate_at verb, which can be very useful for changing specific columns in a data frame. In this simple example illustrating mutate_at, we specify the column we want to change and a function describing how to change it. Note that it does not create a column with a new name; instead, the new values replace the existing column under the same name.

gapminder %>% 
  mutate_at(c("pop"), function(x){x/1e6})

country year pop
<fctr> <int> <dbl>
Afghanistan	1952	8.425333		
Afghanistan	1957	9.240934		
Afghanistan	1962	10.267083		
Afghanistan	1967	11.537966	

The verb mutate_at can be extremely useful when you want to apply the same rule to multiple columns whose names share a pattern.
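
As a rough sketch of the pattern-based case, assume a hypothetical data frame with two columns whose names start with “pop_” (the column names and values are made up for illustration). The second call shows what should be an equivalent form with across(), assuming dplyr 1.0.0 or later, where the scoped verbs such as mutate_at are superseded but still available.

```r
library(dplyr)
library(tibble)

# Hypothetical data frame with two columns sharing the prefix "pop_"
df <- tibble(
  country = c("A", "B"),
  pop_1952 = c(8425333, 1282697),
  pop_1957 = c(9240934, 1476505)
)

# Divide every column whose name starts with "pop_" by one million
res_at <- df %>%
  mutate_at(vars(starts_with("pop_")), function(x) x / 1e6)

# Same selection and rule written with across() (assumes dplyr >= 1.0.0)
res_across <- df %>%
  mutate(across(starts_with("pop_"), function(x) x / 1e6))

res_at
```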

4. mutate_if

mutate_if is a very useful verb when one wants to check a condition on each column and change the column only if the condition is met. In the toy example below, we use mutate_if to check whether a column is of integer type and, if so, convert it to character type.

Note that the resulting data frame no longer has any column of integer type.

gapminder %>%
  mutate_if(is.integer, as.character)


country year pop
<fctr> <chr> <dbl>
Afghanistan	1952	8425333		
Afghanistan	1957	9240934		
Afghanistan	1962	10267083		
Afghanistan	1967	11537966	
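
For readers on dplyr 1.0.0 or later, the same condition-based change can likely be written with across() and where(). The sketch below assumes a small hand-built data frame df and compares the column types produced by the two forms.

```r
library(dplyr)
library(tibble)

# Hand-built stand-in with a factor, an integer, and a double column
df <- tibble(
  country = factor(rep("Afghanistan", 2)),
  year = c(1952L, 1957L),
  pop = c(8425333, 9240934)
)

# Scoped verb: convert only the integer columns to character
res_if <- df %>%
  mutate_if(is.integer, as.character)

# across() + where() form (assumes dplyr >= 1.0.0)
res_where <- df %>%
  mutate(across(where(is.integer), as.character))

# Only year changes type; the factor and double columns are untouched
sapply(res_if, class)
```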

5. mutate_all

mutate_all is another useful verb that can be used to change every column. In the example below, we change the type of every column to character, regardless of its initial type.

gapminder %>%
  mutate_all(as.character)

country year pop
<chr> <chr> <chr>
Afghanistan	1952	8425333		
Afghanistan	1957	9240934		
Afghanistan	1962	10267083		
Afghanistan	1967	11537966

6. add_column

Recent versions of tibble have a very convenient function called add_column() that helps add a new column quickly on the fly. The add_column() function does not change the existing data, and it cannot overwrite an existing column.

gapminder %>% 
  add_column(id=1:4)
 
country year pop id
<fctr> <int> <dbl> <int>
1	Afghanistan	1952	8425333	1
2	Afghanistan	1957	9240934	2
3	Afghanistan	1962	10267083	3
4	Afghanistan	1967	11537966	4

The add_column() function also has the arguments .before and .after. One can use them to specify where the new column should be placed.
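
Here is a small sketch of the positioning arguments, again assuming a hand-built data frame df that mirrors our four rows. The last step tries to add a column named “pop”, which already exists, to illustrate that add_column() refuses to overwrite it.

```r
library(tibble)

# Hand-built stand-in for the four-row gapminder subset used in this post
df <- tibble(
  country = rep("Afghanistan", 4),
  year = c(1952L, 1957L, 1962L, 1967L),
  pop = c(8425333, 9240934, 10267083, 11537966)
)

# Put the new column first, by position
res_first <- add_column(df, id = 1:4, .before = 1)

# Put the new column right after "country", by name
res_after <- add_column(df, id = 1:4, .after = "country")

# Adding a column whose name already exists raises an error
overwrite_failed <- tryCatch({
  add_column(df, pop = rep(0, 4))
  FALSE
}, error = function(e) TRUE)

names(res_first)
names(res_after)
```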

7. add_count

add_count() is a very convenient function that helps quickly count based on a variable. For example, if we want to add a column giving the number of rows for each value of the “country” variable, we can use add_count(country) as shown below. The add_count() function will group_by each country and get a tally count. This count will be added as a new column named “n”.

gapminder %>% 
  add_count(country)

country year pop n
<fctr> <int> <dbl> <int>
Afghanistan	1952	8425333	4	
Afghanistan	1957	9240934	4	
Afghanistan	1962	10267083	4	
Afghanistan	1967	11537966	4	
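
Because our subset has a single country, every row gets the same count. To make the per-group behaviour visible, the sketch below assumes a hypothetical data frame with two countries and compares add_count(country) with the longer group_by, mutate, and ungroup version.

```r
library(dplyr)
library(tibble)

# Hypothetical data frame with two countries of different sizes
df <- tibble(
  country = c("Afghanistan", "Afghanistan", "Albania"),
  year = c(1952L, 1957L, 1952L)
)

# One-step version
res_count <- df %>%
  add_count(country)

# Spelled-out version: group, count rows per group, then ungroup
res_manual <- df %>%
  group_by(country) %>%
  mutate(n = n()) %>%
  ungroup()

res_count$n
# 2 2 1
```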

8. add_tally

The function add_tally() adds a column n to a table based on the number of rows within each existing group. Our data frame is not grouped, so n is simply the total number of rows, 4.

gapminder %>% 
  add_tally()

country year pop n
<fctr> <int> <dbl> <int>
Afghanistan	1952	8425333	4	
Afghanistan	1957	9240934	4	
Afghanistan	1962	10267083	4	
Afghanistan	1967	11537966	4
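
Since add_tally() relies on whatever grouping is already present, the result changes once the data frame is grouped. A minimal sketch, assuming a hypothetical data frame with two countries:

```r
library(dplyr)
library(tibble)

# Hypothetical data frame with two countries of different sizes
df <- tibble(
  country = c("Afghanistan", "Afghanistan", "Albania"),
  year = c(1952L, 1957L, 1952L)
)

# Ungrouped: n is the total number of rows
res_ungrouped <- df %>%
  add_tally()

# Grouped by country: n is the number of rows in each group
res_grouped <- df %>%
  group_by(country) %>%
  add_tally() %>%
  ungroup()

res_ungrouped$n
# 3 3 3
res_grouped$n
# 2 2 1
```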

9. rename

Often we would like to rename a column. This is not adding a new column per se; rather, an old column gets a new name. The rename function is very handy for making such column name changes.

One passes the new name to the rename function as an argument, set equal to the old name, as follows. Here “population” is the new name and “pop” is the name of the old column in the data frame.

gapminder %>% 
  rename(population=pop)

country year population
<fctr> <int> <dbl>
1	Afghanistan	1952	8425333	
2	Afghanistan	1957	9240934	
3	Afghanistan	1962	10267083	
4	Afghanistan	1967	11537966	
