Python and R Tips

Learn Data Science with Python and R

3 Ways to Add New Columns to Pandas Dataframe?

January 26, 2019 by cmdlinetips

How To Add New Column in Pandas?
While doing data wrangling or data manipulation, one often wants to add a new column or variable to an existing Pandas dataframe without changing anything else. The new column must have the same number of elements as the data frame has rows.
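As a quick illustration of the length requirement, here is a minimal sketch on a tiny hand-made data frame (the frame `df` and the column name `too_short` are made up for this illustration). A list with one value per row is accepted, while a list of the wrong length is rejected with a ValueError.

```python
import pandas as pd

# Tiny hand-made data frame with three rows (hypothetical example data).
df = pd.DataFrame({"country": ["Afghanistan"] * 3, "year": [1952, 1957, 1962]})

# A list with one value per row is accepted.
df["pop"] = [8425333.0, 9240934.0, 10267083.0]

# A list with the wrong number of values raises a ValueError.
try:
    df["too_short"] = [1.0, 2.0]
    length_error = None
except ValueError as err:
    length_error = err

print(length_error)
```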

Let us see examples of three ways to add new columns to a Pandas data frame.

Let us first load the pandas library.

import pandas as pd

We will use the gapminder data set in our examples of adding a new column or variable. The data comes from the Software Carpentry website and is given as data_url below.

data_url = 'http://bit.ly/2cLzoxH'
# load the gapminder data from the web as a data frame
gapminder = pd.read_csv(data_url)
# select four columns
gapminder = gapminder[['country','year', 'gdpPercap', 'pop']]
# view the first few rows of the data frame
print(gapminder.head(3))
       country  year   gdpPercap         pop
0  Afghanistan  1952  779.445314   8425333.0
1  Afghanistan  1957  820.853030   9240934.0
2  Afghanistan  1962  853.100710  10267083.0

How To Add New Column to Pandas Dataframe by Indexing: Example 1

Let us say we want to create a new column from an existing column in the data frame. We can create a new column by indexing, using the same square bracket notation we use to access an existing column.

For example, we can create a new column with population values in millions, in addition to the original variable:

# add new column using square bracket notation
gapminder['pop_in_millions'] = gapminder['pop']/1e06
# view the first few rows with the new column
print(gapminder.head(3))

       country  year   gdpPercap         pop  pop_in_millions
0  Afghanistan  1952  779.445314   8425333.0         8.425333
1  Afghanistan  1957  820.853030   9240934.0         9.240934
2  Afghanistan  1962  853.100710  10267083.0        10.267083
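If you want to try the bracket notation without downloading the data, here is a small self-contained sketch. It types in the three Afghanistan rows shown above by hand; the name `sample` and the extra `source` column are made up for this illustration. It also shows that a single scalar value is repeated for every row.

```python
import pandas as pd

# First three gapminder rows from the output above, typed in by hand.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
})

# New column computed element-wise from an existing column.
sample["pop_in_millions"] = sample["pop"] / 1e6

# A scalar value is broadcast to every row (hypothetical extra column).
sample["source"] = "gapminder"

print(sample)
```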

How To Add New Column to Pandas Dataframe using loc: Example 2

Another way to add a new column to a dataframe is to use the “loc” indexer. Here we select all rows with “:”, name the new column, and assign its values.

 
gapminder.loc[:,'pop_in_millions'] = gapminder['pop']/1e06
gapminder.head(3)

       country  year   gdpPercap         pop  pop_in_millions
0  Afghanistan  1952  779.445314   8425333.0         8.425333
1  Afghanistan  1957  820.853030   9240934.0         9.240934
2  Afghanistan  1962  853.100710  10267083.0        10.267083
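To check that the “loc” approach gives the same result as square bracket notation, here is a minimal sketch on a hand-typed copy of the three rows above (the names `sample`, `by_bracket` and `by_loc` are made up for this illustration).

```python
import pandas as pd

# First three gapminder rows from the output above, typed in by hand.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Afghanistan"],
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
})

# Add the column with square bracket notation on one copy.
by_bracket = sample.copy()
by_bracket["pop_in_millions"] = by_bracket["pop"] / 1e6

# Add the same column with loc on another copy.
by_loc = sample.copy()
by_loc.loc[:, "pop_in_millions"] = by_loc["pop"] / 1e6

# Compare the two resulting data frames.
same_result = by_loc.equals(by_bracket)
print(same_result)
```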

How To Add New Column to Pandas Dataframe using assign: Example 3

Inspired by dplyr’s mutate function in R, Pandas has a method called “assign” to add new columns. We can simply chain “assign” onto the data frame.

 
gapminder.assign(pop_in_millions=gapminder['pop']/1e06).head(3) 

       country  year   gdpPercap         pop  pop_in_millions
0  Afghanistan  1952  779.445314   8425333.0         8.425333
1  Afghanistan  1957  820.853030   9240934.0         9.240934
2  Afghanistan  1962  853.100710  10267083.0        10.267083

It returns a new data frame that contains all the original columns plus the new ones; the original data frame itself is not modified. Remember that if you use the name of an existing column, that column will be overwritten in the returned data frame.
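Here is a small sketch of both points, using hand-typed values from the rows above (the names `sample`, `with_millions` and `rescaled` are made up for this illustration). The original data frame keeps its columns and values, and reusing an existing column name replaces that column only in the returned data frame.

```python
import pandas as pd

# Year and population for the three rows shown above, typed in by hand.
sample = pd.DataFrame({
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
})

# assign returns a new data frame; sample itself is left untouched.
with_millions = sample.assign(pop_in_millions=sample["pop"] / 1e6)
print(list(sample.columns))
print(list(with_millions.columns))

# Reusing an existing column name overwrites it in the returned data frame only.
rescaled = sample.assign(pop=sample["pop"] / 1e6)
print(rescaled["pop"].tolist())
print(sample["pop"].tolist())
```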

With the assign function, we can also use a function to add a new column. The function receives the data frame and returns the values for the new column. Here we use a lambda function to create the new column with population in millions.

gapminder.assign(pop_in_millions=lambda x: x['pop']/1e06).head()
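One situation where the lambda form can be handy is when assign is chained after another step, so the intermediate data frame has no name of its own. The sketch below is only an illustration under that assumption: it filters a hand-typed copy of the three rows above and then adds the column, with the lambda receiving the filtered data frame (the names `sample` and `recent` are made up).

```python
import pandas as pd

# Year and population for the three rows shown above, typed in by hand.
sample = pd.DataFrame({
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
})

# Keep rows after 1952, then add the new column in the same chain.
# The lambda's argument x is the filtered data frame, not the full sample.
recent = (
    sample[sample["year"] > 1952]
    .assign(pop_in_millions=lambda x: x["pop"] / 1e6)
)

print(recent)
```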

With Python 3.6+ (and Pandas 0.23 or later), one can create multiple new columns in the same assign statement, such that one of the new columns uses another column created earlier in that same statement.

For example, we can create two new variables such that the second new variable uses the first new column as shown below.

gapminder.assign(pop_in_millions=lambda x: x['pop']/1e6,
                pop_in_billions=lambda x: x['pop_in_millions']/1e3).head()
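The same pattern can be tried on a small hand-typed data frame, as in this minimal sketch (the names `sample` and `result` are made up for this illustration). The second lambda reads the pop_in_millions column created by the first one.

```python
import pandas as pd

# Year and population for the three rows shown above, typed in by hand.
sample = pd.DataFrame({
    "year": [1952, 1957, 1962],
    "pop": [8425333.0, 9240934.0, 10267083.0],
})

# The second new column is computed from the first new column.
result = sample.assign(
    pop_in_millions=lambda x: x["pop"] / 1e6,
    pop_in_billions=lambda x: x["pop_in_millions"] / 1e3,
)

print(result)
```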

