How To Create a Column Using Condition on Another Column in Pandas?

May 19, 2019 by cmdlinetips

Often while cleaning data, one might want to create a new variable or column based on the values of another column using conditions.

In this post we will see two different ways to create a column based on values of another column using conditional statements.

First we will use NumPy’s lesser-known where function to create a column in Pandas using an if condition on another column’s values. Next we will use Pandas’ apply function to do the same.

Let us first load Pandas and NumPy.

import pandas as pd
import numpy as np

Let us use the gapminder dataset from the Carpentries for these examples.

data_url = 'http://bit.ly/2cLzoxH'
gapminder = pd.read_csv(data_url)
print(gapminder.head(n=3))
       country  year         pop continent  lifeExp   gdpPercap
0  Afghanistan  1952   8425333.0      Asia   28.801  779.445314
1  Afghanistan  1957   9240934.0      Asia   30.332  820.853030
2  Afghanistan  1962  10267083.0      Asia   31.997  853.100710

How to Create a Column Using a Condition in Pandas with NumPy?

Let us use the lifeExp column to create a new column that is True if lifeExp >= 50 and False otherwise.

We will use NumPy’s where function on the lifeExp column to create the new Boolean column.

# Create a new column based on the values of another column
# np.where assigns True where gapminder.lifeExp >= 50 and False elsewhere
gapminder['lifeExp_ind'] = np.where(gapminder.lifeExp >= 50, True, False)
gapminder.head(n=3)

We can see that we now have a new column, “lifeExp_ind”, holding True or False.

       country  year         pop continent  lifeExp   gdpPercap  lifeExp_ind
0  Afghanistan  1952   8425333.0      Asia   28.801  779.445314        False
1  Afghanistan  1957   9240934.0      Asia   30.332  820.853030        False
2  Afghanistan  1962  10267083.0      Asia   31.997  853.100710        False
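The same np.where pattern is not limited to True and False: the second and third arguments can be any pair of values. Here is a minimal sketch, assuming a small hand-made DataFrame (the rows in `df` and the labels "high" and "low" are made up for illustration and are not taken from gapminder).

```python
import numpy as np
import pandas as pd

# Illustrative data only; these rows are not from the gapminder dataset
df = pd.DataFrame({"country": ["A", "B", "C"],
                   "lifeExp": [28.8, 50.0, 71.2]})

# np.where(condition, value_if_true, value_if_false) is applied elementwise
df["lifeExp_label"] = np.where(df["lifeExp"] >= 50, "high", "low")

# When only True/False is needed, the comparison alone gives a Boolean column
df["lifeExp_ind"] = df["lifeExp"] >= 50

print(df)
```

For a plain Boolean indicator, the bare comparison is probably the simpler choice; np.where becomes useful once the two outcomes are something other than True and False.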

How to Create a Column Using a Condition in Pandas with apply and Lambda Functions

Actually, we don’t have to rely on NumPy to create a new column using a condition on another column. Instead, we can use Pandas’ apply function with a lambda function.

gapminder['gdpPercap_ind'] = gapminder.gdpPercap.apply(lambda x: 1 if x >= 1000 else 0)
gapminder.head(n=3)

       country  year         pop continent  lifeExp   gdpPercap  lifeExp_ind  gdpPercap_ind
0  Afghanistan  1952   8425333.0      Asia   28.801  779.445314        False              0
1  Afghanistan  1957   9240934.0      Asia   30.332  820.853030        False              0
2  Afghanistan  1962  10267083.0      Asia   31.997  853.100710        False              0
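For a simple threshold like this one, apply with a lambda calls a Python function once per row, and a vectorized comparison should give the same 0/1 column. The sketch below compares the two on a tiny hand-made DataFrame (the values in `df` are illustrative, not from gapminder).

```python
import pandas as pd

# Illustrative data only; these values are not from the gapminder dataset
df = pd.DataFrame({"gdpPercap": [779.4, 1000.0, 2500.3]})

# Row-by-row version, as in the post
via_apply = df["gdpPercap"].apply(lambda x: 1 if x >= 1000 else 0)

# Vectorized version: compare the whole column, then cast True/False to 1/0
via_vector = (df["gdpPercap"] >= 1000).astype(int)

# Check that both approaches agree on every row
same = (via_apply == via_vector).all()
print(same)
```

The apply version remains handy when the per-row logic is too involved to express as a single column-wise comparison.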

Similarly, we can create more complex conditionals. In this example, we check whether the value is in a list and assign 1 if it is present and 0 otherwise.

# The gapminder continent label is 'Americas', not 'America'
gapminder['continent_group'] = gapminder.continent.apply(lambda x: 1 if x in ['Europe', 'Americas', 'Oceania'] else 0)
gapminder.head(n=3)


       country  year         pop continent  lifeExp   gdpPercap  lifeExp_ind  gdpPercap_ind  continent_group
0  Afghanistan  1952   8425333.0      Asia   28.801  779.445314        False              0                0
1  Afghanistan  1957   9240934.0      Asia   30.332  820.853030        False              0                0
2  Afghanistan  1962  10267083.0      Asia   31.997  853.100710        False              0                0
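A membership test like this can also be written with Pandas’ isin method, which checks every value in the column against a list in one call. This is a minimal sketch on a hand-made DataFrame; the rows in `df` and the name `groups` are assumptions for illustration, and the continent spellings are chosen to match the labels I believe gapminder uses.

```python
import pandas as pd

# Illustrative data only; one row per continent label
df = pd.DataFrame({"continent": ["Asia", "Europe", "Americas", "Oceania", "Africa"]})

# Hypothetical list of continents to flag
groups = ["Europe", "Americas", "Oceania"]

# isin returns a Boolean column; astype(int) turns it into 1/0
df["continent_group"] = df["continent"].isin(groups).astype(int)

print(df)
```

Whichever form is used, it is worth confirming that the strings in the list match the column’s values exactly (for example with `df["continent"].unique()`), because a misspelled label silently yields 0 for every row.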
