Python and R Tips

Learn Data Science with Python and R

Pandas case_when() with multiple examples

February 12, 2024 by cmdlinetips

Pandas 2.2.0, the newest release at the time of writing, adds one of its most useful functions: case_when(), available as a method on the Pandas Series object. Often you want to create a new variable from an existing variable using multiple conditions. For a simple binary condition we can use Pandas’ where() function. With the new case_when() function we can apply several conditions at once to create a new variable. In this post, we will see multiple examples of how to use the Pandas case_when() function.

Let us load Pandas and Numpy for creating some toy data.

import pandas as pd
import numpy as np

Until now, the pandas library had no direct equivalent of SQL’s widely used CASE WHEN expression. Starting with Pandas version 2.2.0, a Series has the case_when() function. Let us check the version of the installed Pandas.

pd.__version__
2.2.0

If the Pandas version is less than 2.2.0, you can install Pandas version 2.2.0 using pip by specifying the version we want to install.

pip install pandas==2.2.0
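
Instead of reading the version string by eye, we can also ask directly whether the Series class has the method. This is only a small sketch; the variable name has_case_when is made up for illustration.

```python
import pandas as pd

# True when the installed pandas provides Series.case_when (added in 2.2.0)
has_case_when = hasattr(pd.Series, "case_when")
print(pd.__version__, has_case_when)
```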

To get started with the Pandas case_when() function, let us create a simple Pandas Series with 5 elements.

scores = pd.Series(np.random.randint(10,100,5))
scores

0    29
1    45
2    90
3    69
4    68
dtype: int64
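
Since np.random.randint() is not seeded here, rerunning the line above gives different numbers each time. To follow along with the exact outputs shown in this post, one option is to build the Series directly from the values printed above.

```python
import pandas as pd

# Same five values as in the output above, typed in by hand
scores = pd.Series([29, 45, 90, 69, 68])
print(scores)
```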

Pandas case_when() syntax

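The Pandas case_when() function takes one argument, “caselist”. caselist expects a list of tuples, each pairing a condition with its replacement, of the form [(condition0, replacement0), (condition1, replacement1)]. Here each condition is a boolean Series or array with one entry per element (or a callable that returns one), and each replacement is a scalar, an array-like of the same length, or a callable.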

Pandas case_when() simple example

Let us start with a simple example using one condition. Here we check whether the score is greater than or equal to 35 and provide the replacement "pass". The resulting Series has the replacement wherever the condition is satisfied and keeps the original value wherever it is not. Because the result now mixes strings and integers, its dtype is object.

scores.case_when(caselist=[(scores >= 35, "pass")])

0      29
1    pass
2    pass
3    pass
4    pass
dtype: object

Pandas case_when() with two conditions

In the example below we specify two conditions and their replacements as the argument to the Pandas case_when() function.

scores.case_when(caselist=[(scores >= 35, "pass"),
                           (scores < 35, "fail")
                          ])
0    fail
1    pass
2    pass
3    pass
4    pass
dtype: object
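
For a two-way split like this one, the where() function mentioned in the introduction can give the same labels. The sketch below assumes the same five scores, typed in by hand, and the name binary is made up for illustration.

```python
import pandas as pd

scores = pd.Series([29, 45, 90, 69, 68])

# where() keeps values where the condition is True and
# replaces the others, so we keep "fail" only for scores below 35
binary = pd.Series("fail", index=scores.index).where(scores < 35, "pass")
print(binary)
```

With more than two outcomes this quickly turns into nested where() calls, which is where case_when() is easier to read.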

Create a New column based on existing column using Pandas case_when() in a dataframe

Pandas case_when() is extremely useful when you want to create a new column in a dataframe based on the values of an existing column using multiple conditions.

First, let us create a data frame with one column.

df = pd.DataFrame({"scores": scores})
df

   scores
0      29
1      45
2      90
3      69
4      68

We can create a new column that assigns binary grades based on the values of scores column.

df['grade'] = df.scores.case_when(caselist=[(scores >= 35, "pass"),
                                            (scores < 35, "fail")])
df

   scores grade
0      29  fail
1      45  pass
2      90  pass
3      69  pass
4      68  pass

In the example below we use multiple complex conditions to create a new column with multiple levels of grades defined based on the values of the scores column.

df['grade'] = df.scores.case_when(caselist=[(((scores >= 35) & (scores < 50)), "C"),
                                            (((scores >= 50) & (scores < 80)), "B"),
                                            (scores >= 80, "A"),
                                            (scores < 35, "D")])
df

   scores grade
0      29     D
1      45     C
2      90     A
3      69     B
4      68     B
