Difference between Pandas where() function and mask() function

Pandas mask() and where() are two related functions that replace elements of a Pandas dataframe depending on whether they satisfy a condition. Both preserve the shape of the dataframe. In this post, we will first see simple examples of using Pandas where() and mask() and then learn the key difference between the two.

Let us first load Pandas and Numpy.

import pandas as pd
import numpy as np

Pandas where() function example

Pandas where() function takes a condition as input and replaces values where the condition is False. A simplified version of the Pandas where() syntax is as follows.

DataFrame.where(cond, other=nan, 
               inplace=False)

To illustrate how Pandas where() and mask() functions work, let us create a toy dataframe with two columns.

df = pd.DataFrame(np.arange(10).reshape(-1, 2), 
                  columns=['A', 'B'])
df

	A	B
0	0	1
1	2	3
2	4	5
3	6	7
4	8	9

If we simply provide a condition to test, Pandas where() replaces values with NaN wherever the condition fails. In our example, the values in the first three rows are less than or equal to 5 and the rest are greater than 5. If we use the condition df > 5, Pandas where() replaces the values in the first three rows with NaN, as they fail the condition.

df.where(df > 5)

	A	B
0	NaN	NaN
1	NaN	NaN
2	NaN	NaN
3	6.0	7.0
4	8.0	9.0

By using the “other” argument, we can replace those values with a value of our choice instead of NaN.

df.where(df > 5, other=999)

	A	B
0	999	999
1	999	999
2	999	999
3	6	7
4	8	9
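
The two outputs above differ in a small way: the default replacement shows 6.0 and 7.0, while the version with other=999 shows 6 and 7. A minimal sketch of why, assuming the same toy dataframe (the names with_nan and with_999 are only illustrative): NaN cannot be stored in an integer column, so the columns become floats, whereas an integer replacement keeps them as integers.

```python
import numpy as np
import pandas as pd

# Same toy dataframe as in the post
df = pd.DataFrame(np.arange(10).reshape(-1, 2),
                  columns=['A', 'B'])

# Default replacement is NaN, which forces the integer columns to float
with_nan = df.where(df > 5)
print(with_nan.dtypes)

# An integer replacement value lets the columns stay integer
with_999 = df.where(df > 5, other=999)
print(with_999.dtypes)
```

The same behaviour applies to mask(), since it also fills with NaN by default.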

Pandas mask() function example

Pandas mask() function is the boolean inverse of the where() function. Like where(), mask() takes a condition as input and replaces values in the data. However, mask() replaces values where the condition is True, whereas where() replaces them where the condition is False.
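
One way to check this inverse relationship is to negate the condition: masking with a condition should give the same result as calling where() with the opposite condition. The snippet below is a small sketch on the toy dataframe from this post; the variable names cond, via_mask and via_where are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2),
                  columns=['A', 'B'])

cond = df > 5

# mask() replaces where cond is True ...
via_mask = df.mask(cond)
# ... which matches where() applied to the negated condition
via_where = df.where(~cond)

# DataFrame.equals treats NaNs in the same positions as equal
print(via_mask.equals(via_where))  # True

# The same holds when a replacement value is given
print(df.mask(cond, other=999).equals(df.where(~cond, other=999)))  # True
```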

The toy example in the figure shows the difference between the Pandas where() function and the mask() function.

Difference between Pandas mask() and where()

In the simple use case, the arguments of Pandas mask() are very similar to those of where(), and its syntax looks like this.

DataFrame.mask(cond, other=nan, 
               inplace=False)

Pandas mask() function takes a condition and, through the “other” argument, a replacement value. By default, values are replaced with a missing value (NaN) where the condition is True. In our example, the values in the last two rows are greater than 5, so they are replaced with NaN.

df.mask(df > 5)

	A	B
0	0.0	1.0
1	2.0	3.0
2	4.0	5.0
3	NaN	NaN
4	NaN	NaN

With the other argument, we can specify the replacement value. In the example below we replace the values with 999.

df.mask(df > 5, other=999)

	A	B
0	0	1
1	2	3
2	4	5
3	999	999
4	999	999
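
To tie the two functions together, here is a short sketch (again on the toy dataframe, with illustrative variable names) that checks two things stated in this post: both results keep the shape of the original dataframe, and for the same condition the two functions change complementary sets of cells.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(-1, 2),
                  columns=['A', 'B'])
cond = df > 5

where_result = df.where(cond, other=999)
mask_result = df.mask(cond, other=999)

# Both results have the same shape as the original dataframe
print(df.shape, where_result.shape, mask_result.shape)

# where() keeps exactly the cells where cond is True;
# mask() keeps exactly the cells where cond is False
# (999 does not occur in the toy data, so unchanged cells are the kept ones)
print((where_result == df).equals(cond))
print((mask_result == df).equals(~cond))
```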