How to Replace Multiple Column Values with Dictionary in Python

Pandas Replace Multiple Column Values with Dictionary

Sometimes you want to change the contents of a Pandas dataframe, that is, the values in one or more columns (not the column names), to some specific values. Pandas’ replace() function is a versatile way to do this. First, we will see how to replace values across multiple columns of a Pandas dataframe using a dictionary, where each key is a value we want to replace and the corresponding dictionary value is what we want in its place.


We will use Pandas’ replace() function to change the values of multiple columns at the same time. Let us first load Pandas.

import pandas as pd
# import sample() from the random module
from random import sample

Let us create some data using sample() from the random module.

# Create a list of names
name_list = ["name1", "name2","name3","name4"]

Using the name list, let us create three lists with the sample() function. Since we sample all four names each time, each list is a random shuffle of name_list.

cluster1 = sample(name_list,4)
cluster2 = sample(name_list,4)
cluster3 = sample(name_list,4)

Now, we can use these lists to create a dataframe with 3 columns.

df = pd.DataFrame({"cluster1":cluster1,
              "cluster2":cluster2,
              "cluster3":cluster3,
             })
df

Our dataframe looks like this. Since sample() is random, your values may appear in a different order.

	cluster1	cluster2	cluster3
0	name1	name1	name4
1	name4	name3	name1
2	name3	name4	name3
3	name2	name2	name2
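Because sample() shuffles randomly, re-running the code above will usually give a different table. As a minimal sketch for following along, the dataframe can instead be built from fixed lists copied from the output shown above (the hard-coded lists are an assumption made for reproducibility, not part of the original code):

```python
import pandas as pd

# Fixed values copied from the output table above, so every run
# produces the same dataframe instead of a random shuffle.
df = pd.DataFrame({
    "cluster1": ["name1", "name4", "name3", "name2"],
    "cluster2": ["name1", "name3", "name4", "name2"],
    "cluster3": ["name4", "name1", "name3", "name2"],
})
print(df)
```

With this version, the replace() results in the rest of the post should match the tables shown.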

If we want to create a new dataframe in which the values of all columns are replaced at the same time, we can use a Python dictionary to specify how each value should be replaced. In this example, the columns of our dataframe are made up of four values: name1, name2, name3, and name4. In the dictionary, we map each of these values to its new value and pass the dictionary as input to the replace() function.

df.replace({"name1":"Symbol1",
            "name2":"Symbol2",
            "name3":"Symbol3",
            "name4":"Symbol4"})

Now we get a new dataframe in which the values of multiple columns have been replaced at the same time.

	cluster1	cluster2	cluster3
0	Symbol1	Symbol1	Symbol4
1	Symbol4	Symbol3	Symbol1
2	Symbol3	Symbol4	Symbol3
3	Symbol2	Symbol2	Symbol2
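Note that replace() returns a new dataframe and leaves the original one untouched, so the result has to be assigned to a variable if we want to keep it. A small sketch of this behavior, using a toy two-column dataframe and the illustrative names mapping and replaced (neither is from the example above):

```python
import pandas as pd

# Small toy dataframe, for illustration only
df = pd.DataFrame({"cluster1": ["name1", "name2"],
                   "cluster2": ["name2", "name1"]})

mapping = {"name1": "Symbol1", "name2": "Symbol2"}

# replace() returns a new dataframe; df itself is not modified
replaced = df.replace(mapping)

print(df)        # still contains name1 / name2
print(replaced)  # contains Symbol1 / Symbol2
```

To keep the change under the original name, one option is to reassign the result, as in df = df.replace(mapping).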

We can also create the dictionary beforehand and pass it to Pandas’ replace() function to replace values in multiple columns.

symbol_list = ["symbol1", "symbol2", "symbol3", "symbol4"]
n2s = dict(zip(name_list, symbol_list))
n2s

{'name1': 'symbol1',
 'name2': 'symbol2',
 'name3': 'symbol3',
 'name4': 'symbol4'}

df.replace(n2s)

	cluster1	cluster2	cluster3
0	symbol1	symbol1	symbol4
1	symbol4	symbol3	symbol1
2	symbol3	symbol4	symbol3
3	symbol2	symbol2	symbol2

Pandas replace(): How to Replace Values of a Specific Column with a Dictionary?

In the above example, we replaced values in all columns at the same time. With replace() we can also target a specific column. In the example below, we pass a nested dictionary: the outer key is the column of interest, and the inner dictionary maps old values to new values.

df.replace({'cluster1': {"name1": "SYMBOL",
                        "name2":"Symbooooo"}})

Note that we have changed only the first column’s values using the dictionary. The other columns’ values remain the same.

	cluster1	cluster2	cluster3
0	SYMBOL	name1	name4
1	name4	name3	name1
2	name3	name4	name3
3	Symbooooo	name2	name2
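An alternative way to get the same result is to select the column first and call replace() on it with a flat dictionary. The sketch below compares the two approaches on a toy dataframe; the names col_map, nested, and by_column are illustrative and not from the example above:

```python
import pandas as pd

# Small toy dataframe, for illustration only
df = pd.DataFrame({"cluster1": ["name1", "name2"],
                   "cluster2": ["name1", "name2"]})

col_map = {"name1": "SYMBOL", "name2": "Symbooooo"}

# Option 1: nested dictionary, keyed by column name
nested = df.replace({"cluster1": col_map})

# Option 2: select the column and replace its values directly
by_column = df.copy()
by_column["cluster1"] = by_column["cluster1"].replace(col_map)

# Both approaches change cluster1 only and leave cluster2 as it was
print(nested)
print(by_column)
```

The nested dictionary form is convenient when several columns each need their own mapping, since all of them can go into a single replace() call.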

Pandas replace(): How to Replace a Single Value?

Instead of a dictionary, we can also change a single value in a dataframe to another value. To do that we specify the value to be replaced and the value we want as shown below.

df.replace("name1", "SYMBOL")

In this example, we have changed every instance of “name1” to “SYMBOL”.

	cluster1	cluster2	cluster3
0	SYMBOL	SYMBOL	name4
1	name4	name3	SYMBOL
2	name3	name4	name3
3	name2	name2	name2
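One detail worth knowing is that, by default, replace() matches whole cell values and not parts of a string. The sketch below illustrates the difference on a toy dataframe, using the regex=True argument of replace() for the substring case; the names whole and partial are illustrative only:

```python
import pandas as pd

# Small toy dataframe, for illustration only
df = pd.DataFrame({"cluster1": ["name1", "name2"]})

# No cell is exactly equal to "name", so nothing is replaced
whole = df.replace("name", "SYMBOL")

# With regex=True the pattern is matched inside each string,
# so the "name" part of every value is replaced
partial = df.replace("name", "SYMBOL", regex=True)

print(whole)
print(partial)
```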

Pandas replace(): How to Replace Multiple Values with a Single Value?

Pandas’ replace() function is versatile. We can also provide a list of the values we would like to replace. In this example, we replace every value in the list with a single value.

df.replace(["name1", "name2","name3"], "SYMBOL")

Here, we have changed every instance of “name1”, “name2”, and “name3” to “SYMBOL”.

	cluster1	cluster2	cluster3
0	SYMBOL	SYMBOL	name4
1	name4	SYMBOL	SYMBOL
2	SYMBOL	name4	SYMBOL
3	SYMBOL	SYMBOL	SYMBOL
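replace() also accepts a list of new values in place of the single value, as long as it has the same length as the list of values to replace; the values are then paired up by position. A minimal sketch on a toy dataframe (the replacement strings "A" and "B" and the name paired are illustrative only):

```python
import pandas as pd

# Small toy dataframe, for illustration only
df = pd.DataFrame({"cluster1": ["name1", "name2", "name3"]})

# Values are paired by position: name1 -> A, name2 -> B.
# name3 is not in the first list, so it is left unchanged.
paired = df.replace(["name1", "name2"], ["A", "B"])

print(paired)
```

This gives the same kind of one-to-one mapping as the dictionary approach, which is usually easier to read when there are many pairs.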
