Sometimes you may want to change the contents of a Pandas dataframe, that is, the values in one or more columns (not the column names), to some specific values. Pandas’ replace() function is a versatile way to replace the contents of a Pandas dataframe. First, we will see how to replace values across multiple columns of a Pandas dataframe using a dictionary, where each key is a value we want to replace and the corresponding dictionary value is what we want in its place.
We will use Pandas’ replace() function to change the values of multiple columns at the same time. Let us first load Pandas.
import pandas as pd
# import random
from random import sample
Let us create some data using sample() from the random module.
# Create a list of names in Python
name_list = ["name1", "name2", "name3", "name4"]
Using the name list, let us create three variables with the sample() function. Each one is a randomly shuffled copy of the list.
cluster1 = sample(name_list, 4)
cluster2 = sample(name_list, 4)
cluster3 = sample(name_list, 4)
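Because sample() draws at random, the lists differ from run to run. As a minimal sketch, assuming you want repeatable results, the random module can be seeded before sampling. The seed value 42 and the names first_run and second_run are arbitrary choices for illustration, not part of the original example.

```python
import random
from random import sample

name_list = ["name1", "name2", "name3", "name4"]

# Seeding makes the pseudo-random draws repeatable (42 is an arbitrary seed)
random.seed(42)
first_run = sample(name_list, 4)

# Re-seeding with the same value reproduces the same shuffled order
random.seed(42)
second_run = sample(name_list, 4)

print(first_run == second_run)
```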
Now, we can use these lists to create a dataframe with 3 columns.
df = pd.DataFrame({"cluster1":cluster1,
                   "cluster2":cluster2,
                   "cluster3":cluster3,
                   })
df
Our dataframe looks like this.
  cluster1 cluster2 cluster3
0    name1    name1    name4
1    name4    name3    name1
2    name3    name4    name3
3    name2    name2    name2
If we want to create a new dataframe in which the values of all columns are replaced at the same time, we can use a Python dictionary to specify how each value should be replaced. In this example, the columns of our dataframe are made up of four values: name1, name2, name3, and name4. In the dictionary, we map each of these to its new value and pass the dictionary as input to the replace() function.
df.replace({"name1":"Symbol1", "name2":"Symbol2", "name3":"Symbol3", "name4":"Symbol4"})
Now we get a new dataframe with the values of multiple columns replaced at the same time.
  cluster1 cluster2 cluster3
0  Symbol1  Symbol1  Symbol4
1  Symbol4  Symbol3  Symbol1
2  Symbol3  Symbol4  Symbol3
3  Symbol2  Symbol2  Symbol2
We can also create the dictionary beforehand and pass it to Pandas’ replace() function to replace the values in multiple columns.
symbol_list = ["symbol1", "symbol2", "symbol3", "symbol4"]
n2s = dict(zip(name_list, symbol_list))
n2s
{'name1': 'symbol1', 'name2': 'symbol2', 'name3': 'symbol3', 'name4': 'symbol4'}
df.replace(n2s)
  cluster1 cluster2 cluster3
0  symbol1  symbol1  symbol4
1  symbol4  symbol3  symbol1
2  symbol3  symbol4  symbol3
3  symbol2  symbol2  symbol2
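Note that replace() returns a new dataframe and leaves the original untouched, so the result has to be assigned to a variable if we want to keep it. The sketch below illustrates this with fixed data that mirrors the example table (instead of random sampling); the name df_symbols is just an illustrative choice.

```python
import pandas as pd

# Fixed data mirroring the example table (no random sampling here)
df = pd.DataFrame({
    "cluster1": ["name1", "name4", "name3", "name2"],
    "cluster2": ["name1", "name3", "name4", "name2"],
    "cluster3": ["name4", "name1", "name3", "name2"],
})
n2s = {"name1": "symbol1", "name2": "symbol2",
       "name3": "symbol3", "name4": "symbol4"}

# replace() returns a new dataframe; df itself is not modified
df_symbols = df.replace(n2s)

print(df.loc[0, "cluster1"])          # value in the original dataframe
print(df_symbols.loc[0, "cluster1"])  # value in the new dataframe
```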
Pandas replace(): How to Replace Values of a Specific Column with a Dictionary?
In the above example, we replaced the values in all columns at the same time. With replace() we can also specify a column of interest and change only its values.
In the example below, we use a nested dictionary: the outer key names the column of interest, and the inner dictionary specifies how its values should change.
df.replace({'cluster1': {"name1": "SYMBOL", "name2":"Symbooooo"}})
Note that we have changed only the first column’s values using the dictionary. The other columns’ values remain the same.
    cluster1 cluster2 cluster3
0     SYMBOL    name1    name4
1      name4    name3    name1
2      name3    name4    name3
3  Symbooooo    name2    name2
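The same nested-dictionary form can cover more than one column, with a separate mapping for each. Here is a minimal sketch using fixed data that mirrors the example table; the replacement strings and the name df_two are made up for illustration.

```python
import pandas as pd

# Fixed data mirroring the example table
df = pd.DataFrame({
    "cluster1": ["name1", "name4", "name3", "name2"],
    "cluster2": ["name1", "name3", "name4", "name2"],
    "cluster3": ["name4", "name1", "name3", "name2"],
})

# Outer keys are column names; each inner dictionary applies to that column only
df_two = df.replace({
    "cluster1": {"name1": "SYMBOL"},
    "cluster3": {"name4": "OTHER"},
})

print(df_two)
```

In this sketch, "name1" changes only in cluster1 and "name4" changes only in cluster3; cluster2 is not mentioned in the dictionary and stays as it was.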
Pandas replace(): How to Replace a Single Value?
Instead of a dictionary, we can also change a single value in a dataframe to another value. To do that, we specify the value to be replaced and the value we want in its place, as shown below.
df.replace("name1", "SYMBOL")
In this example, we have changed every instance of “name1” to “SYMBOL”.
  cluster1 cluster2 cluster3
0   SYMBOL   SYMBOL    name4
1    name4    name3   SYMBOL
2    name3    name4    name3
3    name2    name2    name2
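One detail worth knowing: by default, replace() matches whole cell values, not parts of a string. The sketch below, on fixed data mirroring the example table, contrasts the default behavior with replace()'s regex=True option, which treats the first argument as a regular expression and substitutes the matching part of each string. The variable names unchanged and partial are illustrative.

```python
import pandas as pd

# Fixed data mirroring the example table
df = pd.DataFrame({
    "cluster1": ["name1", "name4", "name3", "name2"],
    "cluster2": ["name1", "name3", "name4", "name2"],
    "cluster3": ["name4", "name1", "name3", "name2"],
})

# Default: no cell is exactly "name", so nothing is replaced
unchanged = df.replace("name", "SYMBOL")

# regex=True: the pattern "name" is substituted inside each matching string
partial = df.replace("name", "SYMBOL", regex=True)

print(unchanged.equals(df))
print(partial)
```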
Pandas replace(): How to Replace Multiple Values with a Single Value?
Pandas’ replace() function is versatile. We can also provide a list of the values we would like to replace. In this example, we replace every value in the list with a single value.
df.replace(["name1", "name2","name3"], "SYMBOL")
Here, we have changed every instance of name1, name2, and name3 to “SYMBOL”.
  cluster1 cluster2 cluster3
0   SYMBOL   SYMBOL    name4
1    name4   SYMBOL   SYMBOL
2   SYMBOL    name4   SYMBOL
3   SYMBOL   SYMBOL   SYMBOL
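replace() also accepts two lists of equal length, in which case values are paired up by position. The following is a small sketch on fixed data mirroring the example table; the replacement strings "A" and "B" and the name df_pairs are placeholders chosen for illustration.

```python
import pandas as pd

# Fixed data mirroring the example table
df = pd.DataFrame({
    "cluster1": ["name1", "name4", "name3", "name2"],
    "cluster2": ["name1", "name3", "name4", "name2"],
    "cluster3": ["name4", "name1", "name3", "name2"],
})

# Lists are matched by position: "name1" -> "A", "name2" -> "B"
df_pairs = df.replace(["name1", "name2"], ["A", "B"])

print(df_pairs)
```

This behaves like passing the dictionary {"name1": "A", "name2": "B"}, so the choice between the two forms is mostly a matter of which is easier to read.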