Adding a new column to a dataframe is a common task in data analysis, and it comes in a variety of forms. Earlier we saw how to add a column using an existing column in two ways. In this post we will learn how to add a new column using a dictionary in Pandas.
The Pandas library in Python has a really cool function called map that makes it much easier to manipulate your pandas data frame. Pandas’ map function, called on a column of the data frame, lets you add a new column with values from a dictionary if that column's values match the keys in the dictionary.
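To make the matching behaviour concrete, here is a minimal sketch on made-up data. The fruit column and the color_by_fruit dictionary are hypothetical and are not part of the gapminder example used in this post. It shows what happens when a value in the column has no matching key in the dictionary.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({"fruit": ["apple", "banana", "cherry"]})

# Hypothetical lookup dictionary; note there is no key for "cherry"
color_by_fruit = {"banana": "yellow", "apple": "red"}

# map looks up every value of the "fruit" column in the dictionary
df["color"] = df["fruit"].map(color_by_fruit)
print(df)
```

The rows for "apple" and "banana" get their colors from the dictionary, while the row for "cherry" gets NaN because the dictionary has no such key. So it is worth checking the new column for missing values when the keys and the column may not match exactly.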
Adding a New Column Using keys from Dictionary matching a column in pandas
Let us say you have a pandas data frame created from two lists as columns: continent and mean_lifeExp.
# derived from gapminder data set
# http://bit.ly/2cLzoxH
print(gapminder_df)
  continent  mean_lifeExp
0      Asia         48.86
1    Europe         64.65
2    Africa         60.06
3  Americas         71.90
4   Oceania         74.32
Let us say we also have a dictionary whose keys are the continents in the above data frame and whose values are the mean population over the years.
print(pop_dict)
{'Europe': 24504794.99, 'Oceania': 8874672.33, 'Africa': 77038721.97, 'Asia': 9916003.14, 'Americas': 17169764.73}
Note that the items in the dictionary are not in the same order as the continent column of the data frame. That does not matter, because the values are looked up by key, not by position.
Map function to Add a New Column to pandas with Dictionary
Let us say we want to add a new column ‘pop’ to the pandas data frame with values from the dictionary. Note that the keys of the dictionary are continents, and so are the values of the column “continent” in the data frame. Pandas’ map function looks up each value of that column in the dictionary and returns the matching value, which we assign to the new column.
gapminder_df['pop'] = gapminder_df['continent'].map(pop_dict)
Voila!! Here is the updated data frame with a new column from the dictionary.
  continent  mean_lifeExp          pop
0      Asia         48.86   9916003.14
1    Europe         64.65  24504794.99
2    Africa         60.06  77038721.97
3  Americas         71.90  17169764.73
4   Oceania         74.32   8874672.33
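If you want to try this yourself, here is a minimal sketch that rebuilds the data frame and the dictionary by hand from the values printed above. Typing the values in directly is an assumption made only to keep the example self-contained. The post itself derives them from the gapminder data set.

```python
import pandas as pd

# Data frame rebuilt by hand from the printed output (assumption: the
# original was computed from the gapminder data, not typed in)
gapminder_df = pd.DataFrame({
    "continent": ["Asia", "Europe", "Africa", "Americas", "Oceania"],
    "mean_lifeExp": [48.86, 64.65, 60.06, 71.90, 74.32],
})

# Dictionary copied from the printed pop_dict; its order differs
# from the order of the continent column
pop_dict = {
    "Europe": 24504794.99,
    "Oceania": 8874672.33,
    "Africa": 77038721.97,
    "Asia": 9916003.14,
    "Americas": 17169764.73,
}

# Look up each continent in the dictionary and store the result
gapminder_df["pop"] = gapminder_df["continent"].map(pop_dict)
print(gapminder_df)
```

Each row gets the population stored under its own continent key, so the different ordering of the dictionary has no effect on the result.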
Another common use of a dictionary in Pandas is to code an existing column using the dictionary and store the result as a new column. For example, let us consider the gapminder data frame.
import pandas as pd

data_url = 'http://bit.ly/2cLzoxH'
# read data from url as pandas dataframe
gapminder = pd.read_csv(data_url)
# keep only the columns we need
gapminder = gapminder[['continent', 'gdpPercap', 'lifeExp']]
print(gapminder.head(3))