When working with a large pandas dataframe with many columns, one often wants to drop one or more of those columns.
One typically drops columns that are not needed for further analysis. The pandas drop function allows you to remove one or more columns from a dataframe.
Let us see some examples of dropping columns from a real-world data set. First, let us load pandas and read the gapminder data from a URL.
import pandas as pd
gapminder_url = 'https://bit.ly/2cLzoxH'
gapminder = pd.read_csv(gapminder_url)
gapminder.head()
Let us filter the dataframe to make it smaller, so that the examples of using the drop function are easier to follow. After filtering, we have a dataframe with just four rows and six columns.
gapminder_ocean = gapminder[(gapminder.year > 2000) & (gapminder.continent == 'Oceania')]
gapminder_ocean.shape
gapminder_ocean
How To Drop a Single Column from a Dataframe?
To drop a single column from a pandas dataframe, we pass the name of the column to the drop function. Here we pass it as a list containing just one element, ‘pop’. The drop function can remove either rows or columns, so to specify that we want to drop a column, we also pass axis=1.
# pandas drop a column with drop function
gapminder_ocean.drop(['pop'], axis=1)
The resulting dataframe has just five columns instead of six; the pop column has been removed. Note that drop returns a new dataframe and leaves gapminder_ocean itself unchanged.
          country  year continent  lifeExp    gdpPercap
70      Australia  2002   Oceania   80.370  30687.75473
71      Australia  2007   Oceania   81.235  34435.36744
1102  New Zealand  2002   Oceania   79.110  23189.80135
1103  New Zealand  2007   Oceania   80.204  25185.00911
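As a minimal sketch of an alternative spelling, the drop function also accepts a columns keyword in place of axis=1, and a single column name can be given without wrapping it in a list. The small dataframe below, named sample purely for illustration, stands in for gapminder_ocean so that the snippet runs without a download; its pop values are placeholders, not real population counts.

```python
import pandas as pd

# Hypothetical stand-in for gapminder_ocean (same column names and order).
sample = pd.DataFrame({
    "country": ["Australia", "Australia", "New Zealand", "New Zealand"],
    "year": [2002, 2007, 2002, 2007],
    "pop": [1, 2, 3, 4],  # placeholder values, not real population counts
    "continent": ["Oceania"] * 4,
    "lifeExp": [80.370, 81.235, 79.110, 80.204],
    "gdpPercap": [30687.75473, 34435.36744, 23189.80135, 25185.00911],
})

# columns= replaces the axis=1 argument; a single label needs no list.
without_pop = sample.drop(columns="pop")

print(without_pop.shape)  # (4, 5)
print(sample.shape)       # (4, 6): the original still has all six columns
```

Whether to use axis=1 or columns= is largely a matter of taste; the keyword form simply makes the intent visible at the call site.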
How To Drop Multiple Columns from a Dataframe?
The pandas drop function can be used to drop multiple columns as well. To remove multiple columns, we simply pass the names of all the columns we want to drop as a list. Here is an example of dropping three columns from the gapminder dataframe.
# pandas drop columns using list of column names
gapminder_ocean.drop(['pop', 'gdpPercap', 'continent'], axis=1)
Note that the resulting dataframe now contains just three columns instead of six.
          country  year  lifeExp
70      Australia  2002   80.370
71      Australia  2007   81.235
1102  New Zealand  2002   79.110
1103  New Zealand  2007   80.204
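If one of the listed names does not exist in the dataframe, drop raises a KeyError by default. The following is a minimal sketch of one way around that, assuming a small stand-in dataframe (sample, with placeholder pop and gdpPercap values) and a deliberately missing name (not_a_column), both made up for illustration. Passing errors='ignore' makes drop skip names that are absent.

```python
import pandas as pd

# Hypothetical stand-in dataframe with the same column names as gapminder.
sample = pd.DataFrame({
    "country": ["Australia", "New Zealand"],
    "year": [2007, 2007],
    "pop": [1, 2],  # placeholder values, not real population counts
    "continent": ["Oceania", "Oceania"],
    "lifeExp": [81.235, 80.204],
    "gdpPercap": [1.0, 2.0],  # placeholder values
})

# "not_a_column" is intentionally absent from the dataframe.
to_drop = ["pop", "gdpPercap", "continent", "not_a_column"]

# errors="ignore" drops the names that exist and silently skips the rest.
trimmed = sample.drop(columns=to_drop, errors="ignore")

print(list(trimmed.columns))  # ['country', 'year', 'lifeExp']
```

This can be convenient when the same list of names is applied to several dataframes, though it also hides typos in column names, so the default behaviour is often the safer choice.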