How To Reset Index in Pandas Dataframe?


In this post, we will learn how to reset the index of a Pandas dataframe so that it starts from zero. We will use the pandas reset_index() function to do this.

Often you start with a big dataframe in Pandas, and after manipulating and filtering it you end up with a much smaller dataframe.

When you look at the smaller dataframe, it may still carry the row index of the original dataframe. If the original row index was numeric, the smaller dataframe’s index will no longer run continuously from 0 to one less than the number of rows. You might want to reset the index of the small dataframe so that it starts from zero again, and the pandas reset_index() function is here to help us.
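
To make the problem concrete before loading any real data, here is a minimal sketch. The `scores` dataframe and its column names are hypothetical toy data made up for illustration, not part of the gapminder example used in the rest of this post.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
scores = pd.DataFrame({"name": ["a", "b", "c", "d", "e"],
                       "score": [10, 55, 30, 80, 65]})

# Filtering keeps the original row labels of the surviving rows.
high = scores[scores["score"] > 50]

print(high.index.tolist())      # labels carried over from the original frame
print(list(range(len(high))))   # what a fresh 0-based index would look like
```

The two printed lists differ, which is exactly the gap that reset_index() closes.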

How to reset index in Pandas dataframe?

Let us load Pandas.

import pandas as pd

Let us use the gapminder data from the Software Carpentry website and load it as a Pandas dataframe. The gapminder dataframe has over 1700 rows of observations for countries around the world and 6 columns.

gapminder_url='https://bit.ly/2cLzoxH'
gapminder = pd.read_csv(gapminder_url)
gapminder.head()

Let us do some dataframe manipulation to get a smaller dataframe. Let us first drop a few columns just for ease of visualizing the output dataframe.

gapminder = gapminder.drop(['pop','gdpPercap'], axis=1)
print(gapminder.shape)
(1704, 4)

Now our dataframe has just 4 columns and all the rows. Let us filter it and select the rows for countries in the Oceania continent and for years after 2000.

gapminder_ocean = gapminder[(gapminder.year > 2000) & 
                            (gapminder.continent == 'Oceania')]
gapminder_ocean.shape
(4, 4)

After filtering, we have a dataframe with just 4 rows corresponding to two countries in the Oceania continent. Also note that the row index of the dataframe is 70, 71, 1102, and 1103. These are the original index labels of these rows.

print(gapminder_ocean)
          country  year continent  lifeExp
70      Australia  2002   Oceania   80.370
71      Australia  2007   Oceania   81.235
1102  New Zealand  2002   Oceania   79.110
1103  New Zealand  2007   Oceania   80.204

pandas reset_index() to reset row index to zero

We can reset the row index in pandas with reset_index() to make the index start from 0. Calling reset_index() on the dataframe gives:

gapminder_ocean.reset_index()
	index	country	year	continent	lifeExp
0	70	Australia	2002	Oceania	80.370
1	71	Australia	2007	Oceania	81.235
2	1102	New Zealand	2002	Oceania	79.110
3	1103	New Zealand	2007	Oceania	80.204

Now the row index starts from 0. Also note that pandas reset_index() keeps the original row index as a new column named index.
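
If you want to check this without downloading the data, the sketch below rebuilds the four filtered rows by hand, using the values from the printed output above. It also shows that reset_index() returns a new dataframe and leaves the one it was called on untouched. The variable name `result` is only for illustration.

```python
import pandas as pd

# Rebuild the four filtered rows by hand (values copied from the output above),
# so the example runs without downloading the data.
gapminder_ocean = pd.DataFrame(
    {
        "country": ["Australia", "Australia", "New Zealand", "New Zealand"],
        "year": [2002, 2007, 2002, 2007],
        "continent": ["Oceania"] * 4,
        "lifeExp": [80.370, 81.235, 79.110, 80.204],
    },
    index=[70, 71, 1102, 1103],
)

result = gapminder_ocean.reset_index()

print(result.columns.tolist())          # old labels now live in an "index" column
print(result.index.tolist())            # new labels count up from 0
print(gapminder_ocean.index.tolist())   # the original frame is unchanged
```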

Often you don’t need the extra column with the original row index. We can tell pandas not to keep the original index with the argument drop=True.

gapminder_ocean.reset_index(drop=True)
	country	year	continent	lifeExp
0	Australia	2002	Oceania	80.370
1	Australia	2007	Oceania	81.235
2	New Zealand	2002	Oceania	79.110
3	New Zealand	2007	Oceania	80.204
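
A quick way to compare the two calls side by side is sketched below. It uses a hand-built, trimmed-down copy of the filtered rows (only two columns, for brevity), and the names `kept` and `dropped` are just illustrative.

```python
import pandas as pd

# Trimmed-down, hand-built copy of the filtered rows, for illustration only.
gapminder_ocean = pd.DataFrame(
    {
        "country": ["Australia", "Australia", "New Zealand", "New Zealand"],
        "lifeExp": [80.370, 81.235, 79.110, 80.204],
    },
    index=[70, 71, 1102, 1103],
)

kept = gapminder_ocean.reset_index()              # default: drop=False
dropped = gapminder_ocean.reset_index(drop=True)  # discard the old labels

print("index" in kept.columns)      # True: old labels saved as a column
print("index" in dropped.columns)   # False: old labels are gone
print(dropped.index.tolist())       # fresh labels counting up from 0
```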

reset_index() to reset pandas index to zero in-place

If you want to reset the index to zero in place, you can also add the inplace=True argument.

gapminder_ocean.reset_index(drop=True, inplace=True)
gapminder_ocean
	country	year	continent	lifeExp
0	Australia	2002	Oceania	80.370
1	Australia	2007	Oceania	81.235
2	New Zealand	2002	Oceania	79.110
3	New Zealand	2007	Oceania	80.204
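
One detail worth knowing: with inplace=True the call modifies the dataframe and returns None, so its result should not be assigned back. The sketch below illustrates this on a small hand-built frame (the names `df`, `other`, and `returned` are hypothetical) and shows the equivalent reassignment style, which some people find easier to read.

```python
import pandas as pd

# Small hand-built frame with non-contiguous labels, for illustration only.
df = pd.DataFrame(
    {"country": ["Australia", "New Zealand"], "lifeExp": [81.235, 80.204]},
    index=[71, 1103],
)

# In-place: the frame itself is changed and the call returns None.
returned = df.reset_index(drop=True, inplace=True)
print(returned)            # None
print(df.index.tolist())   # [0, 1]

# Equivalent without inplace: assign the returned dataframe back to the name.
other = pd.DataFrame(
    {"country": ["Australia", "New Zealand"], "lifeExp": [81.235, 80.204]},
    index=[71, 1103],
)
other = other.reset_index(drop=True)
print(other.index.tolist())  # [0, 1]
```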