Pandas makes it easy to delete rows of a dataframe. There are multiple ways to delete or select rows from a dataframe. In this post, we will see how to use the drop() function to drop rows in Pandas by index name or index location.
The Pandas drop() function can also be used to drop or delete columns from a Pandas dataframe. Therefore, to drop rows, we specify the row index labels to drop together with the axis=0 or axis="index" argument. Here, axis=0 or axis="index" tells drop() that we want to drop rows rather than columns; axis=0 is also the default.
Let us load Pandas and Seaborn, and use Seaborn's penguins data set to illustrate how to delete one or more rows from a dataframe.
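To make the role of the axis argument concrete, here is a minimal sketch on a small hand-built dataframe. The name toy, its column names, and its values are made up for illustration and are not part of the penguins data.

```python
import pandas as pd

# Hypothetical toy dataframe, used only to illustrate the axis argument
toy = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]},
    index=["x", "y", "z"],
)

# axis=0 (or axis="index") treats the label as a row label
rows_dropped = toy.drop("x", axis=0)

# axis=1 (or axis="columns") treats the label as a column label
cols_dropped = toy.drop("a", axis=1)

print(rows_dropped.index.tolist())    # ['y', 'z']
print(cols_dropped.columns.tolist())  # ['b']
```

The same label-based call works in both directions; only the axis decides whether Pandas looks the label up in the index or in the columns.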
import seaborn as sns
import pandas as pd
We will be using just a few rows from the penguins data.
df = sns.load_dataset("penguins").head()
Here is our toy data for learning how to delete rows by index name. Note that the index of the toy dataframe is numeric.
df
   species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
0   Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
1   Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
2   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
3   Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
4   Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
Let us change the index so that it contains text labels instead of sequential numbers.
# assign index names to dataframe
df.index = ["one", "two", "three", "four", "five"]
We can see that the index now contains text labels instead of numbers.
df
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
one     Adelie  Torgersen            39.1           18.7              181.0       3750.0    Male
two     Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
three   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
four    Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
five    Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
How to Drop One Row by Index Name?
To delete a row from a dataframe, we pass its index name to drop() along with the axis=0 argument. In this example, we drop the row named "one".
df.drop("one", axis=0)
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
two     Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
three   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
four    Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
five    Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
Another way to specify that we want to delete a row rather than a column is to use the axis="index" argument instead of axis=0. Again, we drop the row named "one".
df.drop("one", axis="index")
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
two     Adelie  Torgersen            39.5           17.4              186.0       3800.0  Female
three   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
four    Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
five    Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
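One detail worth keeping in mind is that drop() returns a new dataframe and leaves the original untouched unless the result is assigned back. The sketch below shows this on a hand-built stand-in for the toy data (only two of the penguins columns are reproduced, so it runs without downloading the data set). It also uses the index= keyword, which is an alternative spelling to passing the label with axis=0. The names smaller and same are illustrative.

```python
import pandas as pd

# Small stand-in for the toy penguins dataframe (two columns only)
df = pd.DataFrame(
    {
        "species": ["Adelie"] * 5,
        "body_mass_g": [3750.0, 3800.0, 3250.0, float("nan"), 3450.0],
    },
    index=["one", "two", "three", "four", "five"],
)

# drop() returns a new dataframe; df itself still has all five rows
smaller = df.drop("one", axis=0)
print(len(df), len(smaller))  # 5 4

# index= is an alternative to passing the label together with axis=0
same = df.drop(index="one")
print(smaller.equals(same))  # True

# Reassign to keep the result under the original name
df = df.drop("one", axis=0)
print(df.index.tolist())  # ['two', 'three', 'four', 'five']
```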
How to Delete Multiple Rows by Index Names?
To delete multiple rows, we pass the index names as a list to the Pandas drop() function. In this example, we drop the first two rows by specifying their names in a list.
df.drop(["one", "two"], axis="index")
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
three   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
four    Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
five    Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
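When dropping a list of names, it helps to know what happens if one of the names is not in the index. By default drop() raises a KeyError; passing errors="ignore" makes it drop only the labels that exist. Here is a minimal sketch on a hand-built three-row stand-in for the toy data, where the label "ten" is a deliberately missing, made-up name.

```python
import pandas as pd

# Three-row stand-in for the toy dataframe (one column only)
df = pd.DataFrame(
    {"body_mass_g": [3750.0, 3800.0, 3250.0]},
    index=["one", "two", "three"],
)

# Default behaviour: a label that is not in the index raises KeyError
try:
    df.drop(["one", "ten"], axis="index")
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)  # True

# errors="ignore" silently skips labels that are not present
result = df.drop(["one", "ten"], axis="index", errors="ignore")
print(result.index.tolist())  # ['two', 'three']
```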
How to Delete Multiple Rows by Their Locations?
Sometimes, we might want to delete one or more rows by their location instead of their index names. To do so, we can subset df.index by position to look up the names at those locations and pass the result to drop(), as shown here.
df.drop(df.index[[0, 1]])
       species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
three   Adelie  Torgersen            40.3           18.0              195.0       3250.0  Female
four    Adelie  Torgersen             NaN            NaN                NaN          NaN     NaN
five    Adelie  Torgersen            36.7           19.3              193.0       3450.0  Female
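Because df.index[[0, 1]] converts locations back into index names, this approach behaves as expected only when the index names are unique. The sketch below, on a made-up dataframe called dup with a repeated label, shows the difference and one possible purely positional alternative based on a boolean mask. The mask approach is an illustration, not the method used in this post.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with a repeated index label "a"
dup = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", "b", "a", "c"])

# Label-based: position 0 maps to label "a", so both "a" rows are dropped
by_label = dup.drop(dup.index[[0]])
print(by_label["value"].tolist())  # [20, 40]

# Position-based: keep every row except the one at position 0
keep = np.ones(len(dup), dtype=bool)
keep[[0]] = False
by_position = dup[keep]
print(by_position["value"].tolist())  # [20, 30, 40]
```

With a unique index, as in the toy penguins data, both approaches give the same result.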