Sometimes while working with a Pandas dataframe, you might like to subset it by keeping some columns and dropping others. In this post, we will see examples of dropping multiple columns from a Pandas dataframe. There are a few ways to do this. We will mainly use the Pandas drop() function to drop multiple columns and get a smaller dataframe.
Let us load Pandas. We also import numpy to generate data for our toy dataframe.
# load pandas
import pandas as pd
# load numpy
import numpy as np
# set seed for reproducing the data
np.random.seed(42)
We create a toy Pandas dataframe using NumPy’s random module with index and column names.
df = pd.DataFrame(np.random.randint(20, size=(8, 4)),
                  index=list('ijklmnop'),
                  columns=list('ABCD'))
Our dataframe has 8 rows and 4 columns named A, B, C, and D. Here are its first five rows.
df.head()
    A   B   C   D
i   6  19  14  10
j   7   6  18  10
k  10   3   7   2
l   1  11   5   1
m   0  11  11  16
Drop Multiple Columns using Pandas drop() with axis=1
We can use the Pandas drop() function to drop multiple columns from a dataframe. drop() is versatile: it can drop rows of a dataframe as well.
To drop columns with drop(), we provide the names of the columns to be dropped as a list. In addition, we specify the axis=1 argument to tell drop() that the labels refer to columns. With axis=0, which is the default, drop() removes rows instead.
df.drop(['A', 'B'], axis=1)
    C   D
i  14  10
j  18  10
k   7   2
l   5   1
m  11  16
n  14  14
o  19   2
p   6   8
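To make the role of the axis argument concrete, here is a minimal sketch that contrasts dropping columns with dropping rows on the same toy dataframe. The variable names (no_ab, same_no_ab, no_ij) are made up for illustration. It also shows that drop() returns a new dataframe and leaves df unchanged.

```python
import numpy as np
import pandas as pd

# rebuild the toy dataframe from this post
np.random.seed(42)
df = pd.DataFrame(np.random.randint(20, size=(8, 4)),
                  index=list('ijklmnop'),
                  columns=list('ABCD'))

# axis=1 (or its string alias axis='columns') treats the labels as column names
no_ab = df.drop(['A', 'B'], axis=1)
same_no_ab = df.drop(['A', 'B'], axis='columns')

# axis=0 (the default) treats the labels as row index values
no_ij = df.drop(['i', 'j'], axis=0)

print(no_ab.columns.tolist())  # ['C', 'D']
print(no_ij.index.tolist())    # ['k', 'l', 'm', 'n', 'o', 'p']
print(df.shape)                # (8, 4), the original is untouched
```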
Drop Multiple Columns using Pandas drop() with columns
We can also use the Pandas drop() function without the axis=1 argument. Instead, we pass the list of column names to be dropped through the columns argument.
For example, to drop columns A and B, we specify columns=['A', 'B'] as the argument to drop(). This drops the two columns and gives the same result as before.
df.drop(columns=['A', 'B'])
    C   D
i  14  10
j  18  10
k   7   2
l   5   1
m  11  16
n  14  14
o  19   2
p   6   8
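As a quick check, the sketch below confirms that the two spellings produce identical dataframes. It also illustrates one detail not covered above: by default drop() raises a KeyError when a listed column does not exist, and passing errors='ignore' skips missing labels. The column name 'Z' is a made-up label used only to trigger that case.

```python
import numpy as np
import pandas as pd

# rebuild the toy dataframe from this post
np.random.seed(42)
df = pd.DataFrame(np.random.randint(20, size=(8, 4)),
                  index=list('ijklmnop'),
                  columns=list('ABCD'))

# both spellings drop the same columns
by_axis = df.drop(['A', 'B'], axis=1)
by_columns = df.drop(columns=['A', 'B'])
print(by_axis.equals(by_columns))  # True

# 'Z' is a hypothetical column that does not exist in df
try:
    df.drop(columns=['A', 'Z'])
except KeyError as err:
    print('KeyError:', err)

# errors='ignore' drops the labels that exist and skips the rest
safe = df.drop(columns=['A', 'Z'], errors='ignore')
print(safe.columns.tolist())  # ['B', 'C', 'D']
```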
How To Drop Multiple Columns inplace in Pandas?
We can also use the Pandas drop() function to drop multiple columns in place, which modifies the original dataframe instead of returning a new one. To do this, we specify inplace=True. With this argument drop() returns None, so we print df afterwards to see the result.
df.drop(columns=['A', 'B'], inplace=True)
df
    C   D
i  14  10
j  18  10
k   7   2
l   5   1
m  11  16
n  14  14
o  19   2
p   6   8
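The sketch below compares the in-place call with plain reassignment. It uses separate, hypothetical variable names (original, trimmed, in_place) so that it does not disturb the df used in this post. One thing to watch for: writing df = df.drop(..., inplace=True) would set df to None, because the in-place call returns nothing.

```python
import numpy as np
import pandas as pd

# a fresh copy of the toy dataframe under a different name
np.random.seed(42)
original = pd.DataFrame(np.random.randint(20, size=(8, 4)),
                        index=list('ijklmnop'),
                        columns=list('ABCD'))

# option 1: keep the returned dataframe; original is unchanged
trimmed = original.drop(columns=['A', 'B'])

# option 2: modify a dataframe in place; the call itself returns None
in_place = original.copy()
result = in_place.drop(columns=['A', 'B'], inplace=True)

print(result)                     # None
print(in_place.columns.tolist())  # ['C', 'D']
print(original.columns.tolist())  # ['A', 'B', 'C', 'D']
```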
How To Drop Multiple Columns by selecting columns?
Another way to drop certain columns is to select the columns we want to keep, by passing a list of column names inside the indexing brackets, as in df[[...]]. For example, to drop the columns A and B, we select the remaining columns, in this case C and D. We get the same result as before.
df[['C','D']]
    C   D
i  14  10
j  18  10
k   7   2
l   5   1
m  11  16
n  14  14
o  19   2
p   6   8
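Typing out the columns to keep gets tedious for wide dataframes. As a rough sketch, the keep-list can be built from the list of columns to drop, either with a list comprehension or with a boolean mask from columns.isin(). The names to_drop, keep, selected, and masked are illustrative only.

```python
import numpy as np
import pandas as pd

# rebuild the toy dataframe from this post
np.random.seed(42)
df = pd.DataFrame(np.random.randint(20, size=(8, 4)),
                  index=list('ijklmnop'),
                  columns=list('ABCD'))

to_drop = ['A', 'B']

# build the list of columns to keep, preserving their original order
keep = [col for col in df.columns if col not in to_drop]
selected = df[keep]

# the same selection with a boolean mask over the column labels
masked = df.loc[:, ~df.columns.isin(to_drop)]

print(keep)                     # ['C', 'D']
print(selected.equals(masked))  # True
```

Both selections match the output of df.drop(columns=to_drop).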
This blog post is part of the series on Byte Sized Pandas: Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.