In this post we will see how to get the column names of a Pandas dataframe as a list. Column names come up constantly in data analysis, for example when selecting, renaming, or reordering columns.
We will first see how to extract the names of columns from a dataframe, using the dataframe's columns attribute. Pandas returns the column names as a Pandas Index object, the basic object for storing axis labels. However, having the column names as a plain Python list is useful in many situations.
Let us first load Pandas.
# load pandas
import pandas as pd
We will use the college tuition data from the tidytuesday project to illustrate extracting column names as a list. Let us load the dataset directly from the tidytuesday project’s GitHub page.
data_url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-10/tuition_cost.csv"
df = pd.read_csv(data_url)
df.iloc[0:3, 0:5]

                                   name    state state_code     type degree_length
0                Aaniiih Nakoda College  Montana         MT   Public        2 Year
1          Abilene Christian University    Texas         TX  Private        4 Year
2  Abraham Baldwin Agricultural College  Georgia         GA   Public        2 Year
Pandas Dataframe column names to Python List
We can get the column names of a Pandas dataframe using its “columns” attribute. Note that it is an attribute, not a method, so it is accessed without parentheses.
# Extract Column Names of a Pandas Dataframe
df.columns
The columns attribute returns the names as a Pandas Index object.
Index(['name', 'state', 'state_code', 'type', 'degree_length', 'room_and_board', 'in_state_tuition', 'in_state_total', 'out_of_state_tuition', 'out_of_state_total'], dtype='object')
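To make the difference between an Index and a list concrete, here is a minimal sketch. It uses a small made-up dataframe (toy_df is a hypothetical stand-in, not the tuition data) so that it runs without a network connection. It shows that an Index supports lookups but cannot be edited in place, which is one reason a plain list can be handy.

```python
import pandas as pd

# Hypothetical toy dataframe standing in for the tuition data
toy_df = pd.DataFrame({"name": ["A", "B"], "state": ["MT", "TX"]})

cols = toy_df.columns       # a pandas Index holding the column labels
print("state" in cols)      # membership tests work on the labels
print(cols[0])              # positional access works too

# An Index is immutable, so assigning to one of its items raises TypeError
try:
    cols[0] = "college"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

# A list copy of the labels can be edited freely;
# the dataframe itself is not affected by this edit
col_list = cols.tolist()
col_list[0] = "college"
print(col_list)
```

Editing the list does not rename anything in the dataframe; it is only a copy of the labels.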
We can convert the Pandas Index object to a list using the tolist() method.
# Extract Column Names as List in Pandas Dataframe
df.columns.tolist()
And now we have the dataframe’s column names as a list, printed below.
['name', 'state', 'state_code', 'type', 'degree_length', 'room_and_board', 'in_state_tuition', 'in_state_total', 'out_of_state_tuition', 'out_of_state_total']
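Once the names are in a list, ordinary Python tools apply to them. The sketch below is one possible use, again on a hypothetical toy dataframe whose column names merely mimic the tuition data: it picks out the columns whose names end in "_tuition" with a list comprehension and uses the result to subset the dataframe.

```python
import pandas as pd

# Hypothetical toy dataframe; names mimic the tuition data, values are made up
toy_df = pd.DataFrame(
    {
        "name": ["College A", "College B"],
        "in_state_tuition": [1000, 2000],
        "out_of_state_tuition": [3000, 4000],
    }
)

# Column names as a plain Python list
column_names = toy_df.columns.tolist()

# Keep only the names ending in "_tuition"
tuition_cols = [c for c in column_names if c.endswith("_tuition")]

# Use the filtered list to select those columns
subset = toy_df[tuition_cols]
print(tuition_cols)
```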
Another way to get the column names of a Pandas dataframe as a list is to first convert the Pandas Index object to a NumPy array using the “values” attribute, and then convert that array to a list as shown below.
df.columns.values.tolist()
And we would get the Pandas column names as a list.
['name', 'state', 'state_code', 'type', 'degree_length', 'room_and_board', 'in_state_tuition', 'in_state_total', 'out_of_state_tuition', 'out_of_state_total']
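There are a few other spellings that give the same result. As a rough comparison, and assuming a reasonably recent Pandas version (to_numpy() is not available in very old releases), the sketch below builds the list five ways on a hypothetical toy dataframe and checks that they agree. Calling list() on the dataframe itself works because iterating over a dataframe yields its column labels.

```python
import pandas as pd

# Hypothetical toy dataframe used only for illustration
toy_df = pd.DataFrame({"name": ["A"], "state": ["MT"], "type": ["Public"]})

via_tolist = toy_df.columns.tolist()              # Index -> list
via_values = toy_df.columns.values.tolist()       # Index -> NumPy array -> list
via_list_index = list(toy_df.columns)             # built-in list() on the Index
via_list_df = list(toy_df)                        # iterating a dataframe yields column labels
via_to_numpy = toy_df.columns.to_numpy().tolist() # explicit NumPy conversion

# All five approaches produce the same list of names
all_equal = (
    via_tolist == via_values == via_list_index == via_list_df == via_to_numpy
)
print(all_equal)
```

Which one to use is mostly a matter of taste; df.columns.tolist() is arguably the most explicit about what is being converted.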
This post is part of the series on Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.