In this post, we will see how to get data types of variables or columns in a Pandas dataframe.
import pandas as pd
pd.__version__
'1.0.0'
Let us use the gapminder data from cmdlinetips.com’s GitHub page. We read the file directly from the web using Pandas’ read_csv() function.
data_url = "https://raw.githubusercontent.com/cmdlinetips/data/master/gapminder-FiveYearData.csv"
df = pd.read_csv(data_url)
We can see that the gapminder dataframe contains different types of variables.
df.head()
       country  year         pop  continent  lifeExp   gdpPercap
0  Afghanistan  1952   8425333.0       Asia   28.801  779.445314
1  Afghanistan  1957   9240934.0       Asia   30.332  820.853030
2  Afghanistan  1962  10267083.0       Asia   31.997  853.100710
3  Afghanistan  1967  11537966.0       Asia   34.020  836.197138
4  Afghanistan  1972  13079460.0       Asia   36.088  739.981106
We can find the data type of each column using the dtypes attribute of a Pandas dataframe. Note that dtypes is an attribute, not a method, so it is used without parentheses.
df.dtypes
We see that country and continent are of the generic type “object”, year is of type int64, and pop, lifeExp, and gdpPercap are of type float64.
country       object
year           int64
pop          float64
continent     object
lifeExp      float64
gdpPercap    float64
dtype: object
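Beyond listing every column's type at once, it can be handy to look up the type of a single column or to pick out columns by type. The sketch below is one possible way to do that. It uses a small hand-built dataframe named sample (a hypothetical stand-in, not the gapminder file itself) so that it runs without a network connection. The exact label shown for text columns may differ between Pandas versions, so the example only groups columns into numeric and non-numeric.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the gapminder data,
# built by hand so the example does not need to download anything.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan"],
    "year": [1952, 1957],
    "pop": [8425333.0, 9240934.0],
    "continent": ["Asia", "Asia"],
    "lifeExp": [28.801, 30.332],
})

# Data type of one column: index the column, then read its dtype attribute.
year_dtype = sample["year"].dtype

# Names of the columns holding numeric data (integers and floats).
numeric_cols = sample.select_dtypes(include="number").columns.tolist()

# Names of the remaining, non-numeric columns.
other_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(year_dtype)
print(numeric_cols)
print(other_cols)
```

Here the column is selected with sample["pop"]-style bracket indexing rather than attribute access, because a name such as pop would otherwise clash with the dataframe's own pop() method.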
This post is part of the series on Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.