How to Change Type for One or More Columns in Pandas Dataframe?

Sometimes when you create a data frame, some of the columns may be of mixed type, and you might see a warning like this:

DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.

Pandas issues this warning when it has to guess the type of a column while reading the file and finds values of more than one type in it.

For example, let us say you have a file “weather.tsv” like this:

Day	Temp	Wind
1	96	7
day2	94	8
3	65	25
4	80	10

Here the column “Day” has mixed data types, numbers and strings. When you load a file like this as a data frame using Pandas, you may see the above warning, especially when the file is large.
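To try this without creating a file, here is a minimal sketch that loads the same rows from an in-memory string. The variable name weather_tsv and the use of io.StringIO in place of the file on disk are assumptions made for illustration. A file this small is unlikely to trigger the warning itself, but it shows how the “Day” column ends up as text.

```python
import io
import pandas as pd

# Sample rows copied from the table above; io.StringIO stands in for "weather.tsv".
weather_tsv = "Day\tTemp\tWind\n1\t96\t7\nday2\t94\t8\n3\t65\t25\n4\t80\t10\n"

df = pd.read_csv(io.StringIO(weather_tsv), sep="\t")

# "Day" cannot be parsed as a number because of "day2",
# so every value in that column is kept as text.
print(df["Day"].tolist())
print(df.dtypes)
```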

How To Find Data Types of All Columns?

We can check data types of all the columns in a data frame with “dtypes”.

df.dtypes

For example, after loading a file as a data frame, you will see

Day      object
Temp    float64
Wind      int64
dtype: object
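Once “dtypes” shows that a column is not numeric, it can help to find the values responsible. The sketch below is one possible way to do that, not part of the original walkthrough: it uses pd.to_numeric with errors="coerce" and assumes the sample rows are loaded from an in-memory string. The names weather_tsv, day_as_number and offending are made up for this example.

```python
import io
import pandas as pd

# Same sample rows as the weather.tsv example, loaded from memory.
weather_tsv = "Day\tTemp\tWind\n1\t96\t7\nday2\t94\t8\n3\t65\t25\n4\t80\t10\n"
df = pd.read_csv(io.StringIO(weather_tsv), sep="\t")

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
day_as_number = pd.to_numeric(df["Day"], errors="coerce")

# Rows where the conversion failed hold the non-numeric entries.
offending = df.loc[day_as_number.isna(), "Day"]
print(offending.tolist())
```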

How To Change Data Types of a single Column?

There are a few ways to change the data type of a variable or a column. If you want to change the data type of just one column, you can use “astype”. To change the data type of the column “Day” to str, use “astype” as follows

df.Day = df.Day.astype(str)

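You will see the results as follows. Note that “Day” is still listed as object, because that is how Pandas reports a column of Python strings here; the change is that every value in the column is now a string.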

df.dtypes
Day      object
Temp    float64
Wind      int64
dtype: object
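Since the printed dtype alone does not show what changed, the following sketch checks the type of each value before and after the conversion. It assumes a small, hypothetical data frame built in memory with a truly mixed “Day” column (integers and one string); the variable names types_before and types_after are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose "Day" column really mixes integers and a string.
df = pd.DataFrame({
    "Day": [1, "day2", 3, 4],
    "Temp": [96, 94, 65, 80],
    "Wind": [7, 8, 25, 10],
})

# Collect the Python type of every value in the column before converting.
types_before = set(df["Day"].map(type))

# Convert the single column to strings.
df["Day"] = df["Day"].astype(str)

# After the conversion every value should be a string.
types_after = set(df["Day"].map(type))
print(types_before, types_after)
print(df["Day"].tolist())
```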

How To Change Data Types of One or More Columns?

When you want to change more than one column, a mapping dictionary is more convenient.

Let us say you want to change the data types of multiple columns of your data, and you know ahead of time which columns you would like to change.

You can specify the data types you want while loading the data as a Pandas data frame. For example, if you are reading a file, you can pre-specify the data types for multiple columns with a mapping dictionary that has the column names as keys and the data types you want as values.

Let us use Pandas read_csv to read a file as a data frame and pass it a mapping dictionary for two columns.


import pandas as pd

# Map column names to the data types they should have after loading.
df = pd.read_csv("weather.tsv", sep="\t",
                 dtype={'Day': str, 'Wind': 'int64'})
df.dtypes

You can see the new data types of the data frame

Day      object
Temp    float64
Wind      int64
dtype: object

In general, it is good practice to specify the data types while loading the data frame.
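When the data frame is already loaded, the same mapping-dictionary idea can be applied with “astype”, which also accepts a dictionary of column names and types. The sketch below assumes a small, hypothetical data frame built in memory; the name type_map and the choice to convert “Temp” to float64 are only for illustration.

```python
import pandas as pd

# Hypothetical frame standing in for data that was loaded without a dtype mapping.
df = pd.DataFrame({
    "Day": [1, "day2", 3, 4],
    "Temp": [96, 94, 65, 80],
    "Wind": [7, 8, 25, 10],
})

# Column names as keys, target data types as values.
type_map = {"Day": str, "Temp": "float64"}

# astype returns a new data frame; columns not in the mapping are left unchanged.
df = df.astype(type_map)
print(df.dtypes)
```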