How to Convert a Column to Datetime type with Pandas

Pandas in Python has numerous functionalities to deal with time series data. One of the simplest tasks in data analysis is to convert date variable that is stored as string type or common object type in in Pandas dataframe to a datetime type variable.

In this post we will see two ways to convert a Pandas column to a datetime type using Pandas.

First, let us load Pandas.

import pandas as pd

Here, we will use sample data and load it directly from cmdlinetips.com‘s github page.

path2data = "https://raw.githubusercontent.com/cmdlinetips/data/master/sample_data_to_convert_column_to_datetime_pandas.csv"
#path2data="https://bit.ly/2ZXQ7s9"
df = pd.read_csv(path2data)

We can see that the first column looks like date variable with YY-MM-DD format.

df.head()

date	precipitation	temp_max	temp_min	wind	weather
0	2012-01-01	0.0	12.8	5.0	4.7	drizzle
1	2012-01-02	10.9	10.6	2.8	4.5	rain
2	2012-01-03	0.8	11.7	7.2	2.3	rain
3	2012-01-04	20.3	12.2	5.6	4.7	rain
4	2012-01-05	1.3	8.9	2.8	6.1	rain

However, when we check the types of the columns using Pandas’ info() function, we can see that the date column is of type “object”. And we would like to change the column to datetime type.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   date           100 non-null    object   <====
 1   precipitation  100 non-null    float64
 2   temp_max       100 non-null    float64
 3   temp_min       100 non-null    float64
 4   wind           100 non-null    float64
 5   weather        100 non-null    object 
dtypes: float64(4), object(2)
memory usage: 4.8+ KB

Convert a Column to datetime type while loading the file with read_csv()

One of the ways to convert one or more columns in a data frame, is to specify the variable or column name to be loaded as datetime variable while loading the file using Pandas’ read_csv().

Pandas’ read_csv() function has “parse_dates” argument and it takes the list of column names that we want to load as datetime variable. In this example, we have just column of interest and specify its name.

df = pd.read_csv("sample_data_to_convert_column_to_datetime_pandas.csv",
                 parse_dates=['date'])

We can use info() function and see that now our column is of datetime type.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   date           100 non-null    datetime64[ns] <===
 1   precipitation  100 non-null    float64       
 2   temp_max       100 non-null    float64       
 3   temp_min       100 non-null    float64       
 4   wind           100 non-null    float64       
 5   weather        100 non-null    object        
dtypes: datetime64[ns](1), float64(4), object(1)
memory usage: 4.8+ KB

One of the biggest advantages of specifying the column to be datetime variable while loading the file is that we can convert multiple columns if needed.

Convert a Column to datetime with Pandas’ to_datetime()

Another option to convert a column to date type is converting it after loading the data as dataframe. Pandas’ to_datetime() function can take a Series object and convert to datetime format.

In this example, we take the column of interest and give it as input to to_datetime() function and then reassign to the same column variable in the dataframe.

df['date']=pd.to_datetime(df['date'],
                                   infer_datetime_format=True)
df.info()

We can see that now our column is of datetime type.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype         
---  ------         --------------  -----         
 0   date           100 non-null    datetime64[ns]. <====
 1   precipitation  100 non-null    float64       
 2   temp_max       100 non-null    float64       
 3   temp_min       100 non-null    float64       
 4   wind           100 non-null    float64       
 5   weather        100 non-null    object        
dtypes: datetime64[ns](1), float64(4), object(1)
memory usage: 4.8+ KB