How to Split a Single Column in Pandas into Multiple Columns

Often you have a column in your pandas data frame that you want to split into two columns. For example, one column may hold a full name that you want to split into a first name and a last name (as in the figure shown below).

Figure: How to split a text column in pandas.

We can do that easily with pandas’ string methods. Let us first create a simple data frame with pd.DataFrame.

# import Pandas as pd
import pandas as pd
# create a new data frame
df = pd.DataFrame({'Name': ['Steve Smith', 'Joe Nadal',
                            'Roger Federer'],
                   'Age': [32, 34, 36]})
df

Splitting the Original DataFrame’s Single Column into Multiple Columns

We can use pandas’ str.split method to split the column of interest. Here we want to split the column “Name”: we select it with df.Name, chain .str.split onto it, and pass the expand=True option.

With expand=True, str.split() returns a data frame with one column per piece. Without it, we get a pandas Series in which each element is a list of the pieces.

df.Name.str.split(expand=True)
	0	1
0	Steve	Smith
1	Joe	Nadal
2	Roger	Federer
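
To make the effect of expand=True concrete, here is a minimal sketch that compares the two return types. The variable names (names, as_lists, as_frame) are only illustrative and are not part of the example above.

```python
import pandas as pd

# Same names as in the data frame above, held in a standalone Series
names = pd.Series(['Steve Smith', 'Joe Nadal', 'Roger Federer'], name='Name')

# Without expand: a Series whose elements are lists of the pieces
as_lists = names.str.split()

# With expand=True: a DataFrame with one column per piece (labeled 0, 1, ...)
as_frame = names.str.split(expand=True)

print(type(as_lists).__name__)  # Series
print(type(as_frame).__name__)  # DataFrame
print(as_frame.shape)           # (3, 2)
```

The expanded form is the one that can be assigned directly to new columns of a data frame.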

If we want the results in the original data frame under specific names, we can assign them as new columns, as shown below.

df[['First', 'Last']] = df.Name.str.split(" ", expand=True)
df

The data frame now has two new columns, First and Last, in addition to the original ones.

	Age	Name	First	Last
0	32	Steve Smith	Steve	Smith
1	34	Joe Nadal	Joe	Nadal
2	36	Roger Federer	Roger	Federer
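
Assigning to exactly two columns works here because every name has exactly two words. As a hedged sketch of one way to handle longer names, the n parameter of str.split limits the number of splits. The data frame people and the three-word name in it are hypothetical and only serve as illustration.

```python
import pandas as pd

# Hypothetical data: the second name has three words
people = pd.DataFrame({'Name': ['Steve Smith', 'Mary Ann Lee']})

# n=1 splits only at the first space, so each row yields at most two pieces
people[['First', 'Last']] = people['Name'].str.split(' ', n=1, expand=True)

print(people)
```

With n=1, everything after the first space ends up in the Last column, so 'Mary Ann Lee' becomes 'Mary' and 'Ann Lee'. Whether that is the right way to divide a name depends on the data.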

Note that our first call to str.split did not specify a delimiter. By default, str.split splits on whitespace, and we can specify a different delimiter by passing it as the first argument. For example, if the text in our column were separated by an underscore,

df = pd.DataFrame({'Name': ['Steve_Smith', 'Joe_Nadal', 
                           'Roger_Federer'],
                 'Age':[32,34,36]})
df
	Age	Name
0	32	Steve_Smith
1	34	Joe_Nadal
2	36	Roger_Federer

we can use the underscore as our delimiter to split the column into two columns.

df[['First', 'Last']] = df.Name.str.split("_", expand=True)
df
	Age	Name	First	Last
0	32	Steve_Smith	Steve	Smith
1	34	Joe_Nadal	Joe	Nadal
2	36	Roger_Federer	Roger	Federer
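
Calling str.split with no delimiter and calling it with an explicit " " can give different results when the text contains consecutive spaces. The following is a minimal sketch, assuming a hypothetical value with two spaces between the words. The variable names are illustrative.

```python
import pandas as pd

# Hypothetical value with two spaces between first and last name
s = pd.Series(['Steve  Smith'])

# No delimiter: a run of whitespace counts as one separator
default_split = s.str.split()

# Explicit single space: every space is a separator,
# so the double space produces an empty piece in the middle
space_split = s.str.split(' ')

print(len(default_split.iloc[0]))  # 2
print(len(space_split.iloc[0]))    # 3
```

For columns that may contain irregular spacing, leaving the delimiter unspecified is usually the safer choice.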