Often one may want to join two text columns into a new column in a data frame. For example, one may want to combine two columns containing the last name and the first name into a single column with the full name.
We can use Pandas’ string manipulation functions to combine two text columns easily.
There are a few ways to combine two columns in Pandas. First we will see an example using the str.cat method.
Let us first create a simple Pandas data frame using Pandas’ DataFrame function.
# import pandas as pd
import pandas as pd
# create a new data frame
df = pd.DataFrame({'Last': ['Smith', 'Nadal', 'Federer'],
                   'First': ['Steve', 'Joe', 'Roger'],
                   'Age': [32, 34, 36]})
df
Here, we made a toy data frame with three columns; the last names and first names are in two separate columns.
   Age  First     Last
0   32  Steve    Smith
1   34    Joe    Nadal
2   36  Roger  Federer
How to Join Two Columns in Pandas with cat function
Let us use the Pandas .str accessor on the First column and call its cat method, passing the Last column as the argument. The sep argument sets the text placed between the two values.
df['Name'] = df['First'].str.cat(df['Last'], sep=" ")
df
Now we have created a new column combining the first and last names.
   Age  First     Last           Name
0   32  Steve    Smith    Steve Smith
1   34    Joe    Nadal      Joe Nadal
2   36  Roger  Federer  Roger Federer
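The same call works with any separator and in either order. As a minimal sketch, assuming we want a "Last, First" format instead, we can start from the Last column and pass ", " as sep. The column name Name_last_first is made up for this illustration.

```python
import pandas as pd

# Same toy data frame as above
df = pd.DataFrame({'Last': ['Smith', 'Nadal', 'Federer'],
                   'First': ['Steve', 'Joe', 'Roger'],
                   'Age': [32, 34, 36]})

# Start from Last, append First, and put ", " between them
# (Name_last_first is a hypothetical column name for this example)
df['Name_last_first'] = df['Last'].str.cat(df['First'], sep=", ")
print(df['Name_last_first'].tolist())
# ['Smith, Steve', 'Nadal, Joe', 'Federer, Roger']
```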
How to Combine Two Columns in Pandas with + operator
Another way to join two columns in Pandas is to simply use the + operator. For example, to concatenate the First and Last columns, we can do
df["Name"] = df["First"] + df["Last"]
We will get our results like this.
      Last  First  Age          Name
0    Smith  Steve   32    SteveSmith
1    Nadal    Joe   34      JoeNadal
2  Federer  Roger   36  RogerFederer
Note that there is no space between the first and last name. To add a delimiter, we place it as a string between the two columns:
df["Name"] = df["First"] +" "+ df["Last"]
Now we get the Name column with the delimiter between first and last name as we wanted.
      Last  First  Age           Name
0    Smith  Steve   32    Steve Smith
1    Nadal    Joe   34      Joe Nadal
2  Federer  Roger   36  Roger Federer
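To check that the two approaches agree on this toy data, here is a small self-contained sketch that builds the full name both ways and compares the results. The variable names with_cat, with_plus and same are chosen just for this illustration.

```python
import pandas as pd

# Same toy data frame as above
df = pd.DataFrame({'Last': ['Smith', 'Nadal', 'Federer'],
                   'First': ['Steve', 'Joe', 'Roger'],
                   'Age': [32, 34, 36]})

# Approach 1: str.cat with a space as the separator
with_cat = df['First'].str.cat(df['Last'], sep=" ")

# Approach 2: + operator with the space as a literal string
with_plus = df['First'] + " " + df['Last']

# Compare the two results element by element
same = with_cat.tolist() == with_plus.tolist()
print(same)
```

On this data both approaches give the same three names, so the choice between them is mostly a matter of taste.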