Pandas' explode() function is a handy way to split a column of list-like elements into separate rows. When working with real data, you will often have a column in which each element is list-like. By list-like, we mean that it is in a form that can easily be converted into a list.
import pandas as pd
Let us create a toy data frame containing data science books in Python and R. It has two columns: one holds the programming language and the other holds the names of the books.
df = pd.DataFrame({'Book' : ['R for Data Science, Data Visualization' , "Python for Data Analysis, Python Machine Learning"], 'Language' : ["R", "Python"]})
You can see that each row in the Book column holds multiple book names separated by a delimiter, a comma in this example.
df
                                                Book  Language
0             R for Data Science, Data Visualization         R
1  Python for Data Analysis, Python Machine Learning    Python
Create Column with Lists
In this toy data set, the Book column is list-like because it can easily be converted to a list. We can convert a column whose elements are separated by a delimiter into a list of strings using the str.split() function. We then use Pandas’ assign() function to assign the result to a column with the same name, which turns Book into a list column.
df.assign(Book=df.Book.str.split(","))
                                               Book  Language
0          [R for Data Science, Data Visualization]         R
1  [Python for Data Analysis, Python Machine Lea...    Python
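One detail worth checking at this step: splitting on a bare comma keeps the space that follows each comma, so every title after the first in a list starts with a blank. Here is a minimal sketch of one way to avoid that, assuming the delimiter is always a comma followed by exactly one space. The names books_raw and books_clean are illustrative and not part of the original example.

```python
import pandas as pd

# Same toy data frame as in the example above.
df = pd.DataFrame({
    'Book': ['R for Data Science, Data Visualization',
             'Python for Data Analysis, Python Machine Learning'],
    'Language': ['R', 'Python'],
})

# Splitting on "," alone leaves a leading space on the second title.
books_raw = df.Book.str.split(",")
print(books_raw.iloc[0])

# Assumption: the delimiter is a comma plus one space, so split on ", ".
books_clean = df.Book.str.split(", ")
print(books_clean.iloc[0])
```

If the spacing around the commas is irregular in your data, stripping each title after the split is another option.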
Pandas explode() to separate list elements into separate rows
Now that we have a column with lists as elements, we can use the Pandas explode() function on it. explode() splits each list into its elements and creates a new row for each one, copying the values of the other columns into every new row. In this example, each book name goes to a separate row along with its corresponding Language value.
df.assign(Book=df.Book.str.split(",")).explode('Book')
                       Book  Language
0        R for Data Science         R
0        Data Visualization         R
1  Python for Data Analysis    Python
1   Python Machine Learning    Python
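Note that the exploded rows keep the index label of the row they came from, which is why 0 and 1 each appear twice in the output above. If you would rather have a fresh index running from 0 upward, one option is to call reset_index(drop=True) after explode(). The sketch below rebuilds the same toy data and assumes the delimiter is a comma followed by one space, so the titles carry no leading blank. The name exploded is illustrative.

```python
import pandas as pd

# Same toy data frame as in the example above.
df = pd.DataFrame({
    'Book': ['R for Data Science, Data Visualization',
             'Python for Data Analysis, Python Machine Learning'],
    'Language': ['R', 'Python'],
})

exploded = (
    df.assign(Book=df.Book.str.split(", "))  # strings -> lists of titles
      .explode('Book')                       # one row per title, index repeated
      .reset_index(drop=True)                # replace repeated labels with 0..n-1
)
print(exploded)
```

Recent pandas versions also accept explode('Book', ignore_index=True), which should give the same result in one step. reset_index(drop=True) is used here because it works wherever explode() itself is available.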