NumPy is fantastic for numerical data: it makes powerful operations on numeric arrays easy and fast. However, if your data is of mixed type, for example some columns are strings while others are numeric, a pandas data frame is usually the better option.
How to Create Pandas Dataframe from lists?
Let us say we have two lists, one containing strings and the other containing integers. We want to make a dataframe with these lists as columns.
>months = ['Jan','Apr','Mar','June']
>days = [31,30,31,30]
We will see three ways to get dataframe from lists.
1. Create pandas dataframe from lists using dictionary
One approach to creating a pandas dataframe from one or more lists is to create a dictionary first. Let us make a dictionary from the two lists, with the column names as keys and the lists as values.
>d = {'Month':months,'Day':days}
>d
{'Day': [31, 30, 31, 30], 'Month': ['Jan', 'Apr', 'Mar', 'June']}
Here d is our dictionary with names “Day” and “Month” as keys.
# Load pandas as pd
>import pandas as pd
Let us create a pandas dataframe using the pd.DataFrame function with our dictionary as input.
>df = pd.DataFrame(d)
>df
   Day Month
0   31   Jan
1   30   Apr
2   31   Mar
3   30  June
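Now we have our pandas dataframe from lists. Notice that in this output the columns appear with Day first and Month next. That ordering comes from an older setup in which the dictionary keys ended up in alphabetical order; with Python 3.7+ and recent pandas versions, the columns follow the insertion order of the dictionary, so you would see Month first. Either way, let us say we want to set the order explicitly, with Month first and Day next. To specify the order of the columns, we can use the “columns” option of pd.DataFrame like
>df = pd.DataFrame(d, columns=['Month','Day'])
>df
  Month  Day
0   Jan   31
1   Apr   30
2   Mar   31
3  June   30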
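As a minimal sketch of how the “columns” option behaves, the snippet below rebuilds the same dictionary and passes two different column lists. The names df_ordered and df_subset are made up for this illustration. It assumes a reasonably recent pandas, where “columns” both orders the dictionary keys and selects which of them to keep.

```python
import pandas as pd

months = ['Jan', 'Apr', 'Mar', 'June']
days = [31, 30, 31, 30]
d = {'Month': months, 'Day': days}

# Ask for Day first, regardless of the order of the keys in d
df_ordered = pd.DataFrame(d, columns=['Day', 'Month'])

# Listing only some of the keys keeps just those columns
df_subset = pd.DataFrame(d, columns=['Month'])

print(df_ordered)
print(df_subset)
```

Spelling out the column list is a cheap way to make the result independent of the Python and pandas versions in use.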
2. Create pandas dataframe from lists using zip
The second way to make a pandas dataframe from lists is to use the zip function. We can use zip to merge the two lists first. In Python 3, zip returns a zip object, which is an iterator that produces one tuple at a time. To get a list of tuples, we can wrap it in list(). For this example, we can create a list of tuples like
# Python 3 to get list of tuples from two lists
>data_tuples = list(zip(months,days))
>data_tuples
[('Jan', 31), ('Apr', 30), ('Mar', 31), ('June', 30)]
Note that if you use Python 2, zip(months,days) alone is enough to get a list of tuples. We don’t need to use list(zip()).
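The "one item at a time" behaviour of the Python 3 zip object can be seen in a small sketch like the one below. The variable names pairs, first_pass and second_pass are only for illustration. It shows why it is usually safer to turn the zip object into a list once and reuse that list.

```python
months = ['Jan', 'Apr', 'Mar', 'June']
days = [31, 30, 31, 30]

pairs = zip(months, days)    # lazy iterator, no tuples built yet
first_pass = list(pairs)     # walks through the iterator and collects the tuples
second_pass = list(pairs)    # the iterator is now exhausted, so this is empty

print(first_pass)
print(second_pass)
```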
Converting list of tuples to pandas dataframe
We can simply call pd.DataFrame on this list of tuples to get a pandas dataframe. Each tuple becomes one row, and we can name the columns with the “columns” option.
>pd.DataFrame(data_tuples, columns=['Month','Day'])
  Month  Day
0   Jan   31
1   Apr   30
2   Mar   31
3  June   30
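To make the role of the column names concrete, here is a short sketch that builds the dataframe from the list of tuples twice, once with and once without the “columns” option. The names df_named and df_unnamed are hypothetical. Without explicit names, pandas falls back to the integer labels 0 and 1.

```python
import pandas as pd

months = ['Jan', 'Apr', 'Mar', 'June']
days = [31, 30, 31, 30]
data_tuples = list(zip(months, days))

# Each tuple is one row; the column names are given explicitly
df_named = pd.DataFrame(data_tuples, columns=['Month', 'Day'])

# Without column names, the columns are labelled 0 and 1
df_unnamed = pd.DataFrame(data_tuples)

print(df_named)
print(df_unnamed)
```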
3. Create pandas dataframe from scratch
The third way to make a pandas dataframe from multiple lists is to start from scratch and add columns manually. We will first create an empty pandas dataframe and then add columns to it.
Create Empty Pandas Dataframe
# create empty data frame in pandas
>df = pd.DataFrame()
Add the first column to the empty dataframe.
# add a column
>df['Month'] = months
>df
  Month
0   Jan
1   Apr
2   Mar
3  June
Now add the second column.
# add second column
>df['Day'] = days
>df
  Month  Day
0   Jan   31
1   Apr   30
2   Mar   31
3  June   30
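One thing to keep in mind with this column-by-column approach is that every list must have the same length as the dataframe built so far. The sketch below repeats the steps above and then tries to add a column that is too short. The column name 'Extra' and the variable mismatch_error are made up for this illustration.

```python
import pandas as pd

months = ['Jan', 'Apr', 'Mar', 'June']
days = [31, 30, 31, 30]

# Start empty and add one column at a time
df = pd.DataFrame()
df['Month'] = months
df['Day'] = days

# A list with 2 items cannot fill a column of a 4-row dataframe
try:
    df['Extra'] = [1, 2]
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print(df)
print(mismatch_error)
```

If the lists can differ in length, it is worth checking their lengths before building the dataframe.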