In this tutorial, we will see an example of creating a Pandas dataframe from multiple lists using Pandas’ DataFrame() function.
Let us load Pandas and check its version.
import pandas as pd
pd.__version__
1.0.0
Create two lists
Let us create two lists and use them to create a dataframe.
# Create two lists in Python
education = ["Bachelor's", "Less than Bachelor's", "Master's", "PhD", "Professional"]
salary = [110000, 105000, 126000, 144200, 96000]
Create a dictionary from two lists
We will create a dictionary with the column names we want in the dataframe as keys and the two lists as values.
# Create a dictionary using the lists
a_dict = {"Education": education, "Salary": salary}
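If the column names and the lists are already stored in sequences, the same dictionary can be built with Python's built-in zip(). This is only a minimal sketch: the names `column_names` and `columns` are hypothetical and are not part of the original example.

```python
# Lists from the tutorial
education = ["Bachelor's", "Less than Bachelor's", "Master's", "PhD", "Professional"]
salary = [110000, 105000, 126000, 144200, 96000]

# Hypothetical names, introduced here for illustration only
column_names = ["Education", "Salary"]
columns = [education, salary]

# Pair each column name with its list to build the dictionary
a_dict = dict(zip(column_names, columns))

print(list(a_dict.keys()))
```

This approach may be convenient when there are many lists, since the dictionary does not have to be typed out key by key.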
Create a data frame from dictionary
We can pass the dictionary as an argument to Pandas’ DataFrame() to create a Pandas dataframe.
# Create a data frame using the dictionary
df = pd.DataFrame(a_dict)
df
              Education  Salary
0            Bachelor's  110000
1  Less than Bachelor's  105000
2              Master's  126000
3                   PhD  144200
4          Professional   96000
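One thing worth checking when building a dataframe this way is that all lists have the same length, because each list becomes a column. The sketch below, which assumes the same two lists as above, inspects the result and shows that a shorter list raises a ValueError; the exact error message depends on the Pandas version, so it is printed rather than assumed.

```python
import pandas as pd

# Lists from the tutorial
education = ["Bachelor's", "Less than Bachelor's", "Master's", "PhD", "Professional"]
salary = [110000, 105000, 126000, 144200, 96000]

df = pd.DataFrame({"Education": education, "Salary": salary})

# Inspect the number of rows and columns and the inferred column types
print(df.shape)
print(df.dtypes)

# Lists of different lengths cannot be combined into columns
try:
    pd.DataFrame({"Education": education, "Salary": salary[:-1]})
except ValueError as err:
    print("Could not build dataframe:", err)
```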
Create a data frame from lists in one step
In the above example, we created a dataframe in Pandas in two steps: we created a dictionary first and then used it to create the dataframe.
Here we combine those two steps and create the dataframe by building the dictionary on the fly as the argument.
# Create dataframe in one step
df = pd.DataFrame({"Education": education, "Salary": salary})
And we get the same dataframe.
df
              Education  Salary
0            Bachelor's  110000
1  Less than Bachelor's  105000
2              Master's  126000
3                   PhD  144200
4          Professional   96000
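As a possible alternative to the dictionary, the lists can be zipped into rows and the column names passed through the `columns` argument of DataFrame(). This is a sketch of a different route to the same result, not the method used in this post; `rows` and `df_from_rows` are hypothetical names.

```python
import pandas as pd

# Lists from the tutorial
education = ["Bachelor's", "Less than Bachelor's", "Master's", "PhD", "Professional"]
salary = [110000, 105000, 126000, 144200, 96000]

# Dictionary-based dataframe, as in the one-step example
df = pd.DataFrame({"Education": education, "Salary": salary})

# Zip the lists into (education, salary) tuples, one tuple per row
rows = list(zip(education, salary))
df_from_rows = pd.DataFrame(rows, columns=["Education", "Salary"])

# Compare the two dataframes
print(df.equals(df_from_rows))
```

The dictionary form keeps each column name next to its data, while the zipped form may read more naturally when the data is thought of row by row.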
This post is part of the series on Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.