Sometimes, as part of a quick exploratory data analysis, you may want to make a single plot containing two variables with different scales.
One option is to make a single plot with two different y-axes, such that the y-axis on the left is for one variable and the y-axis on the right is for the other variable.
If you plot the two variables on the same plot without two different y-axes, the plot will not really make sense, because the variable with the larger values dominates the scale.
If the variables have very different scales, you’ll want to make sure that you plot them in different twin Axes objects. These objects can share one axis (for example, the time, or x-axis) while not sharing the other (the y-axis).
To create a twin Axes object that shares the x-axis, we use the twinx method.
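As a minimal sketch of the idea, the snippet below calls twinx on toy data. The values and the labels "small" and "large" are made up for illustration and are not part of the gapminder example that follows.

```python
import matplotlib.pyplot as plt

# Hypothetical toy values, chosen only because their scales differ a lot
x = [1, 2, 3, 4]
small = [0.5, 0.7, 0.6, 0.9]        # values below 1
large = [1200, 1800, 2600, 3100]    # values in the thousands

fig, ax = plt.subplots()
ax.plot(x, small, color="red", marker="o")
ax.set_ylabel("small", color="red")

# twinx() returns a new Axes that shares the x-axis with ax
# but has its own y-axis, drawn on the right-hand side
ax2 = ax.twinx()
ax2.plot(x, large, color="blue", marker="o")
ax2.set_ylabel("large", color="blue")

# the x-limits are shared, the y-limits are independent
print(ax.get_xlim() == ax2.get_xlim())
print(ax.get_ylim(), ax2.get_ylim())
```

Calling plt.show() afterwards displays the figure. Each line keeps its own vertical scale, so neither is flattened by the other.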
Let us import Pandas and Matplotlib's pyplot module.
    # import pandas
    import pandas as pd
    # import pyplot, used below as plt
    import matplotlib.pyplot as plt
We will use gapminder data from Carpentries to make the plot with two different y-axes on the same plot.
    # Carpentries link for gapminder data
    data_url = 'http://bit.ly/2cLzoxH'
    # load gapminder data from url as pandas dataframe
    gapminder = pd.read_csv(data_url)
    print(gapminder.head(3))
Let us subset the gapminder data with a boolean mask to keep only the rows for the United States.
gapminder_us = gapminder[gapminder.country=="United States"]
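The same subset can also be written with the Pandas query() method. The sketch below compares the two forms on a small made-up table; the name toy and its values are hypothetical stand-ins for the gapminder data, not the real numbers.

```python
import pandas as pd

# Hypothetical stand-in for the gapminder table (made-up values)
toy = pd.DataFrame({
    "country": ["Canada", "United States", "United States"],
    "year": [2007, 2002, 2007],
    "lifeExp": [80.0, 77.0, 78.0],
})

# boolean mask, the form used in this post
by_mask = toy[toy.country == "United States"]

# the same filter expressed with DataFrame.query()
by_query = toy.query('country == "United States"')

# both approaches select the same rows
print(by_mask.equals(by_query))
```

Which form to use is mostly a matter of taste; query() can read more naturally when several conditions are combined.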
We are interested in plotting how lifeExp and gdpPercap change over the years. The variable on the x-axis is year, and on the y-axis we want both lifeExp and gdpPercap.
lifeExp and gdpPercap have very different ranges: lifeExp values are below 100, while gdpPercap values are in the thousands.
Naively, let us plot both on the same plot with a single y-axis.
    # create figure and axis objects with subplots()
    fig, ax = plt.subplots()
    ax.plot(gapminder_us.year, gapminder_us.lifeExp, marker="o")
    ax.set_xlabel("year")
    ax.set_ylabel("lifeExp")
    ax.plot(gapminder_us.year, gapminder_us["gdpPercap"], marker="o")
    plt.show()
We can immediately see that this is a bad idea. The line for lifeExp over years is flat and really low. We don’t see any variation in it because of the scale of gdpPercap values.
One solution is to make the plot with two different y-axes. To do that, we use two different Axes objects, creating the second one with the twinx() method.
We first create figure and axis objects and make a first plot. In this example, we plot year vs lifeExp. And we also set the x and y-axis labels by updating the axis object.
    # create figure and axis objects with subplots()
    fig, ax = plt.subplots()
    # make a plot
    ax.plot(gapminder_us.year, gapminder_us.lifeExp, color="red", marker="o")
    # set x-axis label
    ax.set_xlabel("year", fontsize=14)
    # set y-axis label
    ax.set_ylabel("lifeExp", color="red", fontsize=14)
Next we use the twinx() method to create the second axis object “ax2”. We then use “ax2” to plot the second y-axis variable and set its label.
    # twin Axes object for a second y-axis on the same plot
    ax2 = ax.twinx()
    # plot the second variable against the new y-axis
    ax2.plot(gapminder_us.year, gapminder_us["gdpPercap"], color="blue", marker="o")
    ax2.set_ylabel("gdpPercap", color="blue", fontsize=14)
    # save the plot as a file
    fig.savefig('two_different_y_axis_for_single_python_plot_with_twinx.jpg', format='jpeg', dpi=100, bbox_inches='tight')
    # display the plot
    plt.show()
Here we save the figure first and then display it with plt.show() as before.
Now we have what we wanted: a plot with two different y-axes made with twinx in matplotlib. This definitely helps us understand how the two variables relate to one another. We can see that both lifeExp and gdpPercap have increased over the years.
Although a plot with two y-axes does help us see the pattern, personally I feel it is a bit cumbersome. A better solution is to use the idea of “small multiples”: two subplots with the same x-axis. We will see an example of that soon.
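As a rough preview of the small-multiples idea, here is one possible sketch using two stacked subplots that share the x-axis. The numbers are hypothetical placeholders rather than the gapminder values, and the names ax_top and ax_bottom are chosen only for this illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical values for illustration only (not the gapminder numbers)
years = [1990, 2000, 2010]
life_exp = [75.0, 77.0, 78.5]
gdp_percap = [32000, 39000, 43000]

# two stacked subplots that share the x-axis
fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1, sharex=True)

# first variable in the top panel, on its own y-scale
ax_top.plot(years, life_exp, color="red", marker="o")
ax_top.set_ylabel("lifeExp")

# second variable in the bottom panel, on its own y-scale
ax_bottom.plot(years, gdp_percap, color="blue", marker="o")
ax_bottom.set_ylabel("gdpPercap")
ax_bottom.set_xlabel("year")

fig.tight_layout()
```

Calling plt.show() afterwards displays the figure. Each panel has a single, unambiguous y-axis, which avoids the question of which line belongs to which scale.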