Matplotlib is one of the most comprehensive visualization libraries in Python and can create almost any kind of plot. However, it can also be a bit frustrating and daunting, given how much you can do with it.
In this post, we will learn how to make 8 commonly used plot types, such as scatter plots and histograms, with really simple examples. Our goal here is not to create publication-quality plots, but to make basic plots first.
To get started, let us load NumPy and Matplotlib. We are using Matplotlib version 3.6.3.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
print(matplotlib.__version__)
3.6.3
1. Plotting with Matplotlib’s plot() function
The plot() function in matplotlib.pyplot is useful for making line plots of one variable against another. Line plots are ideal for time-series-like data, where time is on the x-axis. Here we generate data from random numbers and make a line plot with plot().
Throughout this post we use NumPy’s random Generator class, created with default_rng(), to generate random numbers.
rng = np.random.default_rng(42)
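As a small aside on why the seed matters, here is a minimal sketch of what seeding does. The variable names rng_a, rng_b and rng_c are only for illustration and do not appear elsewhere in this post.

```python
import numpy as np

# Two generators created with the same seed produce the same stream of numbers.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same = np.array_equal(rng_a.uniform(0, 5, 11), rng_b.uniform(0, 5, 11))
print(same)  # True

# A generator created without a seed is initialised from fresh OS entropy,
# so its output differs from run to run.
rng_c = np.random.default_rng()
print(rng_c.uniform(0, 5, 3))
```

Seeding the generator makes the example data, and therefore the plot, the same every time the code is run.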
For the x-axis we generate a range of integers using NumPy’s arange() function. For the y-axis we add random numbers from a uniform distribution to those integers. Now we have the data ready to make a line plot with Matplotlib.
X = np.arange(11)
Y = X + rng.uniform(0, 5, 11)
print(X)
print(Y)
[ 0  1  2  3  4  5  6  7  8  9 10]
[ 3.86978024  3.1943922   6.2929896   6.48684015  4.47088674  9.87811176
  9.80569851 10.93032153  8.64056816 11.25192969 11.85399012]
Matplotlib’s plot() function takes the x-axis and y-axis values as its arguments.
# line plot with Matplotlib's plot() function
plt.plot(X, Y)
# add x and y axis labels
plt.xlabel("X", size=16)
plt.ylabel("Y", size=16)
# add title
plt.title("Matplotlib plot() example", size=20)
plt.savefig("lineplot_with_Matplotlib_plot.png", format='png', dpi=150)
And this is what the simple line plot looks like.
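Since line plots are often used for time series, here is a minimal sketch with dates on the x-axis. The dates, the random-walk values and the output file name are made up for illustration, and the example uses the Axes methods (ax.plot() and friends) rather than the plt functions used above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical daily series: 30 consecutive days from an arbitrary start date
days = np.arange("2023-01-01", "2023-01-31", dtype="datetime64[D]")
values = np.cumsum(rng.normal(0, 1, days.size))  # a simple random walk

fig, ax = plt.subplots()
lines = ax.plot(days, values, marker="o")  # plot() returns a list of Line2D objects
ax.set_xlabel("Date", size=16)
ax.set_ylabel("Value", size=16)
ax.set_title("Line plot with dates", size=20)
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
fig.savefig("timeseries_lineplot_sketch.png", format="png", dpi=150)
```

Matplotlib recognises the datetime64 values and picks date ticks on its own, so no extra conversion is needed for a quick plot like this.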
2. Histogram with Matplotlib’s hist() function
The second plot type we will make is a histogram, using Matplotlib’s hist() function. A simple histogram is useful for visualizing the distribution of a single variable.
We generate random numbers from a normal distribution using NumPy’s random Generator class.
rng = np.random.default_rng(42)
X = rng.normal(0, 1, 100)
Now we can make a histogram by passing the data as the argument to Matplotlib’s hist() function.
# make a histogram with Matplotlib
plt.hist(X)
# set x and y axis labels
plt.xlabel("X", size=16)
plt.ylabel("Count", size=16)
plt.title("Matplotlib hist() example", size=20)
plt.savefig("histogram_with_Matplotlib_hist.png", format='png', dpi=150)
And this is what the histogram of our data looks like.
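By default hist() uses 10 bins. As a rough sketch of how to change that and look at what the function hands back, the example below asks for 20 bins and unpacks the return value. The choice of 20 bins and the white bar edges are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(0, 1, 100)

fig, ax = plt.subplots()
# hist() returns the bin counts, the bin edges, and the drawn bar patches
counts, edges, patches = ax.hist(X, bins=20, edgecolor="white")
ax.set_xlabel("X", size=16)
ax.set_ylabel("Count", size=16)

print(counts.sum())  # 100.0: every observation falls in exactly one bin
print(len(edges))    # 21: one more edge than there are bins
```

The returned counts and edges are handy when you want the binned numbers themselves and not only the picture.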
3. Scatter plot with Matplotlib’s scatter() function
We can make a scatter plot of two numerical variables using the scatter() function in Matplotlib. In the example below, we generate random numbers from a uniform distribution to create the X and Y variables for the scatter plot.
rng = np.random.default_rng()
X = rng.uniform(0, 1, 100)
Y = rng.uniform(0, 1, 100)
Matplotlib’s scatter() function takes the two variables as arguments and makes a scatter plot.
plt.scatter(X, Y)
plt.xlabel("X", size=16)
plt.ylabel("Y", size=16)
plt.title("Matplotlib scatter()", size=20)
plt.savefig("scatterplot_with_Matplotlib_scatter.png", format='png', dpi=150)
By default, Matplotlib makes the scatter plot with blue dots, as shown below.
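The default look is easy to change. As a sketch, the example below maps a third variable to both the color and the size of the dots through the c and s arguments of scatter(). The variable Z, the viridis colormap and the size scaling are arbitrary choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 1, 100)
Y = rng.uniform(0, 1, 100)
Z = rng.uniform(0, 1, 100)  # hypothetical third variable

fig, ax = plt.subplots()
# c sets the color of each point, s its area, alpha its transparency
points = ax.scatter(X, Y, c=Z, s=100 * Z, cmap="viridis", alpha=0.7)
fig.colorbar(points, ax=ax, label="Z")  # legend for the color scale
ax.set_xlabel("X", size=16)
ax.set_ylabel("Y", size=16)
fig.savefig("scatterplot_color_size_sketch.png", format="png", dpi=150)
```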
4. Bar plot with Matplotlib’s bar() function
We can make a barplot using Matplotlib’s bar() function. To make a barplot, we generate a sequence of numbers for the x-axis and random numbers from a uniform distribution for the bar heights on the y-axis.
rng = np.random.default_rng(42)
X = np.arange(1, 11)
Y = rng.uniform(1, 10, 10)
Matplotlib’s bar() function takes the X and Y variables we created as arguments to make a barplot.
plt.bar(X, Y)
plt.xlabel("X", size=16)
plt.ylabel("Count", size=16)
plt.title("Matplotlib bar()", size=20)
plt.savefig("barplot_with_Matplotlib_bar.png", format='png', dpi=150)
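In practice, bars usually stand for categories and not numbers. Here is a minimal sketch that passes strings as the x values and writes each bar's height above it. The category names are invented, and bar_label() needs Matplotlib 3.4 or later.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
labels = ["A", "B", "C", "D", "E"]  # hypothetical category names
heights = rng.uniform(1, 10, len(labels))

fig, ax = plt.subplots()
bars = ax.bar(labels, heights)    # returns a container holding one rectangle per bar
ax.bar_label(bars, fmt="%.1f")    # write each bar's height above it
ax.set_xlabel("Category", size=16)
ax.set_ylabel("Count", size=16)
fig.savefig("barplot_with_labels_sketch.png", format="png", dpi=150)
```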
5. Boxplot with Matplotlib’s boxplot() function
Matplotlib’s boxplot() function is useful for quickly making rudimentary boxplots. It takes a NumPy array or a sequence of vectors as input.
In the example below, we create a NumPy array with 3 variables (columns) using NumPy’s random Generator class.
rng = np.random.default_rng(42)
fig, ax = plt.subplots()
# 100 rows and 3 columns, with means 3, 10, 5 and standard deviations 1, 2, 3
X = rng.normal((3, 10, 5), (1, 2, 3), (100, 3))
Passing the 2D NumPy array to boxplot() draws one box per column.
plt.boxplot(X)
plt.xlabel("Group", size=16)
plt.title("Matplotlib boxplot()", size=20)
plt.savefig("boxplot_with_Matplotlib_boxplot.png", format='png', dpi=150)
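The other input form, a sequence of vectors, is handy when the groups have different numbers of observations. The sketch below assumes three made-up groups of sizes 50, 80 and 120, and also peeks at the dictionary that boxplot() returns.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
# A list of 1D arrays of unequal length: one box is drawn per array
groups = [rng.normal(3, 1, 50), rng.normal(10, 2, 80), rng.normal(5, 3, 120)]

fig, ax = plt.subplots()
result = ax.boxplot(groups)          # dict of the drawn artists
ax.set_xticklabels(["A", "B", "C"])  # hypothetical group names
ax.set_xlabel("Group", size=16)
fig.savefig("boxplot_unequal_groups_sketch.png", format="png", dpi=150)

print(sorted(result.keys()))
# The y position of each median line is the median of that group
print(result["medians"][0].get_ydata()[0], np.median(groups[0]))
```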
6. Violinplot with Matplotlib’s violinplot() function
A violin plot is a variant of the boxplot and is often more suitable, because it also shows the shape of each distribution. We can make one with Matplotlib’s violinplot() function.
We use a 2D NumPy array with 3 groups (columns) to make the violin plot. Here are its first five rows.
rng = np.random.default_rng(42)
X = rng.normal((3, 10, 6), (1, 2, 3), (100, 3))
X[0:5,]
array([[ 3.30471708,  7.92003179,  8.25135359],
       [ 3.94056472,  6.09792962,  2.09346148],
       [ 3.1278404 ,  9.36751482,  5.94959653],
       [ 2.14695607, 11.75879595,  8.33337581],
       [ 3.0660307 , 12.25448241,  7.40252803]])
In the example below, we make a violin plot showing the median value for each group.
plt.violinplot(X, showmedians=True)
plt.xlabel("Group", size=16)
plt.title("Matplotlib Violinplot()", size=20)
plt.savefig("violinplot_with_Matplotlib_violinplot.png", format='png', dpi=150)
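violinplot() returns a dictionary of the artists it draws, which is one way to restyle the plot afterwards. The sketch below recolors the violin bodies and the median lines. The colors are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal((3, 10, 6), (1, 2, 3), (100, 3))

fig, ax = plt.subplots()
parts = ax.violinplot(X, showmedians=True)

# "bodies" holds one filled shape per group
for body in parts["bodies"]:
    body.set_facecolor("tab:orange")
    body.set_edgecolor("black")
    body.set_alpha(0.6)

# "cmedians" is present because showmedians=True was passed
parts["cmedians"].set_color("black")

ax.set_xlabel("Group", size=16)
fig.savefig("violinplot_styled_sketch.png", format="png", dpi=150)
```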
7. Heatmap with Matplotlib’s imshow() function
We can make a simple heatmap showing the values of a 2d array as colors using Matplotlib’s imshow() function.
First we create a 2D NumPy array of random numbers from a uniform distribution. Then we pass the 2D array as the argument to imshow() to make the heatmap.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, (6, 6))
plt.imshow(X)
# set x and y axis labels
plt.xlabel("X", size=16)
plt.ylabel("Y", size=16)
plt.title("Matplotlib imshow() Example", size=20)
plt.savefig("Heatmap_with_Matplotlib_imshow.png", format='png', dpi=150)
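A heatmap is easier to read with a color scale next to it. Here is a minimal sketch that adds a colorbar and prints each value inside its cell. The white text color and the two-decimal format are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 1, (6, 6))

fig, ax = plt.subplots()
im = ax.imshow(X, cmap="viridis")       # row 0 is drawn at the top by default
fig.colorbar(im, ax=ax, label="Value")  # color scale for the cell values

# Cell centers sit at integer coordinates: x is the column, y is the row
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        ax.text(j, i, f"{X[i, j]:.2f}", ha="center", va="center", color="white")

fig.savefig("heatmap_with_colorbar_sketch.png", format="png", dpi=150)
```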
8. 2-dimensional histogram with Matplotlib’s hist2d() function
Two-dimensional histograms can be useful when you want to understand the relationship between two quantitative variables and have a large number of observations, where a scatter plot would suffer from overplotting.
With Matplotlib’s hist2d() function we can make 2D histograms, where each rectangular bin is colored by the number of observations that fall in it.
plt.style.use('fivethirtyeight')
# make data: correlated + noise
rng = np.random.default_rng()
X = rng.normal(0, 1, 10000)
Y = 1.1 * X + rng.normal(0, 1, 10000) / 2
plt.hist2d(X, Y, bins=100, cmap='Reds')
plt.xlabel("X", size=14)
plt.ylabel("Y", size=14)
plt.title("Matplotlib hist2d()", size=18)
plt.tight_layout()
plt.savefig("twoD_histogram_with_Matplotlib_hist2d.png", format='png', dpi=150)
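Two small variations may be worth knowing, sketched below under a few assumptions. First, plt.style.use() changes the style for everything plotted afterwards, while plt.style.context() limits it to one block. Second, hist2d() returns the counts and bin edges along with the image, which can be passed to a colorbar. A fixed seed is used here only so the numbers are reproducible.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(0, 1, 10000)
Y = 1.1 * X + rng.normal(0, 1, 10000) / 2

# The style applies only inside this block; earlier settings are restored afterwards
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots()
    counts, xedges, yedges, image = ax.hist2d(X, Y, bins=100, cmap="Reds")
    fig.colorbar(image, ax=ax, label="Count")
    ax.set_xlabel("X", size=14)
    ax.set_ylabel("Y", size=14)
    fig.tight_layout()
    fig.savefig("twoD_histogram_with_colorbar_sketch.png", format="png", dpi=150)

print(counts.shape)  # (100, 100): one count per bin
print(counts.sum())  # 10000.0: every observation lands in some bin
```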