Python and R Tips

Learn Data Science with Python and R

How to Highlight Data Points with Colors and Text in Python

November 5, 2019 by cmdlinetips

Sometimes you might want to highlight select data points on a scatter plot. Often that means showing them in a different color from the rest of the data points. Other times you want to show select data points in a different color and also annotate them with text.

In this post, we will use matplotlib in Python and see examples of

  1. How to highlight select data points in a scatter plot with a different color
  2. How to highlight select data points with a different color and annotate them with text

Let us load the packages we need.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 

We will use the gapminder data from the Carpentries website. We will load the data into a Pandas dataframe by reading it directly from the website.

data_url = 'http://bit.ly/2cLzoxH'
# read data from url as pandas dataframe
gapminder = pd.read_csv(data_url)
print(gapminder.head(3))

Let us make a simple scatter plot using matplotlib. We use the scatter() function in matplotlib to make a scatter plot with lifeExp values on the x-axis and gdpPercap values on the y-axis.

# scatter plot with matplotlib in Python
plt.scatter(gapminder.lifeExp, gapminder.gdpPercap, 
    alpha=0.5)
# set x-axis label and specific size
plt.xlabel('lifeExp',size=16)
# set y-axis label and specific size
plt.ylabel('gdpPercap',size=16)
# set plot title with specific size
plt.title('Scatter Plot with Python',size=16)
# save the plot as PNG file with dpi=150
plt.savefig('Scatter_Plot_with_Matplotlib_in_Python.png', dpi=150)

Our scatter plot between life expectancy values and gdpPercap values looks like this. The scales of the x and y axes are very different, and a few data points with large gdpPercap values stand out as outliers.

Figure: Scatter Plot with Matplotlib in Python

The outlier data points make the scatter plot really intriguing and you might want to highlight the outliers in a different color. Let us highlight the select data points with a specific color that is different from other data points’ color.

To highlight data points, we first need to programmatically identify the outlier data points. We can see that those data points all have gdpPercap values of at least 50000 and lifeExp values greater than 50.

Let us subset our gapminder data set to contain only the select data points and store that in a different data frame. We can use Pandas’ query() function to subset the interesting data points as follows.

df = gapminder.query('gdpPercap >=50000 & lifeExp >50')
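
The same subset can also be written with a boolean mask instead of a query string. The sketch below is only an illustration: the dataframe toy and its country names and values are made up so that the example runs without downloading the gapminder data.

```python
import pandas as pd

# Toy stand-in for the gapminder dataframe; names and values are invented
# purely for illustration and are not real gapminder rows.
toy = pd.DataFrame({
    "country": ["Alpha", "Beta", "Gamma", "Delta"],
    "lifeExp": [45.0, 62.0, 71.0, 58.0],
    "gdpPercap": [900.0, 52000.0, 48000.0, 110000.0],
})

# Subset with query(), using the same condition as in the post
by_query = toy.query("gdpPercap >= 50000 & lifeExp > 50")

# Equivalent subset with a boolean mask; each condition needs its own parentheses
mask = (toy["gdpPercap"] >= 50000) & (toy["lifeExp"] > 50)
by_mask = toy[mask]

print(by_mask)
```

Both forms should select the same rows. The query string is shorter to read, while the mask can be reused later, for example to pick colors.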

Now that we have the data points that we want to highlight in a specific color, we are ready to write code to highlight them with matplotlib in Python.

To start, we first make the scatter plot between the two variables exactly as before. Here we have only changed the title of the plot.

# scatter plot with matplotlib in Python
plt.scatter(gapminder.lifeExp, gapminder.gdpPercap,  alpha=0.5)
# set x-axis label and specific size
plt.xlabel('lifeExp',size=16)
# set y-axis label and specific size
plt.ylabel('gdpPercap',size=16)
# set plot title with specific size
plt.title('How To Highlight Data Points with colors \n in Python with matplotlib?',size=16)

Now we can add a new layer to the scatter plot, this time with only the specific data points that we want to highlight. We have already subsetted those data points into a new dataframe, “df”. We can use the scatter() function again, but this time with the data from the subsetted dataframe df. We also specify the color we want, here red.

plt.scatter(df.lifeExp, df.gdpPercap, color="red")
#plt.show()
plt.savefig('How_To_Highlight_Data_Points_with_Color_Python.png', dpi=150)

Now we have highlighted the select data points, in this case the outliers, in red on the scatter plot.

Figure: How To Highlight Data Points with Color Python?
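
As an alternative to drawing two layers, the colors can be passed to a single scatter() call as a list with one entry per point. This is a sketch of that variant, not the approach used in the post; the dataframe toy holds invented values, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy stand-in for the gapminder dataframe; values are invented for illustration.
toy = pd.DataFrame({
    "country": ["Alpha", "Beta", "Gamma", "Delta"],
    "lifeExp": [45.0, 62.0, 71.0, 58.0],
    "gdpPercap": [900.0, 52000.0, 48000.0, 110000.0],
})

# True for the points we want to highlight
is_outlier = (toy["gdpPercap"] >= 50000) & (toy["lifeExp"] > 50)

# One color per data point: red for highlighted points, blue for the rest
colors = np.where(is_outlier, "red", "tab:blue").tolist()

fig, ax = plt.subplots()
ax.scatter(toy["lifeExp"], toy["gdpPercap"], c=colors, alpha=0.5)
ax.set_xlabel("lifeExp", size=16)
ax.set_ylabel("gdpPercap", size=16)
```

One difference to keep in mind: here alpha=0.5 applies to every point, including the red ones, whereas in the two-layer version the red points are drawn fully opaque on top.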

How to Highlight Data Points with Color and Text in Python?

In the above plot, it may be better to show more information about the outliers. For example, it would be good to know which countries the colored outlier data points belong to.

Let us repeat the same code as above to highlight the data points using a subsetted dataframe.

df = gapminder.query('gdpPercap >=50000 & lifeExp >50')
plt.scatter(gapminder.lifeExp, gapminder.gdpPercap,  alpha=0.5)
# set x-axis label and specific size
plt.xlabel('lifeExp',size=16)
# set y-axis label and specific size
plt.ylabel('gdpPercap',size=16)
plt.title('How To Highlight Data with Color and Text \n with matplotlib in Python?',size=14)
plt.scatter(df.lifeExp, df.gdpPercap, color="red")

Then we need to add the text, here the name of the country, for each outlier data point. Matplotlib has a function named annotate() to add text at a specific location in a plot. We need to pass annotate() the text we want to annotate the plot with and the x and y coordinates for the location of the text.

In our example, the text we want to annotate with is the name of the country, and the x and y coordinates of the text are the lifeExp and gdpPercap values of that data point.

Since we have multiple data points to annotate, we need to loop through each data point of interest. Here we use a for loop over the select data points and add text to each one.

# loop through the selected rows and label each point with its country
for country, x, y in zip(df.country, df.lifeExp, df.gdpPercap):
    plt.annotate(country, (x, y))
# print the number of annotated data points
print(df.shape[0])
plt.tight_layout()
#plt.show()
plt.savefig('How_To_Highlight_Data_Annotate_with_Text_Python.png', dpi=150)

Here is the plot we wanted. We have highlighted the select data points in red and annotated each of them with text, here the name of the country.

Figure: How To Highlight Data Annotate with Text Python?
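
If the labels sit too close to their markers, annotate() can place the text at a small offset from the data point using its xytext and textcoords arguments. The sketch below assumes an offset of 5 points in each direction, which is an arbitrary choice, and uses an invented dataframe toy_outliers in place of the real subset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for the subsetted dataframe; values are invented for illustration.
toy_outliers = pd.DataFrame({
    "country": ["Beta", "Delta"],
    "lifeExp": [62.0, 58.0],
    "gdpPercap": [52000.0, 110000.0],
})

fig, ax = plt.subplots()
ax.scatter(toy_outliers["lifeExp"], toy_outliers["gdpPercap"], color="red")

labels = []
for row in toy_outliers.itertuples(index=False):
    # xy is the data point; xytext shifts the label 5 points right and 5 points up
    label = ax.annotate(
        row.country,
        xy=(row.lifeExp, row.gdpPercap),
        xytext=(5, 5),
        textcoords="offset points",
    )
    labels.append(label)
```

Because the offset is given in points rather than data units, the gap between marker and label stays the same regardless of how different the axis scales are.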
