Python and R Tips

Learn Data Science with Python and R

How to Write Functions to Make Plots with ggplot2 in R

May 31, 2021 by cmdlinetips

Okay, here is a confession. I often preach writing functions to simplify life at work. Although I try to follow the “writing functions” mantra decently, there is a grey area where I don’t use functions that much. Any guesses? I often make tons of exploratory plots with ggplot2, repeating the same code with only slight changes to the variables.

It is not for lack of trying. I have tried writing functions to make plots using tidy evaluation (see evidence here). However, the habit never took off.

Here I am trying to reboot writing functions to make plots. I came across two fantastic resources: the RStudio 2020 talk Best practices for programming with ggplot2 by Dewey Dunnington, and a recent lecture by Claus Wilke. I am starting with a simple example, thanks to Claus Wilke’s lesson on functions and functional programming.

Here is an example of saving yourself some work by writing functions to make plots with ggplot2.

library(tidyverse)
library(palmerpenguins)
# glue is not attached by library(tidyverse); load it for glue() used below
library(glue)

Let us say we want to make a plot using the data for one group, i.e. a subset of the data. In this example we use the Palmer penguins data and make a plot for just one of the penguin species. Our natural approach would be to filter the data for the group of interest and make a plot.

penguins %>%
  filter(species == "Gentoo") %>%
  ggplot() +
  aes(bill_length_mm, body_mass_g, color=sex) +
  geom_point() +
  ggtitle("Species: Gentoo") +
  xlab("bill length (mm)") +
  ylab("body mass (g)") +
  theme(plot.title.position = "plot")

Scatter plot with a lot of code repetition

If we want to create a plot for a different group, we might often repeat most of the code except for the part specifying the group of interest.

Here we make a plot for a different penguin species. Note that the code is almost the same except for the filter() and ggtitle() statements.

penguins %>%
  filter(species == "Chinstrap") %>%
  ggplot() +
  aes(bill_length_mm, body_mass_g, color=sex) +
  geom_point() +
  ggtitle("Species: Chinstrap") +
  xlab("bill length (mm)") +
  ylab("body mass (g)") +
  theme(plot.title.position = "plot")

Scatter plot with a lot of code repetition, example 2

We can avoid writing similar code by using variables and functions. Instead of hard-coding the value for the group, we can create a variable that specifies the sub-group of interest, in this case the species, and use the variable name while plotting. This saves us from writing similar code again.

For example, we define a new variable “species_choice” with the species of interest.

species_choice <- "Adelie"
penguins %>%
  filter(species == species_choice) 

Another trick is to access the variable of interest from a different environment. This allows us to use the same name for a data frame column and for a variable in our working environment. For example, we can specify the species of interest by

species <- "Adelie"

And subset the data using the name “species” from two different environments. Here we access the species column in the data using the pronoun “.data” and the species variable in the current working environment using the pronoun “.env”.

penguins %>%
  filter(.data$species == .env$species)

Here “.data$species” gets us the column in the data frame, while “.env$species” is a variable in the local environment that we just created.
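To see why the pronouns matter, here is a minimal sketch on a tiny made-up data frame (the `toy` tibble and its values are my own illustration, not part of the penguins data). Without the pronouns, both sides of the comparison resolve to the column, so no rows get filtered out.

```r
library(dplyr)

# A tiny made-up data frame, used only so the row counts are easy to check
toy <- tibble(
  species = c("Adelie", "Adelie", "Gentoo"),
  mass    = c(3700, 3800, 5000)
)

# A variable in the working environment with the same name as the column
species <- "Gentoo"

# Ambiguous: both sides refer to the column, so every row passes the filter
n_ambiguous <- toy %>%
  filter(species == species) %>%
  nrow()

# Explicit: column on the left, environment variable on the right
n_explicit <- toy %>%
  filter(.data$species == .env$species) %>%
  nrow()

n_ambiguous  # 3
n_explicit   # 1
```

The ambiguous version runs without any error or warning, which is what makes the mistake easy to miss.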

Now we can write a small function that takes the species name as input and makes the plot. Note that we use the glue package to insert the value of a variable into the plot title by wrapping the variable name in curly braces.

Glue offers interpreted string literals that are small, fast, and dependency-free. Glue does this by embedding R expressions in curly braces which are then evaluated and inserted into the argument string.
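As a quick illustration of how the curly braces work, here is a small sketch outside of any plotting code. The variable `n` and the second string are made up for the example.

```r
library(glue)

species <- "Gentoo"

# The variable inside the braces is looked up and pasted into the string
title <- glue("Species: {species}")
title

# Any R expression can go inside the braces, not just a variable name
n <- 3
summary_text <- glue("{n} species, {n * 2} plots")
summary_text
```

Here `title` holds "Species: Gentoo" and `summary_text` holds "3 species, 6 plots".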

make_plot <- function(species) {
  penguins %>%
    filter(.data$species == .env$species) %>%
    ggplot() +
    aes(bill_length_mm, body_mass_g, color=sex) +
    geom_point() +
    ggtitle(glue("Species: {species}")) +
    xlab("bill length (mm)") +
    ylab("body mass (g)") +
    theme(plot.title.position = "plot")
}

We can call the function to make a plot for a single species.

make_plot("Adelie")

With the function ready, we can make plots for all species easily without repeating ourselves. Here we use the “map” function from purrr, which takes each element of the vector species and uses it as input for make_plot(). The resulting plots are stored in a variable as a list.

species <- c("Adelie", "Chinstrap", "Gentoo")
plots <- map(species, make_plot)

We can get the plots from the list. We can get the first plot

plots[[1]]

and the second plot

plots[[2]]
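Picking plots by position works, but it is easy to forget which number goes with which species. One possible variation, sketched here as a suggestion rather than as part of the original lesson, is to name the vector with set_names() before calling map() so the list can be indexed by species name. The names `species_names` and `gentoo_plot` are my own.

```r
library(tidyverse)
library(palmerpenguins)
library(glue)

# Same function as above, repeated so this block runs on its own
make_plot <- function(species) {
  penguins %>%
    filter(.data$species == .env$species) %>%
    ggplot() +
    aes(bill_length_mm, body_mass_g, color = sex) +
    geom_point() +
    ggtitle(glue("Species: {species}")) +
    xlab("bill length (mm)") +
    ylab("body mass (g)") +
    theme(plot.title.position = "plot")
}

species_names <- c("Adelie", "Chinstrap", "Gentoo")

# set_names() names each element after itself, and map() keeps those names
plots <- species_names %>%
  set_names() %>%
  map(make_plot)

names(plots)

# Pick a plot by species name; printing gentoo_plot draws it
gentoo_plot <- plots[["Gentoo"]]
```

The positional access shown earlier, such as plots[[1]], still works on the named list.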
