In this post, we will learn how to randomly sample rows from a data frame, covering the most common scenarios. Tidyverse has a few options for randomly sampling rows from a dataframe. slice_sample() in dplyr is the currently recommended function for randomly selecting rows. The older function in dplyr, sample_n(), for… Continue reading 13 Tips to Randomly Select Rows with tidyverse
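As a minimal sketch of the recommended function, the snippet below samples rows from a small made-up tibble (the data frame `df` and its columns are assumptions for illustration, not taken from the post). It shows sampling a fixed number of rows with `n` and a fraction of rows with `prop`.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
df <- tibble(id = 1:10, value = letters[1:10])

set.seed(42)  # make the random sample reproducible

# Sample a fixed number of rows
sampled_n <- df %>% slice_sample(n = 3)

# Sample a fraction of the rows
sampled_prop <- df %>% slice_sample(prop = 0.5)
```

Sampling is without replacement by default; passing `replace = TRUE` to slice_sample() switches to sampling with replacement.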
Category: tidyverse 101
dplyr matches(): select columns using regular expression
This quick post has an example using a neat dplyr function, matches(), to select columns using regular expressions. dplyr has a number of helper functions, contains(), starts_with() and others, for selecting columns based on certain conditions. For example, if you are interested in selecting columns based on how their names start, you can use the starts_with() function. However,… Continue reading dplyr matches(): select columns using regular expression
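A small sketch of the difference between the two helpers, assuming a made-up tibble whose column names are hypothetical: starts_with() matches a literal prefix, while matches() takes a regular expression.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
df <- tibble(
  id         = 1:3,
  score_2019 = c(10, 20, 30),
  score_2020 = c(11, 21, 31),
  name       = c("a", "b", "c")
)

# Literal prefix match
by_prefix <- df %>% select(starts_with("score"))

# Regular expression: names ending in an underscore and four digits
by_regex <- df %>% select(matches("_[0-9]{4}$"))
```

Here both calls happen to pick the same two columns; the regular expression form becomes useful when a plain prefix, suffix, or substring is not specific enough.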
How to create a list of plot objects and save them as files
In this post, we will learn a really nice trick for creating multiple ggplots from a dataframe and saving the plots to files with ggsave(), using tidyverse purrr's magic. We will use purrr's map() function to create multiple plots from a dataframe and another purrr function, pwalk(), to save the plots as files. Learned… Continue reading How to create a list of plot objects and save them as files
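One way this map() plus pwalk() pattern might look is sketched below. The choice of the built-in mtcars data, the split by `cyl`, the file names, and the temporary output directory are all assumptions for illustration, not the post's own example.

```r
library(ggplot2)
library(purrr)

# Split a built-in dataset into one data frame per group (assumed grouping)
groups <- split(mtcars, mtcars$cyl)

# Build one scatter plot per group with map()
plots <- map(groups, ~ ggplot(.x, aes(x = wt, y = mpg)) + geom_point())

# One hypothetical output file per plot, written to a temporary directory
files <- file.path(tempdir(), paste0("cyl_", names(plots), ".png"))

# pwalk() calls ggsave(filename = ..., plot = ...) once per pair
pwalk(list(filename = files, plot = plots), ggsave, width = 4, height = 3)
```

pwalk() is used instead of pmap() because ggsave() is called for its side effect (writing a file) rather than for its return value.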
How to Replace Multiple Column Names of a Dataframe with tidyverse
Of late, I have been renaming the columns of a dataframe a lot, in different flavors, in R using tidyverse. And every time I have to google it :). I just came across a really neat trick from Shannon Pileggi on Twitter to replace multiple column names using the deframe() function and the !!! splice operator. Here is… Continue reading How to Replace Multiple Column Names of a Dataframe with tidyverse
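A minimal sketch of how deframe() and !!! can combine for renaming, assuming a hypothetical lookup table with the new names in the first column and the old names in the second (the data and names below are made up for illustration):

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, used only for illustration
df <- tibble(a = 1:3, b = 4:6, c = 7:9)

# Lookup table: first column holds new names, second column holds old names
lookup <- tibble(
  new = c("alpha", "beta"),
  old = c("a", "b")
)

# deframe() turns the two columns into a named vector: c(alpha = "a", beta = "b")
name_map <- deframe(lookup)

# !!! splices the named vector into rename() as alpha = "a", beta = "b"
renamed <- df %>% rename(!!!name_map)
```

Columns that do not appear in the lookup table, such as `c` here, keep their original names.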
How to Replace NAs with column mean or row means with tidyverse
Just a quick rstats post on a simple imputation approach, for my future self. SVD/PCA is one of the first things I do when analyzing any new high-dimensional data. Often such data are messy and have some missing values. Depending on the situation, I often resort to removing the rows with missing data… Continue reading How to Replace NAs with column mean or row means with tidyverse
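The sketch below shows one possible way to do this kind of mean imputation on a small made-up tibble (the data are an assumption for illustration). The column-mean version uses dplyr's across(); the row-mean version is done on a matrix with base R's rowMeans(), which is a swapped-in approach and may differ from what the full post does.

```r
library(dplyr)

# Hypothetical toy data with missing values, used only for illustration
df <- tibble(x = c(1, NA, 3), y = c(NA, 4, 6))

# Replace each NA with the mean of its column
col_imputed <- df %>%
  mutate(across(everything(), ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))

# Replace each NA with the mean of its row (base R, via a matrix)
m <- as.matrix(df)
na_pos <- which(is.na(m), arr.ind = TRUE)            # row/column index of each NA
m[na_pos] <- rowMeans(m, na.rm = TRUE)[na_pos[, "row"]]
row_imputed <- as_tibble(m)
```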