Sampling, that is, randomly subsetting your data, is useful in many situations. If you want to sample rows at random without regard to groups, you can use the sample_n() function from dplyr. Sometimes, though, you might want to sample one or more groups and keep all the elements/rows within the selected group(s). However, sampling one or more groups with […]
tidyverse
How To Separate a Column into Multiple Rows in R?
I just came across a useful little function in tidyr called separate_rows(). Often you may have a data frame with a column containing multiple pieces of information concatenated with a delimiter. For example, we might have a data frame with the members of a family in a single column, separated by a delimiter. Here is a pictorial representation of […]
How To Do PCA in tidyverse Framework?
In an earlier post, we saw a tutorial on how to do PCA in R using the gapminder data set. Another interesting way of doing PCA is to follow the tidyverse framework. In this post, we will see an example of doing PCA using gapminder data in a tidy framework. Being the first attempt to […]
Introduction to nest() in tidyr
Grouping data in specific ways and analyzing each group is one of the most common ways to make interesting observations about the data. R tidyverse offers a fantastic tool set for analyzing data by grouping it in different ways. Tidyverse dplyr’s group_by() is one of the basic verbs that is extremely useful in most common data analysis scenarios. nest() […]
Book Review – Data Visualization: A Practical Introduction
Data Visualization: A Practical Introduction by Duke University Professor Kieran Healy is a great introduction to data visualization. If you have not heard of the book before, here is a little back story. The author, Kieran Healy, developed the book using R Bookdown and made the whole book available online for free. Yes, it is available […]