Very often you may have to manipulate a column of text in a data frame with R. You may want to separate a column into multiple columns in a data frame, or you may want to split a column of text and keep only a part of it. tidyr’s separate function is the best […]
R
How To Plot Ridgeline Plots in R?
Ridgeline plots are a great way to visualize changes in multiple distributions or histograms over time or space. They were initially called joyplots, for a brief time. The ggridges package from UT Austin professor Claus Wilke lets you make ridgeline plots in combination with ggplot2. Here is how Claus describes the ridgeline plot with a brief […]
How To Generate Random Numbers from Probability Distributions in R?
Understanding probability distributions, and how to simulate random numbers from a specific probability distribution, is very useful for understanding probability and using it effectively in data science. Here we will look at how to simulate/generate random numbers from the 9 most commonly used probability distributions in R and visualize the 9 probability distributions […]
Skimr: An R Package to Skim Summary Data Effortlessly
Exploring your data while doing analysis is extremely important. skimr, an R package from rOpenSci, helps you get summary statistics in a nice way, so you can quickly skim your data summary and understand it better. If you have not heard of rOpenSci, it is a non-profit initiative founded […]
5 Big Ideas Behind Tidy Evaluation
Ever wondered how easy it is to write dataframe manipulation code without repeating yourself while using dplyr? For example, if you are filtering a dataframe, you simply write instead of writing like this where you need to refer to the dataframe multiple times and use “$” to access variables in the dataframe. The reason why […]