Scatter plots are extremely useful for analyzing the relationship between two quantitative variables in a dataset. Datasets often contain multiple quantitative and categorical variables, and you may be interested in the relationship between two quantitative variables with respect to a third categorical variable. Coloring scatter plots by the group/categorical variable will greatly enhance the scatter […]
Introduction to nest() in tidyr
Grouping data in specific ways and analyzing each group is one of the most common ways to make interesting observations about the data. The R tidyverse offers a fantastic tool set for analyzing data by grouping it in different ways. Tidyverse dplyr’s group_by() is one of the basic verbs and is extremely useful in most common data analysis scenarios. nest() […]
How to Recode a Column with dplyr in R?
Sometimes, when working with a dataframe, you may want the values of a variable/column of interest coded in a specific way, so you need to change or recode the values of that column. R offers many ways to recode a column. Here we will see a simple example of recoding a column with two values using […]
How To Select Columns by Data Type in Pandas?
Often, when you are working with a bigger dataframe and doing some data cleaning or exploratory data analysis, you might want to select columns of a Pandas dataframe by their data types. For example, you might want to quickly select the columns that are numerical and visualize their summary data. Or you might want to select […]
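As a quick illustration of the idea in this excerpt, here is a minimal sketch of selecting columns by data type with pandas' `select_dtypes()`. The small dataframe and its column names are made up for illustration and are not taken from the post itself.

```python
import pandas as pd

# Toy dataframe (hypothetical data, for illustration only)
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "count": [1, 2, 3],
    "score": [0.5, 1.5, 2.5],
    "flag": [True, False, True],
})

# Keep only the numerical columns (integers and floats; booleans are not included)
numeric = df.select_dtypes(include="number")

# Keep everything that is not numerical
non_numeric = df.select_dtypes(exclude="number")

# Summary statistics for the numerical columns only
print(numeric.describe())
```

Here `numeric` holds the `count` and `score` columns, while `non_numeric` holds `name` and `flag`.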
How To Make Scatter Plot in Python with Seaborn?
Scatter plots are a useful visualization when you have two quantitative variables and want to understand the relationship between them. In this post we will see examples of making scatter plots using Seaborn in Python. We will first make a simple scatter plot and improve it iteratively. Let us first load the packages we need […]
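A simple scatter plot of the kind this excerpt describes might look like the sketch below, which uses Seaborn's `scatterplot()`. The dataframe, its column names, and the output file name are assumptions made for illustration; the post's own dataset is not shown in the excerpt.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data (hypothetical values, for illustration only)
df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "weight_kg": [52, 58, 69, 80, 91],
    "group": ["A", "A", "B", "B", "B"],
})

# Two quantitative variables on the axes, a categorical variable as color
ax = sns.scatterplot(data=df, x="height_cm", y="weight_kg", hue="group")
ax.set_title("Weight vs. height")

# Save the figure to a file (hypothetical file name)
plt.savefig("scatter.png", dpi=100)
```

Passing the column names through `x`, `y`, and `hue` keeps the call readable and lets Seaborn label the axes and build the legend from the dataframe.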