How To Categorize Multiple Numerical Columns in R

Categorize columns with across()

Recently I had to convert a numerical matrix into a categorical one based on some conditions. There are multiple ways to go about it. One of the key functions for categorizing a numerical vector in R is cut(), which lets you specify the intervals used to categorize a numerical variable. Till now I was…
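As a rough illustration of the idea in this excerpt, here is a minimal sketch that bins every numeric column of a small data frame with cut() inside dplyr's across(). The data, the break points, and the labels are made up for the example and are not taken from the post.

```r
library(dplyr)

# Hypothetical data: two numeric columns with values between 0 and 10
df <- tibble(
  a = c(1, 4, 7, 10),
  b = c(2, 5, 8, 3)
)

# Illustrative intervals (0,3], (3,6], (6,10] and their labels (assumptions)
breaks <- c(0, 3, 6, 10)
labels <- c("low", "medium", "high")

# cut() turns one numeric vector into a factor of interval labels
cut(df$a, breaks = breaks, labels = labels)

# across() applies the same cut() call to every numeric column
df_cat <- df %>%
  mutate(across(where(is.numeric), ~ cut(.x, breaks = breaks, labels = labels)))
```

By default cut() uses right-closed intervals, so a value of 3 falls in the "low" bin here.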

7 Tips to Add Columns to a DataFrame with add_column() in tidyverse

Often while doing data analysis, one might add a new column or multiple columns to an existing data frame. In this post we will learn how to add one or more columns to a dataframe in R. The tibble package in tidyverse has a lesser-known but powerful function, add_column(). We will learn 6 tips to…
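For readers who have not used add_column(), a minimal sketch of two common calls might look like the following. The data frame and column names are hypothetical, and these are not necessarily the tips covered in the post.

```r
library(tibble)

# Hypothetical starting data frame
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# Append a new column at the end
df1 <- add_column(df, z = c(10, 20, 30))

# Insert a new column before an existing one using .before
df2 <- add_column(df, id = c("r1", "r2", "r3"), .before = "x")
```

add_column() also accepts .after, and it refuses to overwrite an existing column, which is one way it differs from mutate().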

tidyr’s pivot_longer(): Reshape Wide Data to Long/Tidy Data

pivot_longer(): tidyr

One of the most common activities while doing data analysis is reshaping data from one form to another. For human eyes and for data collection, it is often easier to work with data in wide form. However, for analyzing data it is more convenient in most circumstances to have the data in tidy/long form. tidyr,…
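The wide-to-long reshaping described here can be sketched with a small example. The table below is invented for illustration, as are the output column names "year" and "value".

```r
library(tidyr)
library(tibble)

# Hypothetical wide data: one row per country, one column per year
wide <- tibble(
  country = c("A", "B"),
  `2019` = c(1.2, 3.4),
  `2020` = c(1.5, 3.1)
)

# Gather every column except country into name/value pairs
long <- pivot_longer(
  wide,
  cols = -country,
  names_to = "year",
  values_to = "value"
)
```

The result has one row per country and year, with the former column names stored as character values in year.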

How to Compute Summary Statistics Across Multiple Columns in R

dplyr’s group_by() function lets you group a dataframe by one or more variables and compute summary statistics on the other variables using the summarize() function. Sometimes you might want to compute summary statistics such as the mean or median on multiple columns. A naive approach is to compute summary statistics by manually doing…
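One way to avoid repeating a summary call per column is across() inside summarize(). The sketch below assumes a toy data frame with hypothetical columns group, height, and weight, and it may differ from the approach the full post takes.

```r
library(dplyr)

# Hypothetical data: one grouping variable and two numeric columns
df <- tibble(
  group = c("g1", "g1", "g2", "g2"),
  height = c(1, 3, 5, 7),
  weight = c(10, 20, 30, 50)
)

# Compute the mean of several columns per group in one call
stats <- df %>%
  group_by(group) %>%
  summarize(across(c(height, weight), mean), .groups = "drop")
```

Swapping mean for median, or passing a named list of functions to across(), yields other summaries with the same structure.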