One of the nice things about Twitter, when you follow awesome people, is that you come across tweets that just blow your mind. Last week was one such week, with some fantastic and funny tweetorials. One of the tweetorials was from Prof. Daniela Witten for @WomenInStat. And it starts like this and […]
dplyr mutate(): Create New Variables with mutate
dplyr, an R package that is part of the tidyverse suite of packages, provides a great set of tools to manipulate datasets in tabular form. dplyr has a set of core functions for “data munging”, including select(), mutate(), filter(), summarise(), and arrange(). In this tidyverse tutorial, part of the tidyverse 101 series, we will learn how to use […]
dplyr select(): Select one or more variables from a dataframe
dplyr, an R package that is part of the tidyverse, provides a great set of tools to manipulate datasets in tabular form. dplyr has a set of core functions for “data munging”. Here is the list of core functions from dplyr: select() picks variables based on their names; mutate() adds new variables that are functions of existing variables […]
How To Change Pandas Column Names to Lower Case
Cleaning up the column names of a dataframe can often save a lot of headaches during data analysis. In this post, we will learn how to change the column names of a Pandas dataframe to lower case. Then we will do some additional cleanup of the columns and see how to remove empty spaces around […]
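As a rough illustration of the idea in this excerpt, here is a minimal sketch. The dataframe and its column names are made up for the example and are not taken from the post; the sketch simply assumes that lower-casing and whitespace trimming are done with the string methods available on the column index.

```python
import pandas as pd

# Hypothetical dataframe with messy column names (illustrative data only)
df = pd.DataFrame({" Country ": ["A", "B"], "Life Exp": [70.1, 65.3]})

# Convert every column name to lower case
df.columns = df.columns.str.lower()

# Remove leading and trailing spaces around each column name
df.columns = df.columns.str.strip()

cleaned = list(df.columns)
print(cleaned)
```

Note that str.strip() only removes spaces at the ends of a name; a space inside a name such as "life exp" is left alone.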
Pandas Groupby and Sum
A common step in data analysis is to group the data by a variable and compute summary statistics for each subgroup. For example, one might be interested in the mean, the median, or the total sum per group. In this post, we will see an example of how to use the groupby() function in Pandas to […]
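A minimal sketch of the group-then-sum pattern described in this excerpt might look like the following. The data and the column names `continent` and `population` are hypothetical, chosen only for illustration; the post's own example may differ.

```python
import pandas as pd

# Hypothetical data (not from the post): one row per country
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "population": [10, 20, 5, 7],
})

# Group rows by continent, then sum the population within each group
total_by_group = df.groupby("continent")["population"].sum()

print(total_by_group)
```

The result is a Series indexed by the grouping variable, so the total for one group can be looked up with, for example, total_by_group["Asia"].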



