Sometimes you have a quantitative variable in your data set and want to discretize it, that is, bin or categorize it based on its values. For example, let us say you have measurements of height and want to discretize them so that each value is 0 or 1 depending on […]
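As a rough illustration of the idea, here is a minimal sketch of turning a numeric column into a 0/1 indicator and into labelled bins. The data, the column names, the 165 cm cutoff, and the "short"/"tall" labels are all assumptions made up for this example; the excerpt does not say which rule the post uses.

```python
import pandas as pd

# Hypothetical height measurements in cm (illustrative data only).
df = pd.DataFrame({"height": [150.0, 162.5, 171.0, 180.2, 158.4]})

# Assumed cutoff, chosen only for this sketch.
threshold = 165.0

# Binary indicator: 1 if the height is above the cutoff, otherwise 0.
df["height_bin"] = (df["height"] > threshold).astype(int)

# Labelled bins with pd.cut; intervals are right-closed by default,
# so the bins here are (0, 165] and (165, 250].
df["height_cat"] = pd.cut(
    df["height"], bins=[0, 165, 250], labels=["short", "tall"]
)

print(df)
```

The boolean comparison is enough for a two-way split, while pd.cut generalizes to any number of bins.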
Pandas DataFrame
How to Add Group-Level Summary Statistic as a New Column in Pandas?
In this post, we will see an example of adding the result of an aggregating function such as mean or median, computed after groupby() on a specific column, as a new column. In other words, we might have group-level summary values for a column and want to add those summary values back to the original dataframe we computed group-level […]
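One common way to do this, sketched here under assumptions, is groupby() followed by transform(), which returns a result aligned with the original rows. The dataframe and the column names "group", "value", and "group_mean" are hypothetical, and the post itself may use a different approach.

```python
import pandas as pd

# Hypothetical data: two groups with a numeric value column.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b", "b"],
        "value": [1.0, 3.0, 2.0, 4.0, 6.0],
    }
)

# transform("mean") computes the mean per group and broadcasts it
# back to every row of that group, so it can be assigned directly.
df["group_mean"] = df.groupby("group")["value"].transform("mean")

print(df)
```

Because transform() keeps the original index, no merge step is needed to attach the summary to each row.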
How to Drop Rows Based on a Column Value in Pandas Dataframe?
In this post, we will see examples of how to drop rows of a dataframe based on the values of one or more columns in Pandas. The Pandas drop function makes it really easy to drop rows of a dataframe using index numbers or index names. We can use Pandas drop function to drop rows and columns […]
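A minimal sketch of both routes follows: passing the matching index labels to drop(), and the equivalent boolean filter. The data, the column names, and the cutoff of 50 are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical scores; we want to remove rows with score below 50.
df = pd.DataFrame(
    {"name": ["x", "y", "z", "w"], "score": [10, 55, 30, 80]}
)

# Find the index labels of the rows to remove, then pass them to drop().
low_index = df[df["score"] < 50].index
dropped = df.drop(low_index)

# Equivalent result by keeping only the rows that fail the condition.
filtered = df[df["score"] >= 50]

print(dropped)
```

Both results keep the original index labels; call reset_index(drop=True) if a fresh 0-based index is wanted.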
Pandas groupby: 13 Functions To Aggregate
The Pandas groupby function makes it easy to apply the “Split-Apply-Combine” data analysis paradigm. Basically, with Pandas groupby, we can split a Pandas data frame into smaller groups using one or more variables. Pandas has a number of aggregating functions that reduce the dimension of the grouped object. In this post, we will see examples of using 13 aggregating functions […]
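To make the split-apply-combine idea concrete, here is a small sketch that applies a handful of aggregations in one call. The data and column names are hypothetical, and the four functions shown are simply ones pandas supports by name; the excerpt does not list which 13 the post covers.

```python
import pandas as pd

# Hypothetical data: measurements for two species.
df = pd.DataFrame(
    {
        "species": ["cat", "cat", "dog", "dog", "dog"],
        "weight": [4.0, 5.0, 10.0, 20.0, 30.0],
    }
)

# Split by species, apply several aggregations, combine into one table
# with one row per group and one column per aggregation.
summary = df.groupby("species")["weight"].agg(["count", "mean", "min", "max"])

print(summary)
```

Each aggregation collapses a group to a single value, which is why the result has one row per species rather than one per observation.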
How To Drop Duplicate Rows in Pandas?
In this post, we will learn how to drop duplicate rows in a Pandas dataframe. We will use the Pandas drop_duplicates() function to delete duplicated rows, with multiple examples. One of the common data cleaning tasks is deciding how to deal with duplicate rows in a data frame. If the whole […]
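A short sketch of two typical cases is given below: removing rows that are duplicated in full, and removing rows that share a value in one column. The data and the column names "id" and "city" are made up for illustration.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are identical; ids 1 and 2 repeat.
df = pd.DataFrame(
    {"id": [1, 1, 2, 2, 3], "city": ["A", "A", "B", "C", "D"]}
)

# Drop rows that are duplicated across all columns (keeps the first copy).
deduped = df.drop_duplicates()

# Drop rows with a repeated id, keeping the last occurrence of each id.
by_id = df.drop_duplicates(subset="id", keep="last")

print(deduped)
print(by_id)
```

The subset and keep parameters control which columns define a duplicate and which copy survives.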