A common step in data analysis is to group the data by a variable and compute some summary statistics for each subgroup of data. For example, one might be interested in the mean, the median, or the total sum per group. In this post, we will see an example of how to use the groupby() function in Pandas to […]
Pandas
Pandas Groupby and Computing Median
One of the common operations in data analysis is to group the data by a variable and compute some summary statistics on each subgroup of data. In this post, we will see an example of how to use the groupby() function in Pandas to group a dataframe into multiple smaller dataframes and compute the median of another variable […]
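As a rough illustration of the idea in this excerpt, here is a minimal sketch of grouping a dataframe and taking a per-group median. The column names (continent, lifeExp) and the values are made up for the example and are not taken from the post itself.

```python
import pandas as pd

# Toy data: the column names and values are hypothetical, for illustration only.
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe", "Europe"],
    "lifeExp": [60.0, 70.0, 80.0, 75.0, 79.0],
})

# Split the rows by continent, then compute the median of lifeExp in each group.
medians = df.groupby("continent")["lifeExp"].median()
print(medians)
```

The result is a Series indexed by the grouping variable, with one median per group.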
Fun with Pandas Groupby, Aggregate, Multi-Index and Unstack
This post is titled “Fun with Pandas Groupby, Aggregate, Multi-Index and Unstack”, but it addresses some of the pain points I face when doing mundane data-munging activities. Every time I do this I start from scratch and solve them in different ways. The purpose of this post is to record at least a couple of […]
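For readers who want a feel for how these pieces fit together, the sketch below chains groupby, an aggregation, and unstack on a toy dataframe. The data and the column names (store, year, sales) are assumptions made for the example, not the post's own solution.

```python
import pandas as pd

# Toy data: the column names and values are hypothetical, for illustration only.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "year": [2019, 2020, 2019, 2020],
    "sales": [10, 20, 30, 40],
})

# Grouping by two keys gives a Series with a (store, year) MultiIndex.
totals = df.groupby(["store", "year"])["sales"].agg("sum")

# unstack moves the "year" index level into columns, producing a wide table.
wide = totals.unstack("year")
print(wide)
```

Here each row of wide is a store and each column is a year, which is often the easier shape to read or plot.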
Pandas 1.0.0 is Here: Top New Features of Pandas You Should Know
Pandas 1.0.0 is ready for prime time now. The Pandas project has come a long way since the early release of Pandas version 0.4 in 2011. Back then it had contributions from 2 developers, including Wes McKinney; now Pandas has over 300 contributors. The latest version of Pandas can be installed from standard package managers like Anaconda, […]
11 Tips to Make Plots with Pandas
The Python Pandas library is well known for its amazing data munging capabilities. However, a somewhat underused feature of Pandas is its plotting capabilities. Yes, one can make better visualizations with Matplotlib or Seaborn or Altair. However, Pandas plotting capabilities can be extremely handy when you are in exploratory data analysis mode and want to quickly […]