Sometimes, while wrangling data, we need a quick look at the rows with the largest or smallest values in a column. This kind of quick glance can reveal interesting information in a dataframe. Pandas makes it easy to have a quick look at the top rows either […]
How To Write Pandas GroupBy Function using Sparse Matrix?
The Pandas group-by function, which helps perform the split-apply-combine pattern on data frames, is the bread and butter of data wrangling in Python. I just came across a really cool blog post titled “Group-by from scratch” by Jake Vanderplas, the author of Python Data Science Handbook. Jake shows multiple ways to implement group-by from scratch. It is a must […]
Happy Pi(e) Day: How To Make Pie Chart in R and Python? (but Never Make it)
Happy Pi(e) Day! Pi Day celebrates the mathematical constant π (pi) and falls on March 14 (3/14). It is also Albert Einstein’s birthday! Today is probably the only day you can think of making a pie chart. The pie chart has been around for a while and is notorious for producing eye-catching but misleading plots. […]
Introduction to Maximum Likelihood Estimation in R – Part 2
Maximum likelihood is a very general approach developed by R. A. Fisher when he was an undergrad. In an earlier post, Introduction to Maximum Likelihood Estimation in R, we introduced the idea of likelihood and how it is a powerful approach for parameter estimation. We learned that Maximum Likelihood estimates are one of the most […]
Catplot Python Seaborn: One Function to Rule All Plots With Categorical Variables
I just discovered catplot in Seaborn. Catplot is a relatively new addition to Seaborn that simplifies plotting that involves categorical variables. Seaborn v0.9.0, which came out in July 2018, renamed the older factorplot to catplot to make it more consistent with terminology in pandas and in seaborn. The new catplot function provides […]