In this post, we will see how to get the top N rows from a data frame within each group, that is, the rows with the top values of a specific variable in each group defined by another variable. Note that this is not the same as the top N rows according to one variable in the whole dataframe. Let us say we have […]
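As a minimal sketch of the idea, assuming a toy data frame with hypothetical columns `continent`, `country`, and `lifeExp` (none of these come from the original post), one common approach is to sort by the variable of interest and then take the first N rows of each group:

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe", "Europe", "Europe"],
    "country": ["A", "B", "C", "D", "E", "F"],
    "lifeExp": [70.0, 78.0, 65.0, 79.0, 81.0, 83.0],
})

n = 2

# Top N rows *within each group*: sort descending, then keep the
# first n rows of every continent
top_n_per_group = (
    df.sort_values("lifeExp", ascending=False)
      .groupby("continent")
      .head(n)
      .sort_values(["continent", "lifeExp"], ascending=[True, False])
      .reset_index(drop=True)
)

# Top N rows in the *whole* dataframe, for contrast
top_n_overall = df.nlargest(n, "lifeExp")

print(top_n_per_group)
print(top_n_overall)
```

With this toy data, the per-group result keeps two rows from each continent, while the overall result contains rows from Europe only.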
Python Tips
How To Select One or More Columns in Pandas?
Selecting one or more columns from a Pandas dataframe is a common task in exploratory data analysis and in data munging/wrangling. In this post, we will see examples of how to select one column from a Pandas dataframe and how to select multiple columns from a Pandas dataframe. Let us first load the Pandas library. Let us use […]
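A short sketch of both cases, assuming a small made-up data frame (the column names `country`, `year`, and `pop` are hypothetical and not taken from the original post):

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "year": [2007, 2007, 2007],
    "pop": [10, 20, 30],
})

# One column: single brackets return a Series
country_series = df["country"]

# One column as a DataFrame: pass a list with one name
country_frame = df[["country"]]

# Multiple columns: pass a list of column names
country_pop = df[["country", "pop"]]

# Equivalent label-based selection with .loc
country_pop_loc = df.loc[:, ["country", "pop"]]
```

Note the difference between `df["country"]`, which gives a Series, and `df[["country"]]`, which gives a one-column DataFrame.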
Catplot Python Seaborn: One Function to Rule All Plots With Categorical Variables
I just discovered catplot in Seaborn. Catplot is a relatively new addition to Seaborn that simplifies plotting that involves categorical variables. Seaborn version v0.9.0, which came out in July 2018, renamed the older factorplot to catplot to make it more consistent with terminology in pandas and in seaborn. The new catplot function provides […]
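To illustrate the "one function, many plot types" idea, here is a minimal sketch assuming a made-up data frame with hypothetical columns `species` and `mass`. Only the `kind` argument changes between the two calls:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import pandas as pd
import seaborn as sns

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "species": ["x", "x", "x", "y", "y", "y"],
    "mass": [1.0, 1.5, 2.0, 3.0, 3.5, 4.0],
})

# Same function, different categorical plot types via `kind`
box_grid = sns.catplot(data=df, x="species", y="mass", kind="box")
strip_grid = sns.catplot(data=df, x="species", y="mass", kind="strip")

# catplot is figure-level: it returns a FacetGrid rather than an Axes
print(type(box_grid).__name__)
```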
How To Make Grouped Boxplots in Python with Seaborn?
Boxplots are one of the most common ways to visualize data distributions from multiple groups. In Python, the Seaborn plotting library makes it easy to make boxplots and similar plots, such as swarmplot and stripplot. Sometimes, your data might have multiple subgroups and you might want to visualize such data using grouped boxplots. Here, we will see examples […]
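A grouped boxplot can be sketched with the `hue` argument of `sns.boxplot`, assuming a made-up data frame in which `day` is the group and `meal` is the subgroup (both names and all values are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import pandas as pd
import seaborn as sns

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "day": ["Sat"] * 4 + ["Sun"] * 4,
    "meal": ["Lunch", "Lunch", "Dinner", "Dinner"] * 2,
    "bill": [10.0, 12.0, 20.0, 25.0, 11.0, 14.0, 22.0, 30.0],
})

# x defines the groups, hue defines the subgroups within each group
ax = sns.boxplot(data=df, x="day", y="bill", hue="meal")
ax.figure.savefig("grouped_boxplot.png")
```

Each value of `day` gets one box per value of `meal`, drawn side by side.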
How To Merge/Join DataFrames with Pandas in Python?
One of the most common data science tasks in data munging/data cleaning is to combine data from multiple sources. When you have the data in tabular form, Python Pandas offers great functions to merge/join data from multiple data frames. Here we will see example scenarios of common merging operations with simple toy data frames. Let […]
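As a minimal sketch of common merge scenarios, assuming two made-up toy data frames `customers` and `orders` that share a hypothetical key column `customer_id`:

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cy"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 3, 4],
    "amount": [10, 20, 5, 7],
})

# Inner join: only keys present in both frames
inner = customers.merge(orders, on="customer_id", how="inner")

# Left join: every customer, with NaN where there is no matching order
left = customers.merge(orders, on="customer_id", how="left")

# Outer join: all keys from both frames; `indicator=True` adds a
# `_merge` column showing where each row came from
outer = customers.merge(orders, on="customer_id", how="outer", indicator=True)

print(inner)
print(left)
print(outer)
```

Here customer 2 has no orders and order key 4 has no customer, so the three joins return different numbers of rows.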