Often when you are working with a bigger dataframe and doing data cleaning or exploratory data analysis, you might want to select columns of a Pandas dataframe by their data types. For example, you might want to quickly select the columns that are numerical and visualize their summary data. Or you might want to select […]
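As a rough illustration of the idea in this excerpt, here is a minimal sketch that uses `select_dtypes` to split a dataframe by column type. The toy dataframe and its column names are made up for illustration and do not come from the post.

```python
import pandas as pd

# Toy data (hypothetical, for illustration only)
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "count": [1, 2, 3],
    "score": [0.5, 1.5, 2.5],
    "flag": [True, False, True],
})

# Keep only numeric columns (integers and floats; booleans are not included)
numeric = df.select_dtypes(include="number")

# Keep everything that is not numeric
non_numeric = df.select_dtypes(exclude="number")

# Quick summary statistics of the numeric columns
summary = numeric.describe()
print(summary)
```

Passing a list such as `include=["number", "bool"]` is one way to widen the selection if boolean columns should be treated as numeric as well.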
How To Make Scatter Plot in Python with Seaborn?
Scatter plots are a useful visualization when you have two quantitative variables and want to understand the relationship between them. In this post we will see examples of making scatter plots using Seaborn in Python. We will first make a simple scatter plot and improve it iteratively. Let us first load the packages we need […]
How To Select Columns Using Prefix/Suffix of Column Names in Pandas?
Selecting one or more columns from a data frame is straightforward in Pandas. For example, if we want to select multiple columns by passing their names as a list, we can use one of the methods illustrated in How To Select One or More Columns in Pandas? Sometimes you may be working with a larger […]
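A minimal sketch of selecting columns by prefix or suffix follows. It is one possible approach, not necessarily the one the post uses; the column names below are hypothetical.

```python
import pandas as pd

# Toy wide-format data (hypothetical column names)
df = pd.DataFrame({
    "country": ["x", "y"],
    "pop_1952": [1, 2],
    "pop_2007": [3, 4],
    "gdp_1952": [5, 6],
    "gdp_2007": [7, 8],
})

# Boolean mask over column names that start with a prefix
by_prefix = df.loc[:, df.columns.str.startswith("pop_")]

# Boolean mask over column names that end with a suffix
by_suffix = df.loc[:, df.columns.str.endswith("_2007")]

# The same kind of selection with a regular expression
by_regex = df.filter(regex=r"^gdp_")

print(by_prefix.columns.tolist())
print(by_suffix.columns.tolist())
print(by_regex.columns.tolist())
```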
How to Get Top N Rows Within Each Group in Pandas?
In this post we will see how to get the top N rows from a data frame within each group, where the groups are defined by one variable and the rows are ranked by the values of another. Note this is not the same as the top N rows according to one variable in the whole dataframe. Let us say we have […]
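To make the distinction concrete, here is a small sketch that sorts by the ranking variable and then keeps the first N rows of each group. The data, the column names, and the choice of `sort_values` followed by `groupby(...).head(...)` are assumptions for illustration; the post may take a different route.

```python
import pandas as pd

# Toy data (hypothetical values)
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe", "Europe", "Europe"],
    "country": ["A", "B", "C", "D", "E", "F"],
    "lifeExp": [70, 82, 75, 81, 79, 83],
})

n = 2

# Top N rows within each continent, ranked by lifeExp
top_per_group = (
    df.sort_values("lifeExp", ascending=False)
      .groupby("continent")
      .head(n)
)

# For contrast: top N rows in the whole dataframe, ignoring groups
top_overall = df.nlargest(n, "lifeExp")

print(top_per_group)
print(top_overall)
```

With this toy data the per-group result has two rows for each continent, while the overall result has two rows in total.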
Getting started with ggforce – a ggplot2 extension package
ggforce, an R extension package for ggplot2, has received a big upgrade with a lot of new functions. ggforce was introduced about two years ago with the aim of providing functionality missing from ggplot2. ggforce provides a repository of geoms, stats, etc. that are as well documented and implemented as the official ones found in ggplot2. […]


