In this post, we will learn how to select columns or rows of a Pandas dataframe based on a substring match. We will use the Pandas filter() function with the argument “like” to select columns/rows whose names partially match a string of interest. Let us load the necessary modules. We are […]
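As a rough illustration of the idea in this excerpt, here is a minimal sketch of substring-based selection with filter(like=...). The toy dataframe, its column names, and its index labels are made up for this example and are not taken from the original post.

```python
import pandas as pd

# Hypothetical toy data; names and values are invented for illustration.
df = pd.DataFrame(
    {
        "bill_length": [39.1, 46.5],
        "bill_depth": [18.7, 17.9],
        "body_mass": [3750, 5200],
    },
    index=["adelie_1", "gentoo_1"],
)

# Keep columns whose names contain the substring "bill".
bill_cols = df.filter(like="bill", axis=1)

# Keep rows whose index labels contain the substring "gentoo".
gentoo_rows = df.filter(like="gentoo", axis=0)

print(bill_cols)
print(gentoo_rows)
```

Note that filter() matches against the labels (column names or index values), not against the data stored in the cells.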
Python
How to lump factors in Pandas
Sometimes you would like to collapse the least frequent values of a factor or character variable into a new category, “Other”. In R, the forcats library has a suite of functions for lumping variables. This post contains a Pandas solution that can lump factors or values in three common ways. First, we will see how […]
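The excerpt does not say which three ways the post covers, so the following is only a sketch of one plausible approach: keep the n most frequent values and lump everything else into “Other”. The helper name lump_top_n and the sample data are hypothetical, not taken from the post.

```python
import pandas as pd


def lump_top_n(values: pd.Series, n: int, other: str = "Other") -> pd.Series:
    """Keep the n most frequent values and replace the rest with `other`.

    Hypothetical helper written for this sketch; not part of Pandas.
    """
    # value_counts() sorts by frequency, so nlargest(n) picks the top n values.
    top = values.value_counts().nlargest(n).index
    # where() keeps entries that satisfy the condition and replaces the others.
    return values.where(values.isin(top), other)


# Made-up sample data for illustration.
s = pd.Series(["a", "a", "a", "b", "b", "c", "d"])
lumped = lump_top_n(s, n=2)
print(lumped.tolist())
```

The same pattern can be adapted to other lumping rules, for example by comparing value_counts(normalize=True) against a minimum proportion instead of taking the top n.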
Barplots and Countplot with Seaborn’s catplot
Love it or hate it, barplots are often useful in quick exploratory data analysis to understand the variables in a dataset. In this post, we will see multiple examples of how to make barplots/countplots using Seaborn’s catplot() function. A couple of years ago, Seaborn introduced the catplot() function, which provides a common framework to make […]
Python Built-in Datasets
Scikit-learn, a machine learning toolkit in Python, offers a number of ready-to-use datasets for learning ML and developing new methodologies. If you are new to sklearn, it may be a little hard to wrap your head around which datasets are available, what information each dataset contains, and how to access […]
How to Change the Order of Columns in a Pandas Dataframe
In this tutorial, we will learn how to change the order of columns in a Pandas dataframe. We can change the order of the columns in multiple ways. Here, we will see two ways to change the order of the columns. First, let us load Pandas. import pandas as pd We will use the gapminder dataset to change […]
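The excerpt is cut off before it names the two approaches, so the sketch below simply shows two common ways to reorder columns: selecting with a list of names, and reindex(). The small dataframe stands in for gapminder and its values are invented.

```python
import pandas as pd

# Hypothetical stand-in for the gapminder data; values are made up.
df = pd.DataFrame(
    {
        "country": ["A", "B"],
        "year": [2000, 2001],
        "pop": [10, 20],
    }
)

new_order = ["year", "country", "pop"]

# Way 1: select the columns with a list in the desired order.
by_selection = df[new_order]

# Way 2: reindex the column axis with the desired order.
by_reindex = df.reindex(columns=new_order)

print(by_selection)
```

Both calls return a new dataframe and leave df unchanged. One difference worth knowing: reindex() fills a column with NaN if a name is not present, while list selection raises a KeyError.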