In this tutorial, part of our Pandas 101 series, we will learn how to compute the cumulative sum of a column based on values from a grouping column in a Pandas dataframe. The Pandas cumsum() function can compute the cumulative sum over a DataFrame. In this example, we are interested in getting the cumulative sum of just one column by […]
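As a minimal sketch of the idea in this excerpt, the snippet below computes a cumulative sum of one column within each group. The dataframe and the column names `group`, `value`, and `cum_value` are made up for illustration and are not taken from the original post.

```python
import pandas as pd

# Toy data; the column names are hypothetical.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "a"],
    "value": [1, 2, 10, 20, 3],
})

# Group by the grouping column, select one column, and take its running total.
# The result keeps the original row order and index.
df["cum_value"] = df.groupby("group")["value"].cumsum()
print(df)
```

Rows belonging to group `a` accumulate independently of rows belonging to group `b`, even when the groups are interleaved.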
Python Tips
Pandas pipe function in Pandas: performing PCA
The Pandas pipe function can help us chain together functions that take either a dataframe or a series as input. In this introductory tutorial, we will learn how to use the Pandas pipe method to simplify code for data analysis. We start with a dataframe as input and do a series of analyses such that each step takes […]
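The chaining pattern described here can be sketched roughly as follows. The helper functions `drop_missing` and `add_ratio` and the toy data are hypothetical stand-ins; the post itself applies the idea to a PCA workflow, which is not reproduced here.

```python
import pandas as pd

def drop_missing(df):
    """Hypothetical step: remove rows containing missing values."""
    return df.dropna()

def add_ratio(df, num, den):
    """Hypothetical step: add a column holding the ratio of two columns."""
    return df.assign(ratio=df[num] / df[den])

df = pd.DataFrame({"x": [1.0, 4.0, None], "y": [2.0, 8.0, 5.0]})

# Each pipe() call passes the current dataframe as the first argument of the
# function and forwards any extra keyword arguments.
result = df.pipe(drop_missing).pipe(add_ratio, num="x", den="y")
print(result)
```

Written this way, the steps read top to bottom in the order they run, instead of as nested calls such as `add_ratio(drop_missing(df), "x", "y")`.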
How to Change Matplotlib Plot’s Style
In this post, we will learn how to find all available style options for Matplotlib plot themes and how to set a style for a Matplotlib plot. To illustrate the styling options available in Matplotlib, we will use histograms made from beta distributions. To get started, let us load the modules needed. We can use style […]
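A small sketch of the workflow this excerpt describes: list the available styles, then draw a histogram of beta-distributed samples under one of them. The choice of the `ggplot` style, the beta parameters, the output filename, and the use of `plt.style.context` are assumptions for illustration; the original post may apply the style differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Names of all styles that ship with the installed Matplotlib.
styles = plt.style.available
print(styles)

# Samples from a beta distribution (parameters chosen arbitrarily).
rng = np.random.default_rng(0)
samples = rng.beta(2, 5, size=1000)

# Apply a style only inside this block; plt.style.use("ggplot") would
# instead set it for the rest of the session.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    counts, bins, _ = ax.hist(samples, bins=30)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    fig.savefig("beta_hist_ggplot.png")
```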
How to Select Columns/Rows by substring match in Pandas
In this post, we will learn how to select columns or rows of a Pandas dataframe based on a substring match. We will use the Pandas filter() function with the argument “like” to select columns/rows whose names partially match a string of interest. Let us load the necessary modules. We are […]
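For a quick illustration of filter() with “like”, here is a sketch on a made-up dataframe; the column and index labels are hypothetical. Note that filter() matches against the labels themselves, not against the cell values.

```python
import pandas as pd

# Toy data with descriptive column names and row labels (all hypothetical).
df = pd.DataFrame(
    {
        "height_cm": [170, 180],
        "weight_kg": [65, 80],
        "height_in": [66.9, 70.9],
    },
    index=["sample_a", "control_b"],
)

# Keep columns whose names contain the substring "height".
cols = df.filter(like="height", axis=1)

# Keep rows whose index labels contain the substring "sample".
rows = df.filter(like="sample", axis=0)

print(cols)
print(rows)
```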
Barplots and Countplot with Seaborn’s catplot
Love them or hate them, barplots are often useful in a quick exploratory data analysis to understand the variables in a dataset. In this post, we will see multiple examples of how to make barplots/countplots using Seaborn’s catplot() function. A couple of years ago, Seaborn introduced the catplot() function, which provides a common framework to make […]
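As a rough sketch of the two plot types mentioned here, the snippet below calls catplot() once with kind="count" and once with kind="bar". The small dataframe, its column names, and the output filenames are invented for illustration and are not the dataset used in the post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs anywhere
import pandas as pd
import seaborn as sns

# Toy data; names and values are hypothetical.
df = pd.DataFrame({
    "species": ["adelie", "adelie", "gentoo", "gentoo", "gentoo", "chinstrap"],
    "mass": [3.7, 3.9, 5.0, 5.2, 4.8, 3.6],
})

# Countplot: number of rows per category.
count_grid = sns.catplot(data=df, x="species", kind="count")
count_grid.savefig("species_count.png")

# Barplot: by default, the mean of y within each category.
bar_grid = sns.catplot(data=df, x="species", y="mass", kind="bar")
bar_grid.savefig("species_mass_bar.png")
```

Both calls return a FacetGrid, so the same object can be used to adjust labels or save the figure regardless of the `kind` chosen.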