Pandas case_when() with multiple examples

Pandas 2.2.0 introduced case_when(), one of the most useful new functions, available on a Pandas Series object. Often you might want to create a new variable from an existing variable using multiple conditions. For a simple binary condition we can use Pandas’ where() function. With the new case_when() function we can apply…

Difference between Pandas where() function and mask() function

Pandas mask() and where() are two related functions that replace elements of a Pandas dataframe depending on whether they satisfy a condition. They both preserve the shape of the dataframe. In this post, we will first see simple examples of using Pandas where() and mask() functions and then we will learn the key…
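To make the relationship concrete, here is a small sketch on a toy dataframe (the column names and values are invented for illustration). where() keeps the values where the condition is True, while mask() hides the values where the condition is True.

```python
import pandas as pd

# Toy dataframe, made up for this example.
df = pd.DataFrame({"a": [1, 5, 10], "b": [20, 2, 8]})
cond = df > 4

# where(): keep values where cond is True, put NaN everywhere else.
kept = df.where(cond)

# mask(): the opposite, put NaN where cond is True.
hidden = df.mask(cond)

# Both accept a replacement value instead of the default NaN.
kept_or_zero = df.where(cond, 0)

print(kept)
print(hidden)
print(kept_or_zero)
```

In all three results the shape is the same as the original dataframe; only the values at the selected positions change.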

Pandas create new column using if else condition

Add a new column using an if condition on an existing column in Pandas

In this quick tutorial, we will learn how to create a new column using an if else condition on an existing column in a Pandas dataframe. To add a new column using a conditional on an existing column we will use Numpy’s where function. So, let us load both Numpy and Pandas to get started. We will use…
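Here is a minimal sketch of the approach with Numpy’s where function. The dataframe, the column names, and the threshold of 60 are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe with an existing numeric column.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [45, 80, 60]})

# np.where(condition, value_if_true, value_if_false) works element-wise,
# so it acts like a vectorized if else on the existing column.
df["result"] = np.where(df["score"] >= 60, "pass", "fail")

print(df)
```

Because the condition is evaluated on the whole column at once, this is usually much faster than looping over rows.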

2 Ways to Randomly Sample Rows from a large CSV file

In this post, we will learn how to randomly sample/select rows from a large CSV file that either takes too long to load as a Pandas dataframe or can’t be loaded at all. The key idea is not to load the whole file as a Pandas dataframe. Instead, we use skiprows argument in…
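One possible way to do this is sketched below, assuming the skiprows argument of Pandas’ read_csv() is the one meant here. It passes a function to skiprows that randomly skips most data lines. The small temporary file stands in for a large CSV, and the file name, column names, and 10% sampling fraction are assumptions made for this example.

```python
import random
import tempfile
from pathlib import Path

import pandas as pd

random.seed(0)  # only to make the example repeatable

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a large CSV file (hypothetical name and columns).
    csv_path = Path(tmp) / "big.csv"
    pd.DataFrame({"id": range(1000), "value": range(1000)}).to_csv(
        csv_path, index=False
    )

    keep_fraction = 0.1  # keep roughly 10% of the data rows

    # skiprows can be a function of the 0-based line index that returns
    # True for lines to skip. Line 0 is the header, so we never skip it.
    sample = pd.read_csv(
        csv_path,
        skiprows=lambda i: i > 0 and random.random() > keep_fraction,
    )

print(len(sample))
print(sample.head())
```

The number of rows returned varies from run to run because each line is kept or skipped independently. If an exact sample size is needed, a list of line numbers to skip can be passed to skiprows instead, provided the total number of lines in the file is known.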