In simpler statistical models, we typically assume our data came from a single distribution. For example, to model height, we can assume that each observation came from a single Gaussian distribution with some mean and variance. However, often we might be in a scenario where that assumption is not valid and our data is more […]
Python
Pandas filter(): Select Columns and Rows by Labels in a Dataframe
In this post, we will learn how to use the Pandas filter() function to subset a dataframe based on its column names and row indexes. Pandas has a number of ways to subset a dataframe, but the filter() function differs from the others in a key way. Pandas filter() function does not filter a dataframe on its […]
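As a minimal sketch of label-based selection with filter(), assuming a small made-up dataframe (the column names and index labels below are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical example data; not taken from the post.
df = pd.DataFrame(
    {"height_cm": [170, 182, 165], "weight_kg": [65, 80, 58], "age": [30, 41, 25]},
    index=["alice", "bob", "carol"],
)

# Select columns by exact label.
by_name = df.filter(items=["height_cm", "age"])

# Select columns whose label contains a substring.
by_substring = df.filter(like="_", axis=1)

# Select rows whose index label matches a regular expression.
by_regex_rows = df.filter(regex="^[ab]", axis=0)

print(by_name)
print(by_substring)
print(by_regex_rows)
```

In all three calls the selection is driven by labels (column names or index values), not by the values stored in the dataframe.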
How To Delete Rows in Pandas Dataframe
Pandas makes it easy to delete rows of a dataframe. There are multiple ways to delete rows or select rows from a dataframe. In this post, we will see how to use the drop() function to drop rows in Pandas by index name or index location. Pandas drop() function can also be used to drop or delete […]
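A short sketch of dropping rows with drop(), assuming a toy dataframe with hypothetical index labels. Since drop() works on labels, dropping by location is done here by first looking up the labels at the given positions:

```python
import pandas as pd

# Hypothetical example data; not taken from the post.
df = pd.DataFrame({"x": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# Drop rows by index name (label).
by_label = df.drop(index=["a", "c"])

# Drop rows by index location: translate positions 0 and 1 into labels first.
by_position = df.drop(index=df.index[[0, 1]])

print(by_label)
print(by_position)
```

Both calls return a new dataframe and leave df unchanged.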
How to Combine Year, Month, and Day Columns to single date in Pandas
In this post, we will see how to combine columns containing year, month, and day into a single column of datetime type. We can combine multiple columns into a single date column in multiple ways. First, we will see how we can combine year, month, and day columns into a column of type datetime, while […]
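One way to do this, sketched under the assumption that the columns are literally named year, month, and day (the data is made up for illustration), is to pass those columns to pd.to_datetime():

```python
import pandas as pd

# Hypothetical example data; not taken from the post.
df = pd.DataFrame({"year": [2019, 2020], "month": [12, 1], "day": [31, 15]})

# pd.to_datetime() can assemble dates from a dataframe whose columns
# are named "year", "month", and "day".
df["date"] = pd.to_datetime(df[["year", "month", "day"]])

print(df.dtypes)
print(df)
```

If the columns carry other names, they would need to be renamed to year, month, and day before this call.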
How to Convert a Column to Datetime type with Pandas
Pandas in Python has numerous functionalities to deal with time series data. One of the simplest tasks in data analysis is to convert a date variable that is stored as a string or common object type in a Pandas dataframe to a datetime type variable. In this post we will see two ways to convert a […]
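As an illustration, here are two common options for this conversion, pd.to_datetime() and astype(). These are not necessarily the two ways the post goes on to describe, and the column name and values are hypothetical:

```python
import pandas as pd

# Hypothetical example data; dates stored as strings.
df = pd.DataFrame({"date": ["2020-01-15", "2020-02-20", "2020-03-25"]})

# Option 1: pd.to_datetime() with an explicit format string.
converted = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Option 2: cast the column with astype().
cast = df["date"].astype("datetime64[ns]")

print(converted.dtype)
print(cast.dtype)
```

pd.to_datetime() is usually the more flexible choice because it accepts a format string and an errors argument for handling unparseable values.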