Pandas in Python offers many tools for working with time series data. One of the simplest tasks in data analysis is to convert a date variable that is stored as a string or a generic object type in a Pandas dataframe to a datetime type variable. In this post we will see two ways to convert a […]
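As a minimal sketch of the kind of conversion described above: the excerpt does not say which two ways the post covers, so this only shows one common option, `pd.to_datetime()`. The dataframe, its column names, and the date format are made up for illustration.

```python
import pandas as pd

# Hypothetical example data: dates stored as plain strings
df = pd.DataFrame({
    "date": ["2019-01-15", "2019-02-20", "2019-03-05"],
    "value": [1, 2, 3],
})

# Convert the string column to a datetime column;
# the explicit format is an assumption about how the strings look
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

# Datetime accessors such as .dt.month now work on the column
months = df["date"].dt.month.tolist()
print(df.dtypes)
print(months)
```

Once the column is a datetime type, it can be used for date arithmetic, resampling, and the `.dt` accessors.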
Introduction to Kernel PCA with Python
Principal Component Analysis is one of the bread-and-butter dimensionality reduction methods for unsupervised learning. One of the assumptions of PCA is that the data is linearly separable. Kernel PCA is a variant of PCA that can handle non-linear data and make it linearly separable. If you wonder what linearly separable means, Python Machine […]
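To make the idea concrete, here is a small sketch using scikit-learn's `KernelPCA` on a synthetic two-circles dataset. The dataset, the RBF kernel, and the `gamma` value are illustrative assumptions, not necessarily what the post itself uses.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: a classic dataset that no straight line can separate
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Kernel PCA with an RBF kernel; gamma=10 is an assumed, untuned value
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10)
X_kpca = kpca.fit_transform(X)

print(X_kpca.shape)
```

Plotting the columns of `X_kpca` colored by `y` is a quick way to check whether the two circles have been pulled apart in the transformed space.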
4 Tidyverse Tips for Future Self: case_when(), fct_relevel(), fct_recode(), scale_fill_brewer()
Here are four tidyverse tips for my future self. These four tips/functions from the tidyverse suite are a few of the really simple things that I need often, but I always have to google them and often struggle to come up with the search phrase. The first tip is the very simple and extremely useful function case_when() from the dplyr package. […]
Linear Regression Analysis with statsmodels in Python
Linear Regression is one of the most useful statistical/machine learning techniques. We have multiple ways to perform Linear Regression analysis in Python, including scikit-learn’s linear regression functions and Python’s statsmodels package. statsmodels is a Python module for all things related to statistical analysis, and it provides classes and functions for the estimation of many […]
Introduction to Data Cleaning with Pyjanitor
Data cleaning is one of the most common and important tasks in any data analysis. In a typical data analysis setting, we might get our dataset from an Excel/CSV/TSV file and perform a series of operations to make the data cleaner. For example, we would start with cleaning the names of variables to make them consistent, […]