Often you may deal with large matrices that are sparse, with only a few non-zero elements. In such scenarios, keeping the data in a full dense matrix and working with it is not efficient. A better way to deal with such sparse matrices is to use special data structures that allow you to store the sparse data […]
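As a rough illustration of the idea in this excerpt, here is a minimal sketch that assumes SciPy's `scipy.sparse` module and its CSR format as the "special data structure". The excerpt itself does not name a library, and the matrix size and values below are made up for the example.

```python
import numpy as np
from scipy import sparse

# Hypothetical data: a 1000 x 1000 matrix with only three non-zero entries.
dense = np.zeros((1000, 1000))
dense[0, 1] = 3.0
dense[10, 500] = 7.0
dense[999, 999] = 1.5

# CSR (compressed sparse row) keeps only the non-zero values plus index arrays.
sparse_matrix = sparse.csr_matrix(dense)

# Compare the memory used by the dense array with the three CSR arrays.
dense_bytes = dense.nbytes
sparse_bytes = (
    sparse_matrix.data.nbytes
    + sparse_matrix.indices.nbytes
    + sparse_matrix.indptr.nbytes
)

print("non-zero elements:", sparse_matrix.nnz)
print("dense bytes:", dense_bytes, "sparse bytes:", sparse_bytes)
```

The sparse version stores the three values and their positions, so its footprint grows with the number of non-zero elements rather than with the full matrix size.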
Book Review: Fundamentals of Data Visualization
Finally got a chance to write down some quick thoughts on Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures by Claus Wilke. ICYMI, Fundamentals of Data Visualization is a fantastic book on data visualization that was developed in the open and is freely available, and the physical book has recently become available for purchase. I have […]
Singular Value Decomposition (SVD) in Python
Matrix decomposition by Singular Value Decomposition (SVD) is a widely used method for dimensionality reduction. For example, Principal Component Analysis often uses SVD under the hood to compute principal components. In this post, we will work through an example of doing SVD in Python. We will use gapminder data in wide form to […]
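For a quick feel of what SVD looks like in Python, here is a minimal sketch using NumPy's `np.linalg.svd`. It runs on a small random matrix, which is an assumption made for illustration and not the gapminder data the post works with.

```python
import numpy as np

# Hypothetical stand-in data: 6 observations, 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# Center each column, as is usual before an SVD-based PCA.
X_centered = X - X.mean(axis=0)

# Thin SVD: X_centered = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Multiplying the three factors back together recovers the centered matrix.
X_rebuilt = U @ np.diag(s) @ Vt

# Keeping only the first k singular values gives a rank-k approximation.
k = 2
X_rank_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print("singular values:", s)
print("exact reconstruction:", np.allclose(X_rebuilt, X_centered))
```

The singular values come back sorted from largest to smallest, so truncating to the first `k` keeps the directions that capture the most variation.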
How To Do PCA in tidyverse Framework?
In an earlier post, we saw a tutorial on how to do PCA in R using the gapminder data set. Another interesting way of doing PCA is to follow the tidyverse framework. In this post, we will see an example of doing PCA using gapminder data in a tidy framework. Being the first attempt to […]
How To Create a Column Using Condition on Another Column in Pandas?
Often while cleaning data, one might want to create a new variable or column based on the values of another column using conditions. In this post we will see two different ways to create a column based on the values of another column using conditional statements. First we will use NumPy’s little-known function where to […]
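The first approach mentioned in this excerpt, NumPy's `where`, can be sketched as follows. The data frame, the column names, and the threshold of 60 are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame; column names and values are made up.
df = pd.DataFrame(
    {"country": ["A", "B", "C"], "lifeExp": [45.0, 72.5, 60.0]}
)

# np.where(condition, value_if_true, value_if_false) works element-wise,
# so the new column gets one label per row.
df["lifeExp_group"] = np.where(df["lifeExp"] >= 60, "high", "low")

print(df)
```

Because the condition is evaluated on the whole column at once, there is no need for an explicit loop over rows.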



