Real datasets are messy and often contain missing data. Python’s pandas can easily handle missing data or NA values in a dataframe. A common task when dealing with missing data is to filter out the rows with missing values, which can be done in a few ways. One might want to filter the pandas dataframe based […]
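As a rough illustration of the kind of filtering described in this excerpt, here is a minimal sketch. The dataframe and its column names (`name`, `score`, `group`) are made up for the example and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataframe with some missing values (illustration only).
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [1.0, np.nan, 3.0, np.nan],
    "group": ["x", "y", None, "z"],
})

# Keep only the rows where one specific column is present.
rows_with_score = df[df["score"].notna()]

# Keep only the rows where that column is missing.
rows_missing_score = df[df["score"].isna()]

# Drop every row that has a missing value in any column.
complete_rows = df.dropna()
```

`notna()` and `isna()` return boolean masks, so they can be combined with other conditions, while `dropna()` is the shorter option when any missing value should disqualify a row.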
Introduction to Sparse Matrices in Python with SciPy
What is a Sparse Matrix? Imagine you have a two-dimensional dataset with 10 rows and 10 columns such that each element contains a value. We can also call such data a matrix; in this example it is a dense 10 x 10 matrix. Now imagine you have a 10 x 10 matrix with only […]
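To make the dense versus sparse contrast concrete, here is a small sketch with SciPy. It assumes the usual definition of a sparse matrix as one whose entries are mostly zero; the three non-zero values and their positions are arbitrary choices for illustration, not taken from the original post.

```python
import numpy as np
from scipy import sparse

# A 10 x 10 dense array that is mostly zeros (values chosen arbitrarily).
dense = np.zeros((10, 10))
dense[0, 1] = 3.0
dense[4, 4] = 7.0
dense[9, 2] = 1.5

# Convert to compressed sparse row (CSR) format, which stores only
# the non-zero entries and their positions.
sp = sparse.csr_matrix(dense)

# Number of stored non-zero entries and the fraction of the matrix they fill.
nnz = sp.nnz
density = nnz / (sp.shape[0] * sp.shape[1])

# Converting back gives the original dense array.
roundtrip = sp.toarray()
```

CSR is only one of several formats in `scipy.sparse`; it is used here because it is a common default for row-wise operations.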
Crash Course on Machine Learning from Google for Free
Want to learn Machine Learning? Who better to learn from than Google? Google is offering a free online crash course on Machine Learning with a focus on using TensorFlow APIs. Is the Google Machine Learning Course for You? Google’s Machine Learning Crash Course is for beginners and is self-contained. The course does not […]
Probability Distributions in Python with SciPy and Seaborn
If you are a beginner in data science, understanding probability distributions will be extremely useful. One of the best ways to understand probability distributions is to simulate random numbers, or generate random variables from a specific probability distribution, and visualize them. 9 Most Commonly Used Probability Distributions There are at least two ways to draw samples […]
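One possible way to simulate random variables from a given distribution is the `rvs` method in `scipy.stats`, sketched below. The choice of distributions, parameters, sample size, and seed is an assumption made for illustration; the excerpt does not say which approaches the original post uses. Plotting is left out so the snippet runs without a display.

```python
import numpy as np
from scipy import stats

# Draw 10,000 samples from a standard normal distribution.
# A fixed seed (arbitrary choice) makes the draw reproducible.
normal_samples = stats.norm.rvs(loc=0, scale=1, size=10_000, random_state=42)

# Draw 10,000 samples from a binomial distribution with 10 trials
# and success probability 0.5.
binom_samples = stats.binom.rvs(n=10, p=0.5, size=10_000, random_state=42)

# Summaries that should sit close to the theoretical means (0 and 5).
normal_mean = normal_samples.mean()
binom_mean = binom_samples.mean()
```

The resulting arrays can be passed to a histogram function from a plotting library such as Seaborn to see the shape of each distribution.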
5 Big Ideas Behind Tidy Evaluation
Ever wondered how easy it is to write dataframe manipulation code without repeating yourself when using dplyr? For example, if you are filtering a dataframe, you simply write instead of writing like this where you need to refer to the dataframe multiple times and use “$” to access variables in the dataframe. The reason why […]