Empirical cumulative distribution function (ECDF) in Python

Histograms are a great way to visualize a single variable. One problem with histograms is that you have to choose the bin size, and with a poorly chosen bin size the distribution of your data can look very different. Beyond bin size, histograms may not be a good option for visualizing the distributions of multiple variables…
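To make the idea concrete, here is a minimal sketch of how an ECDF can be computed in plain Python. The helper name `ecdf` and the sample values are made up for illustration and are not taken from the post itself. Each sorted value is paired with the fraction of observations less than or equal to it, so no bin size has to be chosen.

```python
def ecdf(values):
    """Return the sorted values and, for each one, the fraction of
    observations that are less than or equal to it.

    Hypothetical helper for illustration; not the post's own code.
    """
    x = sorted(values)
    n = len(x)
    # The i-th smallest value (1-based) has i of the n observations at or below it.
    y = [(i + 1) / n for i in range(n)]
    return x, y


# Made-up sample data, only to show the shape of the result.
sample = [3.0, 1.0, 2.0, 4.0]
x, y = ecdf(sample)
```

Plotting `y` against `x` as a step or scatter plot gives the ECDF curve, and several variables can be overlaid on the same axes without any binning decision.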

Book Review – Data Visualization: A Practical Introduction

Data Visualization: A Practical Introduction by Duke University professor Kieran Healy is a great introduction to data visualization. If you have not heard of the book before, here is a little backstory. The author, Kieran Healy, developed the book using R Bookdown and made the whole book available online for free. Yes, it is available…

Publication Quality Graphics in #rstats

The visualization guru Edward Tufte tweeted that #rstats alone is not good enough for publication-quality graphics. He claimed, “Publication-quality work requires: R + Adobe Illustrator + reasoning about words on graphics + respect for audience/readers/viewers”. #Rstats coders and users just can’t do words on graphics and typography. Proof:…

How to Make Boxplot in R with ggplot2?

One of the many strengths of R is the tidyverse and the ability to make great-looking plots easily. The boxplot, or box-and-whisker plot, introduced by John Tukey, is great for visualizing data from multiple groups or distributions. A boxplot lets you display the data together with an efficient summary of the data using min,…

Data Visualization with R, A New Online Book

I recently wrote a post on 13 awesome free books for learning data science and R, and that number did not last long. It is not just 13 anymore :). It is time to update the list of awesome data science books and resources freely available online. Claus Wilke, a professor at UT Austin, has just announced a…