Do you love working with Python, but just can’t get enough of ggplot, R Markdown, or other tidyverse packages? You are not alone; many people love both R and Python and use them all the time. Now RStudio has made the reticulate package, which offers an awesome set of tools for interoperability between Python and R. One […]
How to Split Text in a Column in Data Frame in R?
Very often you may have to manipulate a column of text in a data frame with R. You may want to separate a column into multiple columns, or you may want to split a column of text and keep only a part of it. tidyr’s separate function is the best […]
How To Change Column Names and Row Indexes in Pandas?
One of the most common operations while cleaning data or doing exploratory data analysis is manipulating or fixing column names or row names. In this post, we will see: How to rename columns of a pandas dataframe? How to change row names or row indexes of a pandas dataframe? […]
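As a quick taste of the idea, here is a minimal sketch of renaming columns and row labels with pandas' `rename` method. The data frame, its column names, and its row labels are made up for illustration and are not taken from the post itself.

```python
import pandas as pd

# Hypothetical toy data frame; names are assumptions for illustration only
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["r1", "r2"])

# rename() takes old-name -> new-name mappings and returns a new data frame
renamed = df.rename(
    columns={"A": "alpha", "B": "beta"},      # column names
    index={"r1": "first", "r2": "second"},    # row labels
)

print(renamed)
```

Because `rename` returns a new object by default, the original `df` keeps its old labels unless you assign the result back.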
PCA Example in Python with scikit-learn
Principal Component Analysis (PCA) is one of the most useful techniques in Exploratory Data Analysis for understanding the data, reducing its dimensions, and for unsupervised learning in general. Let us quickly see a simple example of doing PCA in Python. Here we will use scikit-learn to do PCA on simulated data. Let […]
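For a rough idea of what this looks like, here is a minimal sketch of PCA with scikit-learn on simulated data. The way the data is simulated (two hidden factors mixed into five noisy features), the sample size, and the random seed are all assumptions made for illustration, not the setup used in the post.

```python
import numpy as np
from sklearn.decomposition import PCA

# Simulate data (assumed setup): 200 samples, 5 features driven by 2 hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# Fit PCA and project the samples onto the first two principal components
pca = PCA(n_components=2)
scores = pca.fit_transform(X)

# Fraction of the total variance captured by each component
explained = pca.explained_variance_ratio_
print(scores.shape, explained)
```

Since the simulated features come from only two underlying factors plus a little noise, the first two components should capture nearly all of the variance.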
How To Plot Ridgeline Plots in R?
Ridgeline plots are a great way to visualize changes in multiple distributions/histograms over either time or space. They were initially called joyplots, for a brief time. The ggridges package from UT Austin professor Claus Wilke lets you make ridgeline plots in combination with ggplot. Here is how Claus describes the ridgeline plot with a brief […]