When working with a large pandas dataframe with many columns, one often wants to drop one or more columns that are not needed for further analysis. Pandas' drop function lets you remove one or more columns from a dataframe. Let us see some […]
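As a minimal sketch of the idea in this excerpt, the snippet below drops one column and then several columns with `DataFrame.drop`. The dataframe and its column names are made up for illustration and do not come from the original post.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1.0, 2.0, 3.0],
    "notes": ["x", "y", "z"],
    "tmp": [0, 0, 0],
})

# Drop a single column. drop() returns a new dataframe by default,
# so the original df is left unchanged.
df_one = df.drop(columns=["tmp"])

# Drop several columns at once by passing a list of names.
df_many = df.drop(columns=["notes", "tmp"])

print(df_one.columns.tolist())
print(df_many.columns.tolist())
```

Because `drop` does not modify the dataframe in place here, the result has to be assigned to a name to be kept.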
Python Tips
How To Concatenate Arrays in NumPy?
Often you may have two or more NumPy arrays and want to concatenate (join) them into a single array. Python offers multiple options to concatenate NumPy arrays. A common question is, given two 2D arrays, how to concatenate them row-wise or column-wise. NumPy’s concatenate function allows you to concatenate two arrays either by rows or […]
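A short sketch of row-wise and column-wise concatenation with `np.concatenate` might look like the following. The two small arrays are invented for illustration; the `axis` argument selects the direction.

```python
import numpy as np

# Two illustrative 2D arrays of the same shape (2 x 2).
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# axis=0 stacks b below a (row-wise): result has shape (4, 2).
by_rows = np.concatenate((a, b), axis=0)

# axis=1 places b to the right of a (column-wise): result has shape (2, 4).
by_cols = np.concatenate((a, b), axis=1)

print(by_rows.shape)
print(by_cols.shape)
```

The arrays must agree in every dimension except the one being joined, otherwise `np.concatenate` raises a `ValueError`.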
How To Change Column Names and Row Indexes in Pandas?
One of the most common operations while cleaning data or doing exploratory data analysis is fixing column names or row names. In this post, we will see how to rename the columns of a pandas dataframe and how to change the row names or row indexes of a pandas dataframe. […]
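One possible way to do both renames is `DataFrame.rename`, which accepts a mapping for columns and a mapping for the index. This is only a sketch; the dataframe and the old and new labels are hypothetical.

```python
import pandas as pd

# Hypothetical dataframe with placeholder column and row labels.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["r1", "r2"])

# Map old labels to new ones; labels not in the mapping stay as they are.
renamed = df.rename(
    columns={"A": "alpha", "B": "beta"},
    index={"r1": "first", "r2": "second"},
)

print(renamed.columns.tolist())
print(renamed.index.tolist())
```

`rename` returns a new dataframe by default, so `df` keeps its original labels.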
PCA Example in Python with scikit-learn
Principal Component Analysis (PCA) is one of the most useful techniques in exploratory data analysis for understanding the data, reducing its dimensionality, and unsupervised learning in general. Let us quickly see a simple example of doing PCA in Python. Here we will use scikit-learn to do PCA on simulated data. Let […]
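A rough sketch of PCA on simulated data with scikit-learn could look like this. The way the data is simulated here (two latent factors mixed into five noisy features) is an assumption made for illustration and is not taken from the original post.

```python
import numpy as np
from sklearn.decomposition import PCA

# Simulated data (an assumption for this sketch): 100 samples whose
# 5 observed features are noisy linear mixes of 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(100, 5))

# Fit PCA and project the data onto the first two principal components.
pca = PCA(n_components=2)
scores = pca.fit_transform(X)

# Fraction of the total variance captured by each retained component.
explained = pca.explained_variance_ratio_

print(scores.shape)
print(explained)
```

scikit-learn's `PCA` centers the data before the decomposition but does not scale it, so features on very different scales are usually standardized first.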
How to Make Boxplots in Python with Pandas and Seaborn?
The boxplot, introduced by John Tukey in his classic book Exploratory Data Analysis close to 50 years ago, is great for visualizing data distributions from multiple groups. A boxplot captures a summary of the data efficiently with a simple box and whiskers and allows us to compare easily across groups. Boxplots summarize sample data using the 25th, […]