How To Randomly Select Rows in Pandas?: Pandas Tutorial

Pandas’ sample function lets you randomly sample rows from a pandas data frame. Here are three ways of using Pandas’ sample to randomly select rows. Let us first load the data. How to get a random subset of data: To randomly select rows from a pandas data frame, we can use the sample function from pandas. For example, […]
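
As a rough illustration of what sample can do, here is a minimal sketch on a small made-up data frame. The data and column names are invented for this example, and the three variants shown (a fixed number of rows, a fraction of rows, and sampling with replacement) are just common uses of sample, not necessarily the three the post walks through.

```python
import pandas as pd

# Hypothetical toy data; the original post loads its own data set.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "pop": [10, 20, 30, 40, 50, 60],
})

# 1. Pick a fixed number of rows at random.
three_rows = df.sample(n=3, random_state=42)

# 2. Pick a fraction of the rows at random (half of 6 rows is 3 rows).
half_rows = df.sample(frac=0.5, random_state=42)

# 3. Sample with replacement, so more rows than the frame holds can be drawn.
bootstrap_rows = df.sample(n=10, replace=True, random_state=42)

print(three_rows)
print(len(half_rows), len(bootstrap_rows))
```

Setting random_state makes the selection reproducible; leave it out to get a different sample on every call.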

Python 3 Guide for Data Scientists

In case you missed it, there won’t be any support for Python 2 by 2020. Python 2.7 is the last release in the Python 2 series. So if you are interested in data science and learning Python, start with Python 3. If you already program in Python 2, it is time to migrate to Python 3. Alex Rogozhnikov, […]

How to Get Frequency Counts of a Column in Pandas Dataframe: Pandas Tutorial

Often while working with a pandas data frame, you might have a column of categorical variables (strings/characters) and want to find the frequency count of each unique element in the column. Pandas’ value_counts() lets you get the frequency counts easily. Let us get started with an example from a real-world data set. Load gapminder […]
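
A minimal sketch of value_counts() is shown here on a tiny made-up column instead of the gapminder data the post uses. The column name and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy column standing in for a real categorical variable.
df = pd.DataFrame({
    "continent": ["Asia", "Europe", "Asia", "Africa", "Asia", "Europe"],
})

# value_counts() returns a Series indexed by the unique values,
# sorted from most to least frequent by default.
counts = df["continent"].value_counts()

print(counts)
```

Passing normalize=True to value_counts() gives relative frequencies instead of raw counts.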

Installing Python 3 from Python 2 with Anaconda

If you have already installed Anaconda for Python 2.7 and have finally decided to take the plunge into Python 3, congrats. You don’t have to start fresh. You can easily upgrade to Python 3 using the Anaconda package manager by creating a new environment for Python 3. Note that this virtual environment is completely […]

How to Install Packages from the Jupyter Notebook?

Python package managers, like Anaconda and pip, have made working with Python on different operating systems much simpler. However, if you work long enough, you are likely to encounter weird installation problems. One such problem is that even if you have installed a package, you won’t be able to import it in the Jupyter […]

How to Get Unique Values from a Column in Pandas Data Frame?

Often while working with a big data frame in pandas, you might have a column with strings/characters and want to find the number of unique elements present in the column. The pandas library in Python lets you find the unique values easily. Let us get started with some examples from a real-world data set. […]
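
The sketch below shows one way to get both the unique values and their number, using a small made-up column rather than a real-world data set. The column name and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy column; a real data set would be loaded from a file.
df = pd.DataFrame({
    "continent": ["Asia", "Europe", "Asia", "Africa", "Asia", "Europe"],
})

# unique() returns the distinct values in order of first appearance.
unique_values = df["continent"].unique()

# nunique() returns how many distinct values there are.
n_unique = df["continent"].nunique()

print(unique_values, n_unique)
```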

How To Add a New Column Using a Dictionary in Pandas Data Frame?: Pandas Tutorial

The pandas library in Python has a really cool function called map that lets you manipulate your pandas data frame much more easily. Pandas’ map function lets you add a new column with values from a dictionary if the data frame has a column matching the keys in the dictionary. Adding a New Column Using keys from […]
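
Here is a minimal sketch of the idea, assuming a made-up data frame with a state column and a hypothetical dictionary keyed by the same codes. Neither comes from the original post.

```python
import pandas as pd

# Hypothetical data frame whose "state" column matches the dictionary keys.
df = pd.DataFrame({"state": ["CA", "NY", "TX", "CA"]})

# Hypothetical lookup dictionary; "TX" is left out on purpose.
state_names = {"CA": "California", "NY": "New York"}

# Series.map looks up each value in the dictionary and returns the match.
# Values with no matching key become NaN.
df["state_name"] = df["state"].map(state_names)

print(df)
```

Rows whose value is missing from the dictionary end up as NaN in the new column, so it is worth checking for missing values after mapping.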

How to Compute Execution Time in Python?

It is really good to know whether the code you wrote is efficient or fast. We can test that by checking how long it takes to execute certain commands or functions. Computing Execution Time With “time” Module: One way to get the execution time is to use the built-in time module and its function time.time. […]
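
The time.time approach can be sketched as follows. The function being timed, slow_sum, is a hypothetical stand-in for whatever code you want to measure.

```python
import time


def slow_sum(n):
    """Hypothetical workload: add up the integers below n in a plain loop."""
    total = 0
    for i in range(n):
        total += i
    return total


# Record the wall-clock time before and after the call.
start = time.time()
result = slow_sum(100_000)
elapsed = time.time() - start

print(f"result={result}, elapsed={elapsed:.6f} seconds")
```

For very short snippets the difference can be close to zero; time.perf_counter from the same module offers a higher-resolution clock for that case.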

5 Examples Using Dict Comprehension in Python

List comprehension is a really handy and often faster way to write for loops in Python in just a single line of code. The idea of comprehension is not unique to lists in Python. Dictionaries, one of the commonly used data structures in data science, also support comprehension. This is called dict comprehension or […]
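
A few small dict comprehensions, written here as an illustrative sketch. The variable names and data are made up and are not the five examples from the post.

```python
words = ["pandas", "numpy", "scipy"]

# Build a dictionary from a list: word -> length of the word.
lengths = {word: len(word) for word in words}

# Add a condition: keep only even numbers, mapped to their squares.
even_squares = {n: n * n for n in range(6) if n % 2 == 0}

# Swap keys and values of an existing dictionary.
codes = {"a": 1, "b": 2}
inverted = {value: key for key, value in codes.items()}

print(lengths, even_squares, inverted)
```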

How to Load a Massive File as Small Chunks in Pandas?

The longer you work in data science, the higher the chance that you will have to work with a really big file with thousands or millions of lines. Trying to load all the data into memory at once may not work, as you can end up using all of your RAM and crashing your computer. […]
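
One way to read a file in pieces is the chunksize argument of pandas’ read_csv, sketched below. To keep the example self-contained, a tiny in-memory CSV built with io.StringIO stands in for the massive file; with a real file you would pass its path instead. The column names and chunk size are assumptions for illustration.

```python
import io

import pandas as pd

# Stand-in for a large file: 10 data rows with columns "id" and "value".
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

total = 0
n_chunks = 0

# With chunksize set, read_csv yields data frames of at most 4 rows each
# instead of loading the whole file at once.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    total += chunk["value"].sum()
    n_chunks += 1

print(f"chunks={n_chunks}, total={total}")
```

Only one chunk is held in memory at a time, so the running total can be computed without ever loading the full file.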