dplyr is one of the R packages developed by Hadley Wickham to manipulate data stored in data frames. A data frame is a two-dimensional data structure in which each column can contain a different type of data, such as numeric values, characters, or factors. In case you wondered about the meaning of the word “dplyr”, it is like “pliers” for […]
How to Compute Executing Time in Python?
It is useful to know whether the code you wrote is efficient. We can test that by checking how long it takes to execute certain commands or functions. Computing Execution Time With the “time” Module: One way to get the execution time is to use the built-in time module and its function time.time. […]
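As a minimal sketch of the time.time approach mentioned in the excerpt above: record a timestamp before and after the code of interest and subtract. The function `sum_of_squares` and its input size are hypothetical, chosen only for illustration.

```python
import time


def sum_of_squares(n):
    """Hypothetical workload: sum the squares of 0..n-1."""
    return sum(i * i for i in range(n))


# Record wall-clock time just before and just after the call.
start = time.time()
result = sum_of_squares(100_000)
elapsed = time.time() - start

print(f"sum_of_squares took {elapsed:.6f} seconds")
```

For very short snippets, the standard library's time.perf_counter or the timeit module usually gives more reliable measurements than time.time.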
Data Visualization with R, A New Online Book
I just recently wrote a post on 13 awesome free books to learn Data Science and R. And that did not last long. It is not just 13 anymore :). It is time to update the list of awesome data science books and resources freely available online. Claus Wilke, a professor from UT Austin, has just announced a […]
5 Examples Using Dict Comprehension in Python
List comprehension is a handy and fast way to create lists in Python in a single line of code. It lets us write an easy-to-read for loop in a single line. In Python, a dictionary is a data structure that stores data such that each element of the stored data is associated with a […]
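To illustrate the idea from the title, here is a small sketch of two dict comprehensions. The list `fruits` and the variable names are made up for this example and do not come from the original post.

```python
# Hypothetical input data, used only for illustration.
fruits = ["apple", "banana", "cherry"]

# Map each name to its length: {key_expr: value_expr for item in iterable}
name_lengths = {name: len(name) for name in fruits}

# A condition can filter items: keep only even numbers and map them to squares.
squares_of_evens = {n: n * n for n in range(6) if n % 2 == 0}

print(name_lengths)
print(squares_of_evens)
```

The syntax mirrors a list comprehension, except that it uses curly braces and a `key: value` pair in place of a single expression.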
How to Load a Massive File as small chunks in Pandas?
The longer you work in data science, the higher the chance that you will have to work with a really big file with thousands or millions of lines. Trying to load all the data into memory at once may not work, as you can end up using all of your RAM and crashing your computer. […]
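One way to read a file in pieces, sketched here under the assumption that the data is a CSV, is the `chunksize` parameter of pandas `read_csv`, which returns an iterator of DataFrames instead of a single one. The tiny in-memory CSV and the column names `id` and `value` are stand-ins for a real large file.

```python
import io

import pandas as pd

# Small in-memory CSV standing in for a massive file on disk (illustrative only).
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

total = 0
rows_seen = 0

# With chunksize set, read_csv yields DataFrames of at most 4 rows each,
# so only one chunk has to be held in memory at a time.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    total += int(chunk["value"].sum())
    rows_seen += len(chunk)

print(f"rows: {rows_seen}, sum of value column: {total}")
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path, and the chunk size would typically be much larger.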