6 Free Books to Learn Python for Data Science

Python is one of the most popular and fastest-growing programming languages for data science. If you are interested in learning data science with Python, a number of fantastic books and resources from top data scientists are available online for free.

Here is a list of the best books for learning Python for data science. The current list contains six fantastic books to get you started.

Automate the Boring Stuff with Python

Automate the Boring Stuff with Python is a great book for total beginners learning to program with Python. Although it is an introductory Python book rather than a data science book, its later chapters set the path for data science. It covers common data science tasks such as data munging, pattern matching, web scraping, and text extraction from PDF files.

The free version of the book is available online at https://automatetheboringstuff.com/.
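To give a flavor of the pattern matching mentioned above, here is a minimal sketch using Python's built-in re module. It is not taken from the book; the sample text, the phone number format, and the function name find_phone_numbers are assumptions made up for illustration.

```python
import re

# Hypothetical sample text; the numbers and their format are made up.
TEXT = "Call 415-555-1011 tomorrow, or 415-555-9999 after 5 pm."

# Three digits, a hyphen, three digits, a hyphen, four digits.
PHONE_PATTERN = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def find_phone_numbers(text):
    """Return every substring of text that matches PHONE_PATTERN."""
    return PHONE_PATTERN.findall(text)


print(find_phone_numbers(TEXT))
```

The same idea, one compiled pattern applied to a block of text, carries over to cleaning up messy columns or scraped pages.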

Python Data Science Handbook

Python Data Science Handbook: Essential Tools for Working with Data is one of the top books for learning data manipulation (aka data wrangling) and data visualization with Python. It does not teach the basics of Python, so you need to know a bit of Python programming already. Python Data Science Handbook covers the whole stack of data science tools available in Python, including NumPy, Pandas, Matplotlib, and the machine learning toolkit. If you are serious about using Python for data science, this is a must-have book.

Jake VanderPlas, the author of the book and a well-known data scientist, has made the book available for free. The free version is available as Jupyter notebooks at https://jakevdp.github.io/PythonDataScienceHandbook/.

The print edition of the Python Data Science Handbook is also worth having. You can often find it for half its original price on Amazon.

Machine Learning with Python Cookbook

Machine Learning with Python Cookbook is a fantastic recipe book for data scientists who want to use machine learning tools in Python. It has over 200 recipes for common machine learning challenges, each with sample code. The book covers many topics in using Pandas and scikit-learn, and provides simple code snippets that actually work for the specific challenges you might encounter while doing machine learning with Python.

Machine Learning with Python Cookbook is not freely available. However, the book grew out of chrisalbon.com, the fantastic website of the book's author, Chris Albon. The website's “Technical Notes On Using Data Science & Artificial Intelligence” offers loads of recipes for common machine learning challenges in Python.

Probabilistic Programming and Bayesian Methods for Hackers

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference is a great book for anyone who wants to add Bayesian analysis in Python to their data science toolkit. Traditionally, Bayesian analysis has been taught with a math-first approach, but this book turns that around and teaches Bayesian inference with a computing-first approach. The book showcases PyMC3, a Python library for Bayesian computing, which is a great addition for practicing data scientists.

Bayesian Methods for Hackers is available online for free at http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/. The online version has all the chapters you need to learn Bayesian inference. To sweeten the deal, a print version is now available on Amazon with additional chapters on Bayesian A/B testing and more.

The next two books are from the fantastic “Think” series by Allen Downey.

Think Stats

Think Stats: Exploratory Data Analysis is a fantastic book for anyone interested in learning probability and statistics for data science. Think Stats uses exploratory data analysis (EDA) as an anchor for teaching probability and statistics.

EDA is probably the best way to learn probability and statistics for anyone doing data science, and Think Stats offers you that for free at https://greenteapress.com/wp/think-stats-2e/.

Think Bayes: Bayesian Statistics in Python

Think Bayes is a great free book from the Think series for learning Bayesian statistics with Python. As the book's introduction says:

If you know how to program with Python and also know a little about probability, you’re ready to tackle Bayesian statistics. With this book, you’ll learn how to solve statistical problems with Python code instead of mathematical notation, and use discrete probability distributions instead of continuous mathematics. Once you get the math out of the way, the Bayesian fundamentals will become clearer, and you’ll begin to apply these techniques to real-world problems.

And yes, like the other books in the Think series, Allen Downey has made this book available for free online at http://greenteapress.com/wp/think-bayes/. But if you are already a data scientist looking to add Bayesian computing to your armoury, show some love by getting the print version of Think Bayes: Bayesian Statistics in Python.
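To give a flavor of what solving a statistical problem with Python code and discrete probability distributions can look like, here is a minimal sketch of a single Bayesian update. It is not code from the book. The two-bowl setup, the probabilities, and the function name bayes_update are assumptions chosen for illustration: two bowls are equally likely to be picked, a vanilla cookie is drawn with probability 3/4 from the first bowl and 1/2 from the second, and we ask which bowl a vanilla cookie most likely came from.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Multiply each prior by its likelihood, then normalize to sum to 1."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Assumed setup: both bowls are equally likely before we see any data.
prior = {"bowl_1": Fraction(1, 2), "bowl_2": Fraction(1, 2)}

# Assumed chance of drawing a vanilla cookie from each bowl.
likelihood_vanilla = {"bowl_1": Fraction(3, 4), "bowl_2": Fraction(1, 2)}

posterior = bayes_update(prior, likelihood_vanilla)
for hypothesis, probability in posterior.items():
    print(hypothesis, probability)
```

A dictionary that maps each hypothesis to its probability is all the "distribution" this needs, which is the appeal of the discrete, code-first approach.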