Python 3 Guide for Data Scientists

Guide to Migrate to Python 3

In case you missed it, support for Python 2 ends in 2020. Python 2.7 is the last release in the Python 2 series. So if you are interested in data science and are learning Python, start with Python 3. If you already program in Python 2, it is time to migrate to Python 3.

Alex Rogozhnikov has a really nice guide for migrating to Python 3 smoothly and painlessly. Check out his “A short guide on features of Python 3 for data scientists” (https://github.com/arogozhnikov/python3_with_pleasure).

The three cool features of Python 3 that I use frequently and love are:

1. Merging dictionaries in Python 3

Merging dictionaries can be done easily in Python 3.5 and above. For example, the two dictionaries

x = dict(a=1, b=2)
y = dict(b=3, d=4)

can be easily merged in Python 3.5+ with

# Python 3.5+
z = {**x, **y} 
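
As a quick check of what the merge produces, here is a minimal sketch using the same x and y. Note that both dictionaries contain the key b; when keys collide, the value from the dictionary unpacked last (here y) wins.

```python
x = dict(a=1, b=2)
y = dict(b=3, d=4)

# Later mappings win on duplicate keys, so b takes its value from y.
z = {**x, **y}
print(z)  # {'a': 1, 'b': 3, 'd': 4}

# The originals are left untouched; z is a new dictionary.
print(x)  # {'a': 1, 'b': 2}
print(y)  # {'b': 3, 'd': 4}
```

Swapping the order to {**y, **x} would keep b=2 instead, so put the dictionary whose values should take priority last.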

2. Globbing all files recursively with pathlib

import pathlib
pdf_files = pathlib.Path('/path/').glob('**/*.pdf')

glob() returns a generator, so you can iterate over it to get every PDF file under the path, including those in all nested subdirectories.
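
To see the recursive behaviour in action, here is a small self-contained sketch. It builds a throwaway directory tree with tempfile (the folder and file names are made up for illustration) and then collects the PDF files from every level.

```python
import pathlib
import tempfile

# Build a temporary directory tree so the example runs anywhere.
# All folder and file names below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "reports" / "2019").mkdir(parents=True)
    (root / "a.pdf").touch()
    (root / "reports" / "b.pdf").touch()
    (root / "reports" / "2019" / "c.pdf").touch()
    (root / "reports" / "notes.txt").touch()

    # glob() is lazy; sorted() consumes the generator into a list.
    pdf_names = sorted(p.name for p in root.glob("**/*.pdf"))

print(pdf_names)  # ['a.pdf', 'b.pdf', 'c.pdf']
```

Because the result is a generator, it can only be iterated once; wrap it in list() or sorted() if you need to reuse the matches.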

3. Unpacking is cool

Python 3 lets you unpack any iterable with the *variable syntax: the starred name collects whatever items are left over into a list, so you don’t have to write out every item in the iterable. For example, we can do

>first, second, *rest = range(10)
>first
 0
>second
 1
>rest
[2, 3, 4, 5, 6, 7, 8, 9]

The starred variable can appear in the middle as well as at the end, for example

>first, *middle, last = range(10)
>first
 0
>last
 9
>middle
[1, 2, 3, 4, 5, 6, 7, 8]
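
The same syntax is handy when a row has a fixed first field followed by a variable number of values. Below is a minimal sketch; the record string is a made-up example, not data from any real file.

```python
# Hypothetical record: a label followed by any number of measurements.
record = "sample_1,0.5,0.7,0.9"

# The first field goes to name; everything else is collected into values.
name, *values = record.split(",")
values = [float(v) for v in values]

print(name)    # sample_1
print(values)  # [0.5, 0.7, 0.9]

# The starred name can also come first, collecting everything but the last item.
*head, tail = range(5)
print(head)  # [0, 1, 2, 3]
print(tail)  # 4
```

The starred variable always ends up as a list, even when the source is a range or a tuple, and it is an empty list when nothing is left over.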