In this post, we will learn about the differences between Numpy’s permutation() function and shuffle() function with examples.
Numpy offers a variety of functions to randomize or create random data. Numpy’s permutation() and shuffle() functions are two key functions that randomize the order of an existing 1-D or 2-D array.
First, we will see how to use the permutation and shuffle functions in Numpy and understand what they do. Then we will look at the key difference between them: shuffle() shuffles the array in place, while permutation() returns a new array and leaves the original untouched.
Let us load numpy.
import numpy as np
We will use a 2-D array of dimension 3×4 to illustrate the use of, and the differences between, Numpy’s permutation() and shuffle() functions.
Here we create a 2-D array holding the numbers from 0 to 11 using Numpy’s np.arange() function and the array’s reshape() method.
x = np.arange(12).reshape(3, 4)
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
We will call the permutation and shuffle functions through Numpy’s random Generator class. So, let us first create the generator object using the random module’s default_rng() function with a seed.
rng = np.random.default_rng(42)
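The seed is what makes the results repeatable. As a minimal sketch (the names rng_a, rng_b, first_a and first_b are made up for this illustration), two generators created with the same seed give the same row order on their first call:

```python
import numpy as np

# Two independent generators created with the same seed (42, as in this post)
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

x = np.arange(12).reshape(3, 4)

# Both generators start from the same state, so their first permutation matches
first_a = rng_a.permutation(x, axis=0)
first_b = rng_b.permutation(x, axis=0)
print(np.array_equal(first_a, first_b))  # True
```

Calling the same generator a second time continues its random stream, so repeated calls are free to give different orders.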
Numpy’s permutation() function
Now let us go ahead and use the permutation() function on our 2-D array. We call permutation() with the argument axis=0 (the default), which rearranges the rows of the array as shown below.
rng.permutation(x, axis=0)
array([[ 8,  9, 10, 11],
       [ 4,  5,  6,  7],
       [ 0,  1,  2,  3]])
Taking a closer look, we can see that the first row of the original matrix is now the third row, with its elements still in their original order. Likewise, the third row of the original matrix is now the first row. In other words, permutation() moves whole rows around “in bulk”; it does not shuffle the elements within a row.
To understand how permutation() works, we apply the function to our input matrix once more. In this second example, the first row stays where it was in the original matrix, while the second and third rows swap places.
rng.permutation(x, axis=0)
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11],
       [ 4,  5,  6,  7]])
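The two properties seen so far can also be checked in code rather than by eye. The following is a small sketch (the variable names original, permuted, rows_before and rows_after are made up for this illustration) confirming that permutation() returns a separate array and keeps every row whole:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.arange(12).reshape(3, 4)
original = x.copy()  # snapshot of x to compare against later

permuted = rng.permutation(x, axis=0)

# x itself is unchanged: permutation() worked on a copy
print(np.array_equal(x, original))    # True
print(np.shares_memory(permuted, x))  # False

# The result holds exactly the original rows, only in a different order
rows_before = sorted(x.tolist())
rows_after = sorted(permuted.tolist())
print(rows_before == rows_after)      # True
```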
Numpy’s shuffle() function
Let us use the same 3×4 matrix (2-D array) as input to the shuffle() function as well. Our input matrix x looks like this.
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Numpy’s shuffle() function can also take the axis we want to shuffle along. Here we shuffle x by rows, as before, with the axis=0 argument.
rng.shuffle(x, axis=0)
A big thing to notice is that Numpy’s shuffle() returns nothing to print. This is because shuffle() shuffles the rows in place and returns None. The original x therefore now holds the shuffled matrix. Since no copy is made, this may be more efficient when we deal with large matrices.
By printing x we can see that the original row order is gone. We also learn that shuffle() behaves the same way as permutation() in that it moves whole rows “in bulk”.
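A practical consequence of the in-place behaviour described above is that the return value of shuffle() should not be assigned back to the array. A minimal sketch, with result as a made-up name for this illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.arange(12).reshape(3, 4)

# shuffle() reorders x itself and returns None
result = rng.shuffle(x, axis=0)
print(result is None)  # True

# x still holds the same three rows, possibly in a new order
print(sorted(x.tolist()) == np.arange(12).reshape(3, 4).tolist())  # True
```

Writing x = rng.shuffle(x) would therefore replace the array with None.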
x
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11],
       [ 4,  5,  6,  7]])
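All the examples above use axis=0. As a hedged sketch of the other direction, passing axis=1 moves whole columns instead of rows; the name by_columns is made up for this illustration, and no printed array is shown because the exact order depends on the generator state:

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.arange(12).reshape(3, 4)

# axis=1 rearranges columns; each column travels as a unit
by_columns = rng.permutation(x, axis=1)

# The set of columns is unchanged, only their order may differ
print(sorted(by_columns.T.tolist()) == sorted(x.T.tolist()))  # True

# Each row still contains the same values, so the row sums are preserved
print(np.array_equal(by_columns.sum(axis=1), x.sum(axis=1)))  # True
```

The same axis=1 argument can be passed to rng.shuffle(), which then rearranges the columns of x in place.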