Pandas Melt: Reshape Wide Data to Long/Tidy Data

Pandas offers multiple ways to reshape data from wide form into tidy (long) form. The melt() function is one of the most useful tools for this kind of reshaping. In this post, we will see examples of basic use of Pandas melt() to reshape wide data containing only numerical variables into long data.

Let us load Pandas and NumPy. Let us also import poisson from scipy.stats.

import numpy as np
import pandas as pd
from scipy.stats import poisson

We will use scipy.stats to draw random numbers from Poisson distributions. We will create three variables, each with a different mean, and use them to make a Pandas dataframe with wide data.

np.random.seed(seed=141)
c1 = poisson.rvs(mu=10, size=3)
c2 = poisson.rvs(mu=15, size=3)
c3 = poisson.rvs(mu=20, size=3)
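As a side note, similar data can be drawn without setting NumPy's global seed. The sketch below is an alternative, not the approach used in this post: it assumes NumPy's Generator API (np.random.default_rng) and its poisson method, and the names rng, a1, a2 and a3 are made up for illustration. The numbers it draws will differ from the ones shown in this post.

```python
import numpy as np

# A local random generator; 141 is reused here only to mirror the seed above
rng = np.random.default_rng(141)

# Three samples of size 3 from Poisson distributions with means 10, 15 and 20
a1 = rng.poisson(lam=10, size=3)
a2 = rng.poisson(lam=15, size=3)
a3 = rng.poisson(lam=20, size=3)

print(a1, a2, a3)
```

Because the generator is a local object, other code that relies on NumPy's global random state is not affected by it.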

We use the three arrays of random numbers to create a Pandas dataframe in wide form.

df = pd.DataFrame({"C1": c1,
                   "C2": c2,
                   "C3": c3})

Our data is now ready in wide form. Each column corresponds to one group or variable.

df

	C1	C2	C3
0	15	19	22
1	12	15	13
2	14	16	24

Pandas Melt: A Simple Example

The simplest way to reshape wide data with all numerical columns to long/tidy form is to call the melt() function on the dataframe without any arguments.

In the example below, we get the reshaped data in long form with two columns. The first column is called “variable” by default and contains the column (variable) names from the wide dataframe. The second column is named “value” and contains the corresponding data values.

df.melt()

	variable	value
0	C1	15
1	C1	12
2	C1	14
3	C2	19
4	C2	15
5	C2	16
6	C3	22
7	C3	13
8	C3	24
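To make the reshaping concrete, here is a minimal sketch that checks the shape and contents of the melted result. It rebuilds the wide dataframe from the values printed above rather than from the random draws, so it does not depend on the seed; the name long_df is made up for illustration.

```python
import pandas as pd

# Wide dataframe rebuilt from the values shown in the table above
df = pd.DataFrame({"C1": [15, 12, 14],
                   "C2": [19, 15, 16],
                   "C3": [22, 13, 24]})

long_df = df.melt()

# 3 rows x 3 columns in wide form become 9 rows x 2 columns in long form
print(df.shape)       # (3, 3)
print(long_df.shape)  # (9, 2)

# The rows labelled "C2" hold exactly the values of the original C2 column
print(long_df.loc[long_df["variable"] == "C2", "value"].tolist())  # [19, 15, 16]
```

The columns are stacked one after another in their original order, which is why the long form lists all C1 rows first, then C2, then C3.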

Pandas Melt: Change Column Names

Let us see how to change the column names of the tidy dataframe we get. To rename the column that holds the variable names, we pass the name we want to the “var_name” argument.

df.melt(var_name="Sample").head()

	Sample	value
0	C1	15
1	C1	12
2	C1	14
3	C2	19
4	C2	15

Similarly, to rename the column that holds the values, we pass the name we want to the “value_name” argument.

df.melt(var_name="Sample", value_name="Count").head()

	Sample	Count
0	C1	15
1	C1	12
2	C1	14
3	C2	19
4	C2	15
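As a quick check on the two arguments, the sketch below compares naming the columns inside melt() with melting first and renaming afterwards. It rebuilds the dataframe from the values printed earlier, and the names tidy and renamed are made up for illustration. The rename() route is shown only as a comparison, not as the recommended way.

```python
import pandas as pd

# Wide dataframe rebuilt from the values shown earlier in the post
df = pd.DataFrame({"C1": [15, 12, 14],
                   "C2": [19, 15, 16],
                   "C3": [22, 13, 24]})

# Name both columns directly in melt()
tidy = df.melt(var_name="Sample", value_name="Count")
print(list(tidy.columns))  # ['Sample', 'Count']

# Melt with the default names, then rename them afterwards
renamed = df.melt().rename(columns={"variable": "Sample", "value": "Count"})
print(tidy.equals(renamed))
```

Passing var_name and value_name to melt() saves the extra rename step and keeps the reshaping in a single call.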

This post is part of the series on Byte Size Pandas: Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.