The Pandas pct_change() function makes it easy to calculate the percent change between rows or between columns. By default, it compares each row with the row directly before it, but it can also work across columns or over a user-defined period.
A useful way to present the results of pct_change() is to annotate them with colors using the Pandas style method. In this tutorial, we will learn how to add colors to the percent changes between rows computed with pct_change().
First, let us load Pandas.
import pandas as pd
We will create a simple dataframe using three tech companies’ revenue over multiple years.
year = [2017, 2018, 2019, 2020]
facebook = [15934, 22112, 18485, 29146]
google = [12662, 30736, None, 40269]
microsoft = [25489, 16571, 39240, 44281]
Our input data is stored in multiple lists and we can convert the lists into a dataframe using Pandas’ DataFrame() function.
df = pd.DataFrame({"facebook":facebook, "google": google, "microsoft": microsoft}, index=year)
In the example dataframe, the columns are companies and the rows are years.
df
      facebook   google  microsoft
2017     15934  12662.0      25489
2018     22112  30736.0      16571
2019     18485      NaN      39240
2020     29146  40269.0      44281
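If we want to compute the change in revenue over time in terms of percent, we can use Pandas’ pct_change() function on the dataframe. By default, pct_change() computes the percent change for every row by comparing it with the previous row. That is why the first row of the result contains only NaNs.
In our example, we get the percent change in revenue for every year. Note that the 0.000000 for google in 2019 comes from the missing value: in the Pandas version used here, pct_change() forward-fills missing values before computing the change, and newer versions may return NaN there instead.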
df.pct_change()
      facebook    google  microsoft
2017       NaN       NaN        NaN
2018  0.387724  1.427421  -0.349876
2019 -0.164029  0.000000   1.367992
2020  0.576738  0.310157   0.128466
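To make the calculation concrete, here is a minimal sketch that reproduces the facebook column by hand. It assumes only the revenue numbers from the example above; the variable names (auto, manual, two_year) are made up for illustration. It also shows the periods argument, which compares each row with one further back than the previous row.

```python
import pandas as pd

year = [2017, 2018, 2019, 2020]
facebook = pd.Series([15934, 22112, 18485, 29146], index=year, name="facebook")

# Default behaviour: compare each row with the one directly before it.
auto = facebook.pct_change()

# The same calculation written out by hand using shift().
previous = facebook.shift(1)
manual = (facebook - previous) / previous

# periods=2 compares each row with the row two steps earlier,
# so the first two rows have nothing to compare against.
two_year = facebook.pct_change(periods=2)

print(pd.DataFrame({"auto": auto, "manual": manual, "two_year": two_year}))
```

The auto and manual columns should agree, which is a quick way to check what pct_change() is doing on your own data.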
Add Percentage Sign in Pandas
We can add a percentage symbol to the results from pct_change() by using the style method and specifying the format we would like.
df.pct_change().style.format("{:.2%}")
      facebook   google  microsoft
2017      nan%     nan%       nan%
2018    38.77%  142.74%    -34.99%
2019   -16.40%    0.00%    136.80%
2020    57.67%   31.02%     12.85%
Note that the row with NaN values also gets a percentage sign, which does not make sense. We can replace the NaN values by passing the “na_rep” argument to the format() function. Now we get “-” dashes instead of nan with a percentage symbol.
df.pct_change().style.format("{:.2%}", na_rep="-")
      facebook   google  microsoft
2017         -        -          -
2018    38.77%  142.74%    -34.99%
2019   -16.40%    0.00%    136.80%
2020    57.67%   31.02%     12.85%
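The style method is aimed at rendered output such as a Jupyter notebook. If you only need the same formatting as plain strings, for example to print in a terminal, one possible sketch is to map a small formatting function over each column. The helper name as_percent is hypothetical and not part of Pandas.

```python
import pandas as pd

year = [2017, 2018, 2019, 2020]
df = pd.DataFrame(
    {
        "facebook": [15934, 22112, 18485, 29146],
        "google": [12662, 30736, None, 40269],
        "microsoft": [25489, 16571, 39240, 44281],
    },
    index=year,
)


def as_percent(value):
    """Format a fraction as a percentage string, using '-' for missing values."""
    return "-" if pd.isna(value) else f"{value:.2%}"


# Apply the formatter to every cell, one column at a time.
formatted = df.pct_change().apply(lambda column: column.map(as_percent))
print(formatted)
```

The result is a dataframe of strings, so it is suitable for display only; keep the numeric result of pct_change() around for any further calculation.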
Annotate Maximum Values in a Column with colors in Pandas
To highlight the maximum value in each column, we can chain the highlight_max() function after computing the percent change.
(df.
 pct_change().
 style.
 highlight_max().
 format("{:.2%}", na_rep="-"))
Note the difference in the way we chained multiple functions. When you combine multiple operations, writing each one on a separate line, as here, makes the code easier to read and understand.
By default, the highlight_max() function highlights the maximum value in each column in yellow.
We can also specify the highlight color using the color argument of the highlight_max() function.
(df.
 pct_change().
 style.
 highlight_max(color="lightgreen").
 format("{:.2%}", na_rep="-"))
Annotate Maximum and Minimum Values in a Column with colors in Pandas
Let us highlight both the maximum and the minimum value in each column with two different colors using the highlight_max() and highlight_min() functions.
(df.
 pct_change().
 style.
 highlight_max(color="lightgreen").
 highlight_min(color="yellow").
 format("{:.2%}", na_rep="-"))
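If you want to check which cells will be colored without rendering the styled table, a simple sketch is to ask for the row label of each column's maximum and minimum with idxmax() and idxmin(). The names max_rows and min_rows are just illustrative. The google column may differ between Pandas versions because of how the missing 2019 value is handled, so treat its minimum with care.

```python
import pandas as pd

year = [2017, 2018, 2019, 2020]
df = pd.DataFrame(
    {
        "facebook": [15934, 22112, 18485, 29146],
        "google": [12662, 30736, None, 40269],
        "microsoft": [25489, 16571, 39240, 44281],
    },
    index=year,
)

changes = df.pct_change()

# Year with the largest percent change in each column (NaNs are skipped).
max_rows = changes.idxmax()

# Year with the smallest percent change in each column (NaNs are skipped).
min_rows = changes.idxmin()

print(pd.DataFrame({"max_year": max_rows, "min_year": min_rows}))
```

These are the same cells that highlight_max() and highlight_min() pick out when they work column by column.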