Python and R Tips

Learn Data Science with Python and R

How To Add Identifier Column When Concatenating Pandas data frames?

May 22, 2020 by cmdlinetips

The Pandas concat() function is great for concatenating two dataframes or appending one dataframe to another with the same columns. Sometimes you want to keep an identifier that records which dataframe each row came from. In this post, we will see an example of how to concatenate two dataframes with an identifier.

Let us import Pandas and numpy to create some data and two dataframes.

import pandas as pd
import numpy as np

Let us create two dataframes from scratch. We use NumPy’s random module to generate some data and assign row names and column names.

df1 = pd.DataFrame(np.random.randint(20, size=(2, 3)),
                   index=list('ij'),
                   columns=list('ABC'))
df1

    A   B   C
i   2   9   4
j  18  13  18
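Because the data comes from a random generator, the numbers will differ on every run and will not match the output shown in this post. One way to make the example repeatable is to seed the generator. The sketch below swaps np.random.randint for NumPy's default_rng() generator; the seed value and the make_frame helper are assumptions made for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

SEED = 42  # arbitrary seed, chosen only for illustration


def make_frame(rng, index):
    """Hypothetical helper: a 2x3 frame of random integers in [0, 20)."""
    return pd.DataFrame(rng.integers(20, size=(2, 3)),
                        index=list(index),
                        columns=list("ABC"))


# Seeding the generator makes both frames identical across runs.
rng = np.random.default_rng(SEED)
df1 = make_frame(rng, "ij")
df2 = make_frame(rng, "mn")
```

The values will still differ from the ones printed in this post, but they will stay the same each time the script is run.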

Here we create our second dataframe using Pandas DataFrame() function.

df2 = pd.DataFrame(np.random.randint(20, size=(2, 3)),
                   index=list('mn'),
                   columns=list('ABC'))
df2

    A   B   C
m   5  16   9
n  11   0   9

We can stack the two dataframes row-wise with Pandas' concat() function. To add an identifier for each dataframe, we pass the identifiers as a list to the keys argument of concat(), one identifier per dataframe and in the same order as the dataframes.

pd.concat([df1, df2], keys=['t1', 't2'])

This creates a new multi-indexed Pandas dataframe containing the two dataframes concatenated. The outer level of the row index is the identifier we added, and the inner level is the row index from the input dataframes.

       A   B   C
t1 i   2   9   4
   j  18  13  18
t2 m   5  16   9
   n  11   0   9
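As a quick check of what the keys argument produces, the sketch below rebuilds the two dataframes with the fixed values shown in this post (rather than random data, so the result is repeatable) and uses the identifier to pull one of the original dataframes back out. The variable name stacked is an assumption made for illustration.

```python
import pandas as pd

# Fixed values copied from the example output above.
df1 = pd.DataFrame([[2, 9, 4], [18, 13, 18]],
                   index=list("ij"), columns=list("ABC"))
df2 = pd.DataFrame([[5, 16, 9], [11, 0, 9]],
                   index=list("mn"), columns=list("ABC"))

stacked = pd.concat([df1, df2], keys=["t1", "t2"])

# The row index now has two levels: the identifier and the original row label.
print(stacked.index.nlevels)

# Selecting on the outer level returns the rows that came from df1.
print(stacked.loc["t1"])
```

Selecting with .loc on the outer level is often enough when the identifier is only needed for lookups and a separate column is not required.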

We can use Pandas' reset_index() function to convert the multi-indexed dataframe to a regular dataframe.

pd.concat([df1, df2], keys=['t1', 't2']).reset_index()

Because the index levels are unnamed, Pandas’ reset_index() automatically names the new columns level_0 and level_1 and replaces the row index with default integers.

  level_0 level_1   A   B   C
0      t1       i   2   9   4
1      t1       j  18  13  18
2      t2       m   5  16   9
3      t2       n  11   0   9
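The default level_0 and level_1 column names are not very descriptive. One option, sketched below, is to pass concat()'s names argument so that the index levels, and therefore the columns created by reset_index(), get meaningful names. The names "source" and "row" are assumptions chosen for illustration, and the dataframes reuse the fixed values from the output above.

```python
import pandas as pd

df1 = pd.DataFrame([[2, 9, 4], [18, 13, 18]],
                   index=list("ij"), columns=list("ABC"))
df2 = pd.DataFrame([[5, 16, 9], [11, 0, 9]],
                   index=list("mn"), columns=list("ABC"))

# names= labels the two index levels created by keys=.
stacked = pd.concat([df1, df2], keys=["t1", "t2"], names=["source", "row"])

# Move both levels into columns called "source" and "row".
flat = stacked.reset_index()
print(flat)

# Or move only the identifier into a column and keep the original row labels.
with_id = stacked.reset_index(level="source")
print(with_id)
```

The second variant is handy when the original row labels are meaningful and should stay as the index.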

This post is part of the series on Pandas 101, a tutorial covering tips and tricks on using Pandas for data munging and analysis.
