How to Select Columns/Rows by substring match in Pandas

May 24, 2022 by cmdlinetips

In this post, we will learn how to select columns or rows of a Pandas dataframe based on a substring match. We will use the Pandas filter() function with the argument “like” to select the columns or rows whose names partially match a string of interest.

Let us load the necessary modules. We import seaborn in addition to Pandas so that we can use one of its built-in datasets to illustrate column/row selection by substring match.

import seaborn as sns
import pandas as pd

We use the Palmer penguins dataset and load it as a dataframe. For this toy example, we also subset the dataframe using the Pandas sample() function.

# load penguins data from Seaborn's built in datasets
penguins = sns.load_dataset("penguins")

# random sample of 6 rows using Pandas sample() function
df = penguins.sample(6)

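Our toy dataframe looks like this, with 7 columns and a row index. Since sample() picks rows at random, you will get different rows on each run unless you set its random_state argument.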

df

       species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
283     Gentoo  Biscoe            54.3           15.7              231.0       5650.0    Male
198  Chinstrap   Dream            50.1           17.9              190.0       3400.0  Female
25      Adelie  Biscoe            35.3           18.9              187.0       3800.0  Female
329     Gentoo  Biscoe            48.1           15.1              209.0       5500.0    Male
338     Gentoo  Biscoe            47.2           13.7              214.0       4925.0  Female
208  Chinstrap   Dream            45.2           16.6              191.0       3250.0  Female
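If you want to follow along without seaborn (loading its datasets needs a network connection), here is a minimal sketch that types in the same six rows by hand. The values and row labels are copied from the output above; nothing else is assumed.

```python
import pandas as pd

# Hand-typed copy of the six sampled rows shown above; the index holds
# the original row labels from the penguins dataset.
df = pd.DataFrame(
    {
        "species": ["Gentoo", "Chinstrap", "Adelie", "Gentoo", "Gentoo", "Chinstrap"],
        "island": ["Biscoe", "Dream", "Biscoe", "Biscoe", "Biscoe", "Dream"],
        "bill_length_mm": [54.3, 50.1, 35.3, 48.1, 47.2, 45.2],
        "bill_depth_mm": [15.7, 17.9, 18.9, 15.1, 13.7, 16.6],
        "flipper_length_mm": [231.0, 190.0, 187.0, 209.0, 214.0, 191.0],
        "body_mass_g": [5650.0, 3400.0, 3800.0, 5500.0, 4925.0, 3250.0],
        "sex": ["Male", "Female", "Female", "Male", "Female", "Female"],
    },
    index=[283, 198, 25, 329, 338, 208],
)

print(df.shape)
```

With this dataframe in place, the filter() calls in the rest of the post can be run as written.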

To select columns whose names contain a substring, like “len” in the example below, we use the Pandas filter() function with the argument “like”. We pass the substring that we want to match as the value of “like”, and filter() keeps only the columns whose names contain it. In the example below, two columns contain the substring “len”.

df.filter(like="len", axis=1)


     bill_length_mm  flipper_length_mm
283            54.3              231.0
198            50.1              190.0
25             35.3              187.0
329            48.1              209.0
338            47.2              214.0
208            45.2              191.0
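To see what “like” is doing, it can help to compare it with a boolean mask built from the column labels. The sketch below uses a small hand-made dataframe called small (a stand-in, not the sampled df) and also checks that the match is case-sensitive.

```python
import pandas as pd

# Small stand-in frame (a subset of the rows and columns shown above).
small = pd.DataFrame(
    {
        "island": ["Biscoe", "Dream"],
        "bill_length_mm": [54.3, 50.1],
        "flipper_length_mm": [231.0, 190.0],
    },
    index=[283, 198],
)

# Column selection with filter(like=...).
by_filter = small.filter(like="len", axis=1)

# Same selection with a boolean mask over the column labels;
# regex=False makes str.contains do a plain substring test.
mask = small.columns.str.contains("len", regex=False)
by_mask = small.loc[:, mask]

# The match is case-sensitive, so "LEN" matches no column here.
no_match = small.filter(like="LEN", axis=1)

print(list(by_filter.columns))
print(by_filter.equals(by_mask))
print(no_match.shape)
```

Both approaches pick the same columns; filter() is simply the shorter way to write it.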

Here is another example, where only one column name contains the matching substring.

df.filter(like="lan", axis=1)


     island
283  Biscoe
198   Dream
25   Biscoe
329  Biscoe
338  Biscoe
208   Dream
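A plain substring matches anywhere in the name. When you need to anchor the match to the start or end of a name, filter() also accepts a regex argument. Here is a minimal sketch on a hand-made dataframe called small (a stand-in, not the sampled df); the two patterns are only illustrations.

```python
import pandas as pd

# Small stand-in frame with a few of the penguin column names.
small = pd.DataFrame(
    {
        "species": ["Gentoo", "Adelie"],
        "bill_length_mm": [54.3, 35.3],
        "bill_depth_mm": [15.7, 18.9],
        "body_mass_g": [5650.0, 3800.0],
    },
    index=[283, 25],
)

# Columns whose names end with "_mm".
mm_cols = small.filter(regex="_mm$", axis=1)

# Columns whose names start with "bill_".
bill_cols = small.filter(regex="^bill_", axis=1)

print(list(mm_cols.columns))
print(list(bill_cols.columns))
```

Note that filter() takes only one of its items, like, and regex arguments at a time.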

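We can also use the filter() function with the like argument to select rows whose index labels contain a substring. In the example below, we use axis=0 to specify that we are filtering rows, not columns, based on a substring match against the row names. The row labels here are integers, so the substring is matched against their string form.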

df.filter(like="3", axis=0)

    species  island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex
283  Gentoo  Biscoe            54.3           15.7              231.0       5650.0    Male
329  Gentoo  Biscoe            48.1           15.1              209.0       5500.0    Male
338  Gentoo  Biscoe            47.2           13.7              214.0       4925.0  Female
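The row selection above can be checked against an explicit mask on the index. This is a minimal sketch on a hand-made dataframe called small (a stand-in that keeps only the species column), with the same six row labels as the sampled df.

```python
import pandas as pd

# Stand-in frame with the same integer row labels as the sampled rows.
small = pd.DataFrame(
    {"species": ["Gentoo", "Chinstrap", "Adelie", "Gentoo", "Gentoo", "Chinstrap"]},
    index=[283, 198, 25, 329, 338, 208],
)

# Row selection with filter(like=..., axis=0).
by_filter = small.filter(like="3", axis=0)

# Same selection spelled out: convert the integer labels to strings,
# then test each one for the substring "3".
mask = small.index.astype(str).str.contains("3", regex=False)
by_mask = small[mask]

print(list(by_filter.index))
print(by_filter.equals(by_mask))
```

Keep in mind that this matches digits in the label text, so “3” selects 283 as well as 329 and 338. It is not a numeric comparison.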
