
Python and R Tips

Learn Data Science with Python and R


String Manipulations in Pandas

November 4, 2018 by cmdlinetips

Python is known for its ability to manipulate strings. Pandas extends this ability to data frames by offering a suite of the most common string operations, which are vectorized and are great for cleaning real-world datasets.

Let us see some simple examples of string manipulations in Pandas.

# let us import pandas
import pandas as pd

Let us use the gapminder data from the Software Carpentry website and load it as a Pandas data frame. Let us then filter the data with Pandas' boolean indexing to make the data frame smaller and more compact.

gapminder_url='https://bit.ly/2cLzoxH'
gapminder = pd.read_csv(gapminder_url)
gapminder.head()
gapminder_ocean = gapminder[(gapminder.year > 2000) & (gapminder.continent == 'Oceania')]
gapminder_ocean.shape
gapminder_ocean

The resulting data frame gapminder_ocean contains data from just Australia and New Zealand.

How to Find Elements Starting with a Specific Letter?

We can chain attribute and method calls with the “.” operator to combine multiple steps. Pandas’ str.startswith() finds the elements that start with the pattern we specify. For example, to check whether any country in the data frame starts with the letter “T”, we use

gapminder_ocean.country.str.startswith('T')

This returns a boolean Series, with True or False for each element depending on whether it starts with “T”.

70      False
71      False
1102    False
1103    False
Name: country, dtype: bool
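
The boolean Series is most useful as a mask for selecting rows. Here is a minimal sketch, assuming a small hand-built data frame (the name df and the year column are stand-ins for gapminder_ocean so that the example runs without a download).

```python
import pandas as pd

# Hand-built stand-in for gapminder_ocean (an assumption for illustration)
df = pd.DataFrame(
    {
        "country": ["Australia", "Australia", "New Zealand", "New Zealand"],
        "year": [2002, 2007, 2002, 2007],
    },
    index=[70, 71, 1102, 1103],
)

# Boolean Series: True where the country name starts with "N"
starts_with_n = df.country.str.startswith("N")

# Use the mask to keep only the matching rows
matches = df[starts_with_n]
print(matches)

# any() answers "is there at least one match?" with a single True/False
has_t = df.country.str.startswith("T").any()
print(has_t)
```

Calling any() on the result collapses the per-row answers into one value, which is handy when we only want to know whether a match exists at all.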

How to Check if an Element Contains a Pattern in Pandas?

Similarly, we can use str.contains to check if a pattern is present in each element of a column in Pandas. We will get a boolean Series.

gapminder_ocean.country.str.contains('New')
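
By default str.contains treats the pattern as a case-sensitive regular expression. The sketch below shows the case, regex and na parameters on a small hand-built Series; the missing value is an assumption added for illustration and is not present in the gapminder data.

```python
import pandas as pd

# Hand-built example; the None entry is added only to illustrate the na parameter
countries = pd.Series(["Australia", "New Zealand", None], name="country")

# case=False ignores upper/lower case,
# regex=False treats the pattern as a literal string,
# na=False makes missing values count as "no match"
has_new = countries.str.contains("new", case=False, regex=False, na=False)
print(has_new)
```

Setting na=False keeps the result purely boolean, so it can be used directly as a row mask.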

How to Split Text of a column in Pandas?

We can use str.split to split a text in a column. To split a column, we use

gapminder_ocean.country.str.split()

With no arguments, str.split splits on whitespace, so we get a list of tokens for each element.

70         [Australia]
71         [Australia]
1102    [New, Zealand]
1103    [New, Zealand]
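
If separate columns are more convenient than lists, str.split accepts expand=True. A minimal sketch on a hand-built Series (the variable names are illustrative):

```python
import pandas as pd

countries = pd.Series(["Australia", "New Zealand"], name="country")

# expand=True returns a DataFrame with one column per token;
# rows with fewer tokens are padded with missing values
parts = countries.str.split(expand=True)
print(parts)

# The .str accessor can also index into each list, e.g. to take the first token
first_word = countries.str.split().str[0]
print(first_word)
```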

How to Find the Length of Each Element of a Column?

We can use str.len to get the length of all the elements in a column.

gapminder_ocean.country.str.len()
70       9
71       9
1102    11
1103    11
Name: country, dtype: int64

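How to Capitalize the First Letter of All Elements in a Column in Pandas?

We can use str.capitalize to capitalize the first letter of each element. Note that it also converts the remaining letters to lower case, so “New Zealand” becomes “New zealand”.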

gapminder_ocean.country.str.capitalize()
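
For multi-word values, str.title may be closer to what we want, since it capitalizes the first letter of every word. A small sketch comparing the two on a hand-built Series:

```python
import pandas as pd

countries = pd.Series(["Australia", "New Zealand"], name="country")

# capitalize: first character upper case, everything else lower case
capitalized = countries.str.capitalize()
print(capitalized.tolist())

# title: first character of every word upper case
titled = countries.str.lower().str.title()
print(titled.tolist())
```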

How to Convert All Elements in a Column to Upper Case in Pandas?

We can use str.upper to convert all the letters of each element in a column to upper case.

gapminder_ocean.country.str.upper()

How to Convert All Elements in a Column to Lower Case in Pandas?

We can use str.lower to convert to lower case.

gapminder_ocean.country.str.lower()
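
Lower-casing is a common cleaning step because it lets differently typed versions of the same value match. The sketch below uses made-up messy values (not from gapminder) and chains str.strip with str.lower:

```python
import pandas as pd

# Made-up messy input for illustration: stray spaces and inconsistent case
raw = pd.Series(["  Australia", "NEW ZEALAND ", "new zealand"], name="country")

# Remove surrounding whitespace, then lower-case everything
cleaned = raw.str.strip().str.lower()
print(cleaned.tolist())

# After cleaning, the two spellings of New Zealand collapse into one value
n_unique = cleaned.nunique()
print(n_unique)
```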

How to Check if All Elements in a Column Are Numeric in Pandas?

We can use str.isnumeric to check whether each element is made up only of numeric characters. We get True for such elements and False otherwise. Since country names contain letters, every value here is False.

gapminder_ocean.country.str.isnumeric()
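
Note that str.isnumeric looks at characters, not at whether the string parses as a number, so a decimal point or a minus sign makes it return False. A minimal sketch with made-up values; pd.to_numeric with errors="coerce" is shown as one possible alternative when signed or decimal numbers should count as numeric:

```python
import pandas as pd

# Made-up string values for illustration
values = pd.Series(["2007", "12.5", "-3", "New Zealand"], name="value")

# True only when every character in the string is a numeric character
is_numeric = values.str.isnumeric()
print(is_numeric.tolist())

# all() answers whether every element in the column is numeric
all_numeric = is_numeric.all()
print(all_numeric)

# Alternative: try to parse each value; unparseable ones become NaN
parses_as_number = pd.to_numeric(values, errors="coerce").notna()
print(parses_as_number.tolist())
```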
