Split a column in Pandas dataframe and get part of it

28 July 2024


When only part of a DataFrame column is needed, we can split the column on a delimiter and keep the piece we want.

We can use the Pandas .str accessor, which provides vectorized string operations on Series and Index objects and returns a new Series or Index rather than a plain string. The accessor has a number of useful methods, and one of them is str.split, which breaks each string into a list of pieces. To get the n^th part of the string, first split the column by the delimiter and then apply .str[n-1] to the object returned, i.e. Dataframe.columnName.str.split(" ").str[n-1].
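As a minimal sketch of these two steps, the snippet below uses a small hypothetical Series (not the data from the examples that follow) and an assumed value of n = 2 to show the intermediate result of the split and the final selection.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
s = pd.Series(['alpha beta gamma', 'one two three'])

# Step 1: split each string on a space; every element becomes a list of pieces.
pieces = s.str.split(' ')

# Step 2: index into each list; the n-th part sits at position n - 1.
n = 2
second = pieces.str[n - 1]

print(second.tolist())
```

Keeping the two steps separate like this is only for readability; chaining them in one expression gives the same Series.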

Let’s make it clear by examples.

Code #1: Print the Series obtained from the split column.

import pandas as pd
import numpy as np
df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

# The DataFrame looks like this (Geek_R holds random numbers):
#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Split each ID on '_' and keep the first part
print(df.Geek_ID.str.split('_').str[0])

Output:

0    Geek1
1    Geek2
2    Geek3
3    Geek4
4    Geek5
dtype: object
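The Series printed above can also be stored back in the DataFrame instead of only being printed. The following is a small sketch of that idea on a shortened copy of the data; the column name Geek_Name is a hypothetical choice made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id']})

# Split on '_' and keep the first piece as a new column.
# 'Geek_Name' is a hypothetical column name, not part of the original data.
df['Geek_Name'] = df['Geek_ID'].str.split('_').str[0]

print(df)
```

The original Geek_ID column is left unchanged, so the full value stays available alongside the extracted part.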

Code #2: Print the first part of each string as a list.

import pandas as pd
import numpy as np
df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

# The DataFrame looks like this (Geek_R holds random numbers):
#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Keep the first part of each ID and convert the Series to a Python list
print(df.Geek_ID.str.split('_').str[0].tolist())

Output:

['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5']

Code #3: Print the second part of each string as a list.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

# The DataFrame looks like this (Geek_R holds random numbers):
#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Keep the second part of each ID (index 1) and convert it to a list
print(df.Geek_ID.str.split('_').str[1].tolist())

Output:

['id', 'id', 'id', 'id', 'id']
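The examples above assume every value contains the delimiter. The sketch below, using a hypothetical Series in which the last value has no underscore, shows what happens when a part is missing, and also shows the expand=True option of str.split, which returns all parts at once as DataFrame columns.

```python
import pandas as pd

# Hypothetical data: the last value has no '_' in it.
ids = pd.Series(['Geek1_id', 'Geek2_id', 'Geek3'])

# Indexing past the end of a split list yields a missing value, not an error.
second = ids.str.split('_').str[1]

# expand=True spreads the pieces across columns 0, 1, ... of a DataFrame.
parts = ids.str.split('_', expand=True)

print(second.isna().tolist())
print(parts[0].tolist())
```

Checking isna() on the result is one way to find the rows where the requested part does not exist before using the values further.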
