Thursday, December 26, 2024

Get the substring of the column in Pandas-Python

In this article we’ll see how to get a substring from every value of a column in a Pandas dataframe. This kind of extraction is often useful when cleaning or reshaping data. For example, if a column holds the first and last names of different people, we can take the first 3 letters of each name to create a username.

Example 1:
We can loop over the row positions of the dataframe and compute the substring for each value in the column.




# importing pandas as pd
import pandas as pd 
  
# creating a dictionary
data = {'Name':["John Smith", "Mark Wellington",
                "Rosie Bates", "Emily Edward"]}
  
# converting the dictionary to a
# dataframe
df = pd.DataFrame.from_dict(data)
  
# storing first 3 letters of name;
# .loc writes to the dataframe itself, whereas
# assigning through df.iloc[i].Name may only change a copy
for i in range(len(df)):
    df.loc[i, 'Name'] = df.loc[i, 'Name'][:3]
  
df


Output:

[Image: dataframe whose Name column holds the first 3 letters of each name]

Note: For more information, refer to Python Extracting Rows Using Pandas.
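The row-by-row idea from Example 1 can also be sketched with df.at, which reads and writes a single cell by label. This is only an illustration, assuming the dataframe keeps its default integer index; the point is that writing through df.at (or df.loc) updates the dataframe itself, whereas assigning to an attribute of df.iloc[i] may only modify a temporary copy of the row.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Emily Edward"]})

# df.at[label, column] accesses exactly one cell of the dataframe
for label in df.index:
    df.at[label, "Name"] = df.at[label, "Name"][:3]

print(df["Name"].tolist())  # ['Joh', 'Mar', 'Ros', 'Emi']
```

For anything beyond a handful of rows, the vectorised str methods shown in the following examples are usually shorter and faster than an explicit loop.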

Example 2: In this example we’ll use str.slice(), which applies the same slice to every value in the column without an explicit loop.




# importing pandas as pd
import pandas as pd 
  
# creating a dictionary (named data to avoid
# shadowing the built-in dict)
data = {'Name':["John Smith", "Mark Wellington",
                "Rosie Bates", "Emily Edward"]}
  
# converting the dictionary to a 
# dataframe
df = pd.DataFrame.from_dict(data)
  
# storing first 3 letters of name as username
df['UserName'] = df['Name'].str.slice(0, 3)
  
df


Output:

[Image: dataframe with the Name column and the new UserName column]
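str.slice() accepts start, stop and step arguments that behave like ordinary Python slicing, so it is not limited to prefixes. The short sketch below uses a standalone Series called names (a name chosen here purely for illustration) to show a prefix, a suffix and a middle section.

```python
import pandas as pd

names = pd.Series(["John Smith", "Mark Wellington",
                   "Rosie Bates", "Emily Edward"])

# first 3 characters, same as value[:3]
first_three = names.str.slice(stop=3)

# last 4 characters, same as value[-4:]
last_four = names.str.slice(start=-4)

# characters at positions 1, 2 and 3, same as value[1:4]
middle = names.str.slice(start=1, stop=4)

print(first_three.tolist())  # ['Joh', 'Mar', 'Ros', 'Emi']
print(last_four.tolist())    # ['mith', 'gton', 'ates', 'ward']
print(middle.tolist())       # ['ohn', 'ark', 'osi', 'mil']
```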

Example 3: The str accessor also supports square-bracket slicing, which is a shorthand for str.slice().




# importing pandas as pd
import pandas as pd 
  
# creating a dictionary
data = {'Name':["John Smith", "Mark Wellington",
                "Rosie Bates", "Emily Edward"]}
  
# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)
  
# storing first 3 letters of name as username
df['UserName'] = df['Name'].str[:3]
  
df


Output:

[Image: dataframe with the Name column and the new UserName column]
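Real data often contains short or missing names, so it helps to know how square-bracket slicing treats them. The sketch below uses made-up sample values ("Al" and a missing entry) and chains str.lower() to produce lowercase usernames; both the sample data and the lowercase convention are assumptions for illustration.

```python
import pandas as pd

names = pd.Series(["John Smith", "Al", None])

# values shorter than the slice are returned whole;
# missing values stay missing instead of raising an error
prefix = names.str[:3]

# str methods can be chained, here to lowercase the prefix
usernames = names.str[:3].str.lower()

print(prefix.iloc[0], prefix.iloc[1], pd.isna(prefix.iloc[2]))
print(usernames.iloc[0], usernames.iloc[1])
```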

Example 4: We can also use str.extract() for this task. It returns the part of each value matched by a capture group in a regular expression. In this example we’ll store the last name of each person in a “LastName” column.




# importing pandas as pd
import pandas as pd 
  
# creating a dictionary
data = {'Name':["John Smith", "Mark Wellington",
                "Rosie Bates", "Emily Edward"]}
  
# converting the dictionary to a dataframe
df = pd.DataFrame.from_dict(data)
  
# storing lastname of each person: the pattern captures
# the final word of each name; expand=False returns a Series
df['LastName'] = df.Name.str.extract(r'\b(\w+)$',
                                     expand=False)
  
df


Output:

[Image: dataframe with the Name column and the new LastName column]
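str.extract() can also return several pieces at once. As a rough sketch, the pattern below uses named capture groups so that the resulting columns are called FirstName and LastName; the group names, the pattern and the single-word sample value "Madonna" are assumptions for illustration. Rows that do not match the pattern get missing values rather than an error.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John Smith", "Mark Wellington",
                            "Rosie Bates", "Madonna"]})

# each named group (?P<name>...) becomes a column in the result
parts = df["Name"].str.extract(r"^(?P<FirstName>\w+)\s+(?P<LastName>\w+)$")

# attach the extracted columns to the original dataframe
df = df.join(parts)

print(df)
```

This pattern only handles names made of exactly two words; names with middle names or hyphens would need a different expression.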
