Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas str.partition() works similarly to str.split(). Instead of splitting the string at every occurrence of the separator/delimiter, it splits the string only at the first occurrence. With str.split(), the separator is not stored anywhere; only the text around it is stored in a new list or data frame. With str.partition(), the separator is stored as well.
The .str accessor has to be prefixed every time before calling this method, to differentiate it from Python's built-in string method; otherwise, it will throw an error.
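The difference between the two methods can be sketched on a small series. The names below are made-up sample values, not rows from the article's CSV, and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical sample data, not taken from the article's CSV
names = pd.Series(["Smith, John, Jr.", "Doe, Jane"])

# str.split() cuts at every ", " and discards the separator
split_result = names.str.split(", ")

# str.partition() cuts only at the first ", " and keeps the separator
partition_result = names.str.partition(", ")

print(split_result)
print(partition_result)

# Without the .str accessor the call fails,
# because a Series itself has no partition attribute
try:
    names.partition(", ")
except AttributeError as err:
    print("AttributeError:", err)
```

In this sketch the first name is split into three pieces by str.split(), while str.partition() leaves "John, Jr." together in its last column.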
Syntax: Series.str.partition(sep=' ', expand=True)
Parameters:
sep: String value, the separator or delimiter to split the string at. Default is ' ' (a single space). Older pandas releases named this parameter pat.
expand: Boolean value. If True, returns a data frame with each part in a separate column. If False, returns a series in which each element is a tuple of strings. Default is True.
Return Type: Series of tuples or data frame, depending on the expand parameter
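The effect of the two parameters can be checked quickly. This is a minimal sketch on made-up strings; the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample strings
s = pd.Series(["alpha beta gamma", "delta"])

# Defaults: split at the first space and expand into a data frame
as_frame = s.str.partition()

# expand=False: keep one series, each element holding the three parts
as_series = s.str.partition(expand=False)

print(type(as_frame).__name__, as_frame.shape)
print(as_series[0])
print(as_series[1])
```

When the separator is not found, as in "delta", the whole string lands in the first part and the other two parts are empty strings.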
To download the CSV used in code, click here.
In the following examples, the data frame used contains data on some employees. An image of the data frame before any operations is attached below.
Example #1: Splitting String into Tuple
In this example, the Name column is split at the first occurrence of ', '. The expand parameter is set to False so that each result is stored as a tuple in a series instead of being expanded into a data frame.
# importing pandas module
import pandas as pd

# making data frame
# placeholder path (not from the original); point it at the downloaded CSV
data = pd.read_csv("employees.csv")

# removing rows that are entirely null, if any, to avoid errors
data.dropna(how="all", inplace=True)

# displaying top 5 rows of data
data.head()

# splitting at ', ' into a series of tuples
data["Name"] = data["Name"].str.partition(", ", expand=False)

# display
data
Output:
As shown in the output image, each value in the Name column was split at the first occurrence of ', '. As can be seen, ', ' is also stored as a separate element of the result.
Note: Do not be confused by the two commas that appear next to each other in the output: one belongs to the ', ' element itself and the other is the separator between elements.
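Since the output image may not be available, the same steps can be reproduced on a tiny stand-in data frame. The rows below are invented for illustration and are not the article's employee data.

```python
import pandas as pd

# Hypothetical stand-in for the employee data; the last row is entirely null
data = pd.DataFrame({"Name": ["Smith, John", "Doe, Jane", None]})

# Drop rows in which every value is null
data.dropna(how="all", inplace=True)

# Split at the first ", " and keep the three parts together in one column
data["Name"] = data["Name"].str.partition(", ", expand=False)

# Inspect the parts of the first row
first = data["Name"].iloc[0]
print(first)
print(len(first))
```

Each entry holds three parts, and the middle part is the ', ' separator itself, which is where the extra comma in the printed output comes from.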
Example #2: Splitting String into Data frame
In this example, the first name and last name are separated from the Name column and stored in separate columns of the data frame.
# importing pandas module
import pandas as pd

# making data frame
# placeholder path (not from the original); point it at the downloaded CSV
data = pd.read_csv("employees.csv")

# removing rows that are entirely null, if any, to avoid errors
data.dropna(how="all", inplace=True)

# displaying top 5 rows of data
data.head()

# splitting at ', ' into a data frame
new = data["Name"].str.partition(", ", expand=True)

# making separate first name column from new data frame
data["First Name"] = new[2]

# making separate last name column from new data frame
data["Last Name"] = new[0]

# dropping old Name column
data.drop(columns=["Name"], inplace=True)

# df display
data
Output:
As shown in the output image, the Name column was separated into a data frame with 3 columns: the string before the comma, the ', ' separator itself, and the string after the comma. That data frame was then used to create new columns in the original data frame. The old Name column was dropped using the .drop() method.
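The column-building steps can also be tried without the CSV. This is a minimal sketch on an invented data frame; the Team column and all values are assumptions added only to show that other columns are left untouched.

```python
import pandas as pd

# Hypothetical stand-in data; "Cher" has no ", " separator
data = pd.DataFrame({
    "Name": ["Smith, John", "Doe, Jane", "Cher"],
    "Team": ["Sales", "Legal", "Marketing"],
})

# Columns 0, 1 and 2 hold the text before, the separator, and the text after
new = data["Name"].str.partition(", ", expand=True)

# Names are assumed to be written as "Last, First"
data["First Name"] = new[2]
data["Last Name"] = new[0]

# Remove the original column now that it has been split
data = data.drop(columns=["Name"])

print(new)
print(data)
```

A name without the separator ends up entirely in the Last Name column with an empty First Name, so real data may need a check for empty strings after the split.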
New Data frame-
Data frame with Added columns-