LEFT ANTI Join is the opposite of semi-join. excluding the intersection, it returns the left table. It only returns the columns from the left table and not the right.
Method 1: Using isin()
On the created dataframes we perform left join and subset using isin() function to check if the part on which the datasets are merged is in the subset of the merged dataset.
Syntax:
DataFrame.isin(values)
Parameters:
- values: iterable, Series, DataFrame or dict
Returns:
DataFrame
Example:
In the below code, we used the indicator to find the rows which are ‘Left_only’ and subset the merged dataset, and assign it to df. finally, we retrieve the part which is only in our first data frame df1. the output is antijoin of the two data frames.
Python3
# importing packages import pandas as pd # anti-join # creating dataframes using pd.DataFrame() method. df1 = pd.DataFrame({ "city" : [ "new york" , "chicago" , "orlando" , 'mumbai' ], "temperature" : [ 21 , 14 , 35 , 30 ], "humidity" : [ 65 , 68 , 75 , 75 ], }) df2 = pd.DataFrame({ "city" : [ "chicago" , "new york" , "orlando" ], "humidity" : [ 67 , 60 , 70 ] }) # carrying out anti join using merge method df3 = df1.merge(df2, on = 'city' , how = 'left' , indicator = True ) df = df3.loc[df3[ '_merge' ] = = 'left_only' , 'city' ] d = df1[df1[ 'city' ].isin(df)] print (d) |
Output:
city temperature humidity 3 mumbai 30 75
Method 2: Using semi join
We can use the ‘~’ operator on the semi-join. It results in anti-join.
Semi-join: Similar to inner join, semi-join returns the intersection but it only returns the columns from the left table and not the right. it has no duplicate values.
Syntax:
[~df1[‘column_name’].isin(df2[‘column_name’])]
where,
- df1 is the first dataframe
- df2 is the second dataframe
- column_name is the matching column in both the dataframes
Example:
In this example, we merge df1 and df2 on ‘city’ by default it is ‘inner join’, after merging, We exclude the part of df1 which is in df3 and print out the resultant dataframe.
Python3
# code import pandas as pd # inverse of semi-join: # creating dataframes using pd.DataFrame() method. df1 = pd.DataFrame({ "city" : [ "new york" , "chicago" , "orlando" , 'mumbai' ], "temperature" : [ 21 , 14 , 35 , 30 ], "humidity" : [ 65 , 68 , 75 , 75 ], }) df2 = pd.DataFrame({ "city" : [ "chicago" , "new york" , "orlando" ], "humidity" : [ 67 , 60 , 70 ] }) # carrying out anti join using merge method df3 = df1.merge(df2, on = 'city' ) df = df1[~df1[ 'city' ].isin(df3[ 'city' ])] print (df) |
Output:
city temperature humidity 3 mumbai 30 75