The in and not in operators can be used to filter a Pandas DataFrame by checking whether the values in a column belong to a given set of values. Inside a query expression, in selects the rows whose column value is present in the set, while not in selects the rows whose column value is absent from it. Note that when Python's in is applied directly to a DataFrame, outside of a query expression, it tests the column labels rather than the cell values.
These operators can be used with the .query() method of a Pandas DataFrame to filter the DataFrame based on a given set of values. The .query() method takes a string containing a Boolean expression as input and returns a new DataFrame containing only the rows that satisfy the given expression.
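Outside of .query(), the plain Python in operator behaves differently: applied to a DataFrame, it tests the column labels, not the cell values. The following is a minimal sketch of that difference; the variable names are only illustrative, and the isin-based check is one of several ways to look for a value.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Plain Python `in` on a DataFrame tests the column labels
label_found = "A" in df       # "A" is a column label
value_as_label = 1 in df      # 1 is a cell value, not a column label

# To look for a cell value, test the values themselves
value_found = bool(df.isin([1]).any().any())

print(label_found, value_as_label, value_found)
```

This is why the filtering examples in this article place the in expression inside the query string, where it is applied to the values of the named column.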
Filter Pandas DataFrame in Python using the ‘in’ keyword
The in keyword has two purposes: to check whether a value is present in a sequence such as a list, tuple, range, or string, and to iterate through a sequence in a for loop.
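Both uses can be seen in plain Python, without Pandas. This is a small sketch with made-up variable names:

```python
values = [1, 2, 3]

# Purpose 1: membership test
has_two = 2 in values
has_five = 5 in values

# Purpose 2: iteration in a for loop
doubled = []
for v in values:
    doubled.append(v * 2)

print(has_two, has_five, doubled)
```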
Example 1
Here is an example of how the in operator can be used with the .query() method to filter a DataFrame:
Python3
import pandas as pd

# Create a DataFrame with some sample data
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Filter the DataFrame to include only rows
# where column "A" has a value of 1 or 2
df = df.query("A in [1, 2]")

# Print the resulting DataFrame
print(df)
Output:
   A  B
0  1  5
1  2  6
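The same filter can also be written with a Python variable or with boolean indexing. The sketch below assumes a hypothetical list named allowed; it refers to that list inside the query string with the @ prefix and compares the result with the equivalent Series.isin() mask.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Hypothetical list of values to keep
allowed = [1, 2]

# `@allowed` refers to the Python variable from inside the query string
via_query = df.query("A in @allowed")

# Equivalent boolean-indexing form using Series.isin()
via_isin = df[df["A"].isin(allowed)]

print(via_query.equals(via_isin))
```

Referring to a variable is convenient when the set of values is built at runtime rather than typed into the query string.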
Example 2
The in operator can also be used in more complex .query() expressions that combine multiple conditions with logical operators such as and and or.
Python3
# Uses the DataFrame df from Example 1

# Filter the DataFrame to include only rows
# where column "A" has a value of 1 or 2, and
# column "B" has a value of 6 or 7
df = df.query("A in [1, 2] and B in [6, 7]")

# Print the resulting DataFrame
print(df)
Output:
   A  B
1  2  6
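The example above combines the conditions with and. For comparison, here is a small sketch of the or case on the same sample data; the choice of values is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Keep rows where column "A" is 1 or 2, or column "B" is 7
either = df.query("A in [1, 2] or B in [7]")

print(either)
```

With or, a row is kept as soon as one of the conditions holds, so the row with B equal to 7 is included even though its A value is 3.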
Filter Pandas DataFrame in Python using the ‘not in’ keyword
Python's not keyword is a logical operator that returns the negation, or opposite Boolean value, of its operand.
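In plain Python, negating a membership test looks like this. The snippet is a short illustration with made-up names:

```python
values = [1, 2]

is_member = 3 in values        # 3 is not in the list
negated = not is_member        # `not` flips the Boolean value

# `not x in seq` and `x not in seq` give the same result
same = (not 3 in values) == (3 not in values)

print(is_member, negated, same)
```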
Example 1
To use the `not in` operator with the .query() method of a Pandas DataFrame, you can simply negate the expression using the not keyword.
Python3
import pandas as pd

# Create a DataFrame with some sample data
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Filter the DataFrame to exclude rows
# where column "A" has a value of 1 or 2
df = df.query("not A in [1, 2]")

# Print the resulting DataFrame
print(df)
Output:
   A  B
2  3  7
3  4  8
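The query string can also spell the operator as not in directly, and the same rows can be selected with an inverted Series.isin() mask. A minimal sketch, assuming a hypothetical list named excluded:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Hypothetical list of values to drop
excluded = [1, 2]

# `not in` written directly inside the query string
via_query = df.query("A not in [1, 2]")

# Equivalent boolean-indexing form: invert the isin() mask with ~
via_mask = df[~df["A"].isin(excluded)]

print(via_query.equals(via_mask))
```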
Example 2
The not keyword can also negate a compound expression. The following example excludes only the rows that satisfy both conditions at once.
Python3
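import pandas as pd

# Recreate the sample DataFrame, since Example 1 overwrote df
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Filter the DataFrame to exclude rows
# where column "A" has a value of 1 or 2, and
# column "B" has a value of 6 or 7
df = df.query("not (A in [1, 2] and B in [6, 7])")

# Print the resulting DataFrame
print(df)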
Output:
   A  B
0  1  5
2  3  7
3  4  8
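By De Morgan's law, negating an and expression is the same as joining the negated conditions with or. The sketch below checks this on the sample data; it is an illustration of the equivalence, not a recommendation of one form over the other.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Negate the whole compound condition
negated = df.query("not (A in [1, 2] and B in [6, 7])")

# De Morgan form: negate each condition and join with `or`
de_morgan = df.query("A not in [1, 2] or B not in [6, 7]")

print(negated.equals(de_morgan))
```

Only the row with A equal to 2 and B equal to 6 satisfies both original conditions, so it is the only row removed in either form.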