Python | Pandas Index.duplicated()

27 July 2024

Python is a great language for data analysis, largely because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

The Pandas Index.duplicated() function returns a boolean array with one entry per element of the Index; it does not remove anything. Duplicated values are indicated as True in the resulting array. Depending on the keep parameter, it flags all occurrences of a duplicated value, all except the first, or all except the last.

Syntax: Index.duplicated(keep='first')

Parameters :
keep : {'first', 'last', False}, default 'first'
Which occurrence in each set of duplicates, if any, to leave unmarked.
-> 'first' : Mark duplicates as True except for the first occurrence.
-> 'last' : Mark duplicates as True except for the last occurrence.
-> False : Mark all duplicates as True.

Returns : numpy.ndarray
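
To make the three keep options concrete, the following minimal sketch applies each of them to the same small Index. The Index contents and the variable names are illustrative assumptions and are not part of the examples in this article.

```python
import pandas as pd

# Illustrative Index: 'a' appears three times, 'b' once
idx = pd.Index(['a', 'b', 'a', 'a'])

# keep='first': the first 'a' is left unmarked, later ones are flagged
first = idx.duplicated(keep='first')

# keep='last': the last 'a' is left unmarked, earlier ones are flagged
last = idx.duplicated(keep='last')

# keep=False: every occurrence of 'a' is flagged
all_dups = idx.duplicated(keep=False)

print(first)     # [False False  True  True]
print(last)      # [ True False  True False]
print(all_dups)  # [ True False  True  True]
```

In every case 'b' stays False because it occurs only once.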

Example #1: Use the Index.duplicated() function to flag every duplicated value in the Index except its first occurrence.

# importing pandas as pd
import pandas as pd

# Creating the Index
idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])

# Print the Index
idx

Output :

Let's check whether each value in the Index is a duplicate or unique.

# Identify the duplicated values except the first
idx.duplicated(keep='first')

Output :

array([False, False,  True, False, False,  True])

As we can see in the output, the Index.duplicated() function has marked every occurrence of a duplicated value as True except the first one. Here that means the second 'Labrador' and the second 'Beagle'.
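
Because the result is a boolean array, it can also be used as a mask. The following is a minimal sketch, assuming the goal is to keep only the first occurrence of each label: inverting the mask with ~ selects the entries that were not flagged. The names mask and unique_first are illustrative.

```python
import pandas as pd

idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])

# True marks the repeated occurrences, as in Example #1
mask = idx.duplicated(keep='first')

# Invert the mask to keep only entries that were not flagged
unique_first = idx[~mask]

print(unique_first.tolist())  # ['Labrador', 'Beagle', 'Lhasa', 'Husky']

# has_duplicates reports whether any value is repeated
print(idx.has_duplicates)     # True
```

For this Index the masked result matches what idx.drop_duplicates() returns, which is usually the more direct call when only the deduplicated Index is needed.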

Example #2: Use the Index.duplicated() function to identify all duplicated values. Here every occurrence of a duplicated value, including the first, is marked as True.

# importing pandas as pd
import pandas as pd

# Creating the Index
idx = pd.Index([100, 50, 45, 100, 12, 50, None])

# Print the Index
idx

Output :

Let’s identify all the duplicated values in the Index.

Note : The Index contains one missing value (the None entry).

# Identify all occurrences of duplicated values
idx.duplicated(keep=False)

Output :

array([ True,  True, False,  True, False,  True, False])

The function has marked every occurrence of the duplicated values 100 and 50 as True. It has also treated the single missing value as unique and marked it False.
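
A single missing value is unique, but repeated missing values behave differently. The sketch below uses an illustrative Index that is not part of the data above. It suggests that pandas counts repeated NaN entries as duplicates of one another, even though NaN does not compare equal to itself in ordinary floating-point comparison.

```python
import numpy as np
import pandas as pd

# Illustrative Index with two NaN entries
idx_nan = pd.Index([1.0, np.nan, 2.0, np.nan])

# The second NaN is flagged as a duplicate of the first
nan_dups = idx_nan.duplicated(keep='first')
print(nan_dups)      # [False False False  True]

# With keep=False both NaN entries are flagged
nan_all = idx_nan.duplicated(keep=False)
print(nan_all)       # [False  True False  True]
```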
