Sunday, December 29, 2024

Python | Pandas Index.duplicated()

Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages, and it makes importing and analyzing data much easier.

The Pandas Index.duplicated() function returns a boolean array that indicates which values in the Index are duplicates. It does not remove anything from the Index. Duplicated values are marked as True in the resulting array. Depending on the keep parameter, the function can mark all occurrences of a duplicated value, all except the first, or all except the last.

Syntax: Index.duplicated(keep='first')

Parameters :
keep : {'first', 'last', False}, default 'first'
Determines which occurrence in a set of duplicates, if any, is left unmarked (False).
-> 'first' : Mark duplicates as True except for the first occurrence.
-> 'last' : Mark duplicates as True except for the last occurrence.
-> False : Mark all duplicates as True.

Returns : numpy.ndarray
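
As a quick illustration of the three keep options, the following sketch applies each one to the same small Index. The sample values and variable names are made up for this illustration and do not come from the examples below.

```python
import pandas as pd

# Hypothetical sample data: 'a' appears three times, 'b' once
idx = pd.Index(['a', 'b', 'a', 'a'])

# Each call returns a boolean array aligned with the Index
first_mask = idx.duplicated(keep='first')   # first 'a' is left unmarked
last_mask = idx.duplicated(keep='last')     # last 'a' is left unmarked
all_mask = idx.duplicated(keep=False)       # every 'a' is marked

print(first_mask)  # [False False  True  True]
print(last_mask)   # [ True False  True False]
print(all_mask)    # [ True False  True  True]
```

Note that 'b' is False in all three cases, since a value that occurs only once is never a duplicate.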

Example #1: Use the Index.duplicated() function to mark all duplicated values in the Index except the first occurrence of each.




# importing pandas as pd
import pandas as pd
  
# Creating the Index
idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])
  
# Print the Index
idx


Output :

Let's check whether each value in the Index is a duplicate or unique.




# Identify the duplicated values except the first
idx.duplicated(keep='first')


Output :

As the output shows, the Index.duplicated() function has marked every repeated occurrence of a value as True, leaving only the first occurrence of each value as False.
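
Because the result is a boolean array, it can be used directly to select from the Index. The snippet below is a minimal sketch of that idea, using the dog-breed labels from this example; the variable names mask, unique_first and repeats are chosen here for illustration only.

```python
import pandas as pd

idx = pd.Index(['Labrador', 'Beagle', 'Labrador',
                'Lhasa', 'Husky', 'Beagle'])

# True marks every repeat after the first occurrence
mask = idx.duplicated(keep='first')

# Inverting the mask keeps only the first occurrence of each label
unique_first = idx[~mask]

# Selecting with the mask itself lists the repeated labels
repeats = idx[mask]

print(list(unique_first))  # ['Labrador', 'Beagle', 'Lhasa', 'Husky']
print(list(repeats))       # ['Labrador', 'Beagle']
```

Selecting with the inverted mask gives the same labels as idx.drop_duplicates(), but keeping the mask around is handy when the same positions need to be filtered in another array as well.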
 
Example #2: Use the Index.duplicated() function to identify all duplicate values. Here, every occurrence of a duplicated value will be marked as True.




# importing pandas as pd
import pandas as pd
  
# Creating the Index
idx = pd.Index([100, 50, 45, 100, 12, 50, None])
  
# Print the Index
idx


Output :

Let’s identify all the duplicated values in the Index.

Note : The Index contains a missing value. The None entry is stored as NaN.




# Identify all duplicated occurrence of values
idx.duplicated(keep=False)


Output :

The function has marked every occurrence of a duplicated value as True. It has also treated the single NaN value as unique and marked it as False.
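
To make the NaN behaviour a little more concrete, the sketch below compares an Index with one NaN against an Index with two. The data and the names single_nan and double_nan are hypothetical. It relies on the assumption that pandas treats missing values as equal to one another when detecting duplicates, which is the usual behaviour for a float Index.

```python
import numpy as np
import pandas as pd

# Hypothetical data: NaN appears once in the first Index, twice in the second
single_nan = pd.Index([100.0, 50.0, 100.0, np.nan])
double_nan = pd.Index([100.0, np.nan, 50.0, np.nan])

# A lone NaN is not a duplicate of anything
print(single_nan.duplicated(keep=False))  # [ True False  True False]

# Two NaN entries are reported as duplicates of each other
print(double_nan.duplicated(keep=False))  # [False  True False  True]
```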
