Friday, December 27, 2024

How to Fix SettingWithCopyWarning in Pandas

A SettingWithCopyWarning can occur when we try to modify data in a Pandas DataFrame. Pandas raises this warning when a single line of code chains a get operation (selecting part of the DataFrame) with a set operation (assigning a value to that selection).

In more detail: Pandas does not guarantee whether the result of a get operation is a view or a copy of the original data. If it is a view, the set operation modifies the original DataFrame. If it is a copy, the set operation modifies only that copy and the original DataFrame remains unchanged. So after such a statement we cannot be sure whether the original DataFrame was actually changed.
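The get and set steps can be made visible by splitting them across two lines. The following is a minimal sketch, assuming a shortened two-row version of the DataFrame used in this article; the variable name `absent` is only illustrative. With a boolean mask, the get step produces a separate object, so the set step never reaches the original.

```python
import warnings

import pandas as pd

# Shortened, illustrative version of the article's DataFrame
marks = pd.DataFrame({
    "Name": ["Akhil", "Sai"],
    "Percentage": ["-", 65],
    "Grade": ["-", "C"],
})

# Get: boolean indexing builds a new, separate DataFrame
absent = marks[marks["Percentage"] == "-"]

# Set: the assignment targets that separate object, not `marks`
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence warnings for this demo only
    absent["Grade"] = "Ab"

print(absent["Grade"].tolist())  # ['Ab']
print(marks["Grade"].tolist())   # ['-', 'C']  (original unchanged)
```

The original DataFrame silently keeps its old values, which is exactly the situation the warning is meant to flag.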

DataFrame

Student Name    Percentage    Grade
Akhil           -             -
Sai             65            C
Rohit           90            O
Prasanth        79            B
Divya           89            A

Code that throws a warning 

Python3




# import necessary packages
import pandas as pd
 
# create a dataframe
marks = pd.DataFrame({'Name': ['Akhil', 'Sai', 'Rohit', 'Prasanth', 'Divya'],
                      'Percentage': ['-', 65, 90, 79, 89],
                      'Grade': ['-', 'C', 'O', 'B', 'A']})
 
# Assign Absent if percentage is not specified
marks[marks.Percentage == '-'].Grade = 'Ab'


Output

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py:5303: SettingWithCopyWarning: 

A value is trying to be set on a copy of a slice from a DataFrame.

Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  self[name] = value

Solution

To solve this problem, avoid chaining two indexing steps. Instead, use the loc method to select the required rows and columns and assign the value in a single operation. When a separate subset is needed, use the copy method to store an explicit copy in another variable, so that the get and set operations are split across two lines.

Example 1:

Use the above DataFrame and select the required rows and columns with the loc method in a single step.

Python3




# import necessary packages
import pandas as pd
 
# create a dataframe
marks = pd.DataFrame({'Name': ['Akhil', 'Sai', 'Rohit', 'Prasanth', 'Divya'],
                      'Percentage': ['-', 65, 90, 79, 89],
                      'Grade': ['-', 'C', 'O', 'B', 'A']})
 
# Assign Absent if percentage is not specified
marks.loc[marks.Percentage == '-', 'Grade'] = 'Ab'
 
# modified content
marks


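Output

The Grade of Akhil is now 'Ab'; all other rows are unchanged.

Explanation: Instead of the chained selection that raised the warning, here we used the loc method, which selects the rows and column and assigns the value in one operation.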

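Note: The loc method does not guarantee warning-free code in every case. For example, the warning can still appear when the DataFrame being modified was itself created by slicing or filtering another DataFrame. In such cases it is advisable to create an explicit copy of the original DataFrame and make modifications to that.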
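The case described in the note can be sketched as follows. This is an illustration rather than a definitive reproduction: the variable names `present` and `present_copy` are hypothetical, and whether the first assignment actually emits a SettingWithCopyWarning depends on the installed Pandas version and its copy-on-write settings, so the recorded warnings are printed without assuming a particular result.

```python
import warnings

import pandas as pd

marks = pd.DataFrame({
    "Name": ["Akhil", "Sai", "Rohit", "Prasanth", "Divya"],
    "Percentage": ["-", 65, 90, 79, 89],
    "Grade": ["-", "C", "O", "B", "A"],
})

# `present` is produced by filtering `marks`, so Pandas may treat it as a
# copy of a slice and warn on assignment, even though .loc is used.
present = marks[marks["Percentage"] != "-"]
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    present.loc[present["Name"] == "Rohit", "Grade"] = "A+"
print([w.category.__name__ for w in caught])  # version dependent

# An explicit copy states the intent, so the same assignment is unambiguous.
present_copy = marks[marks["Percentage"] != "-"].copy()
with warnings.catch_warnings(record=True) as caught_copy:
    warnings.simplefilter("always")
    present_copy.loc[present_copy["Name"] == "Rohit", "Grade"] = "A+"
print([w.category.__name__ for w in caught_copy])

# In both cases the original DataFrame keeps Rohit's grade.
print(marks.loc[marks["Name"] == "Rohit", "Grade"].tolist())  # ['O']
```

Calling copy right where the subset is created keeps later assignments on that subset free of any view-versus-copy ambiguity.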

Example 2:

Create a copy of part of a DataFrame using the loc and copy methods, then make changes to that copy.

Python3




# import necessary packages
import pandas as pd
 
# create a dataframe
marks = pd.DataFrame({'Name': ['Akhil', 'Sai', 'Rohit', 'Prasanth', 'Divya'],
                      'Percentage': ['-', '-', 90, 79, 89],
                      'Grade': ['-', '-', 'O', 'B', 'A']})
 
# create a copy of the rows of the original DataFrame whose
# percentage is missing (absentees)
Absent_Students = marks.loc[marks.Percentage == '-', :].copy()
 
# Make their grade as 'Ab' which indicates absent.
Absent_Students.Grade = 'Ab'
 
# modified content
Absent_Students


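Output

Absent_Students contains only the rows for Akhil and Sai, each with Grade 'Ab'.

Explanation: Absent_Students is an independent copy of the rows of the original DataFrame that belong to absent students. Because of the copy method, our modifications apply only to this copy and the original marks DataFrame is left unchanged, so there is no ambiguity between view and copy.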
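A quick check of this behaviour might look like the sketch below. It reuses the data from Example 2; the lowercase name `absent_students` and the final write-back step are illustrative additions for the case where the original should be updated as well.

```python
import pandas as pd

marks = pd.DataFrame({
    "Name": ["Akhil", "Sai", "Rohit", "Prasanth", "Divya"],
    "Percentage": ["-", "-", 90, 79, 89],
    "Grade": ["-", "-", "O", "B", "A"],
})

# Independent copy of the absent students' rows
absent_students = marks.loc[marks["Percentage"] == "-", :].copy()
absent_students["Grade"] = "Ab"

# The copy carries the new grade...
print(absent_students["Grade"].tolist())  # ['Ab', 'Ab']
# ...while the original rows are untouched.
print(marks["Grade"].tolist())            # ['-', '-', 'O', 'B', 'A']

# If the original should change too, assign through .loc on `marks` itself.
marks.loc[marks["Percentage"] == "-", "Grade"] = "Ab"
print(marks["Grade"].tolist())            # ['Ab', 'Ab', 'O', 'B', 'A']
```

Working on a copy is the right choice when the subset is a separate result; assigning through loc on the original is the right choice when the original itself must be updated.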
