Creating views on Pandas DataFrame | Set – 2

22 July 2024

Prerequisite: Creating views on Pandas DataFrame | Set – 1

While doing data analysis, we often deal with a large data set that has many attributes. Not all of the attributes are equally important, so we usually want to work with only a subset of the columns in the DataFrame. Let's see how we can create views on the DataFrame that include only the columns we need and leave out the rest.

Given a DataFrame containing NBA data, create views on it such that only the desired columns are included.

Note: The code below reads the data from a CSV file named nba.csv.

Solution #1: While reading the data from the CSV file into Python, we can select only those columns that we want to read into the DataFrame.

# importing pandas as pd
import pandas as pd

# list of columns that we want to
# read into the DataFrame
use_cols = ['Name', 'Number', 'College']

# Reading the csv file; the callable is evaluated
# against each column name in the file
df = pd.read_csv('nba.csv', usecols=lambda x: x in use_cols,
                 index_col=False)

# Print the dataframe
print(df)

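Because usecols also accepts a plain list of column names, the lambda is optional. The following is a minimal sketch of that variant. It assumes a small made-up CSV held in memory in place of nba.csv; the rows and the names csv_text and df_subset are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-in for nba.csv; the rows are invented for illustration.
csv_text = """Name,Team,Number,Position,Age,College,Salary
Player A,Team X,0,PG,25,College One,1000000
Player B,Team Y,99,SF,27,College Two,2000000
"""

# list of columns that we want to read into the DataFrame
use_cols = ['Name', 'Number', 'College']

# usecols accepts a list of column names as well as a callable
df_subset = pd.read_csv(io.StringIO(csv_text), usecols=use_cols)

# Print the names of the columns that were read
print(df_subset.columns.tolist())
```

The callable form is still useful when the selection depends on a rule rather than on a fixed list of names.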

Solution #2: While reading the data from the CSV file into Python, we can list the columns that we do not want to read into the DataFrame. This has the same effect as dropping those columns.

# importing pandas as pd
import pandas as pd

# list of columns that we do not want
# to read into the DataFrame
skip_cols = ['Name', 'Number', 'College']

# Reading the csv file; only columns that are
# not in skip_cols are kept
df = pd.read_csv('nba.csv', usecols=lambda x: x not in skip_cols,
                 index_col=False)

# Print the dataframe
print(df)

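The exclusion approach can be wrapped in a small helper so that the skip list is passed as an argument. This is only a sketch under stated assumptions: read_without is a hypothetical name, not a pandas function, and the in-memory CSV stands in for nba.csv with invented rows.

```python
import io
import pandas as pd

# Hypothetical stand-in for nba.csv; the rows are invented for illustration.
csv_text = """Name,Team,Number,Position,Age,College,Salary
Player A,Team X,0,PG,25,College One,1000000
Player B,Team Y,99,SF,27,College Two,2000000
"""


def read_without(source, skip_cols):
    """Read a CSV, leaving out the columns named in skip_cols (hypothetical helper)."""
    skip = set(skip_cols)  # set membership checks are cheap
    # pandas calls the function once per column name found in the file
    return pd.read_csv(source, usecols=lambda name: name not in skip)


df_rest = read_without(io.StringIO(csv_text), ['Name', 'Number', 'College'])

# Print the names of the columns that were kept
print(df_rest.columns.tolist())
```

A name in the skip list that does not occur in the file is simply never matched, so the callable form does not fail on it.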

Solution #3: After reading the whole file, we can use the difference() method of the column index to leave out the columns that we do not need.

# importing pandas as pd 
import pandas as pd 
  
# Reading the csv file 
df = pd.read_csv("nba.csv") 
  
# Print the dataframe 
print(df) 


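Now we will leave out the columns that we do not need by calling difference() on df.columns. By default, difference() returns the remaining labels in sorted order, so the columns of the result appear alphabetically rather than in their original order.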

# Drop the listed columns 
df_view = df[df.columns.difference(['Position', 'Age', 'Salary'])] 
  
# Print the new DataFrame 
print(df_view) 

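The effect of difference() on column order can be checked on a small frame. The sketch below assumes a hypothetical DataFrame that reuses the article's column names with invented rows. It compares the default call with sort=False and with drop(columns=...), which is a different method that also leaves the listed columns out.

```python
import pandas as pd

# Hypothetical frame with the article's column names; the rows are invented.
df = pd.DataFrame({
    'Name': ['Player A', 'Player B'],
    'Team': ['Team X', 'Team Y'],
    'Number': [0, 99],
    'Position': ['PG', 'SF'],
    'Age': [25, 27],
    'College': ['College One', 'College Two'],
    'Salary': [1000000, 2000000],
})

unwanted = ['Position', 'Age', 'Salary']

# Default behaviour: the remaining labels come back sorted
sorted_cols = df.columns.difference(unwanted)

# sort=False keeps the remaining labels in their original order
ordered_cols = df.columns.difference(unwanted, sort=False)

# drop() removes the listed columns and keeps the original order
df_dropped = df.drop(columns=unwanted)

print(sorted_cols.tolist())
print(ordered_cols.tolist())
print(df_dropped.columns.tolist())
```

When the original column order matters, sort=False or drop() is likely the more convenient choice.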
