Friday, December 27, 2024

Pandas Read CSV in Python

CSV stands for comma-separated values. To read data from a CSV file, we use the Pandas function read_csv(), which returns the data as a DataFrame.

Syntax of read_csv() 

Here is the Pandas read CSV syntax with its parameters.

Syntax: pd.read_csv(filepath_or_buffer, sep=',', header='infer', index_col=None, usecols=None, engine=None, skiprows=None, nrows=None)

Parameters: 

  • filepath_or_buffer: Location of the CSV file. It accepts a string path, a URL, or a file-like object.
  • sep: The field separator; the default is ','.
  • header: Row number(s) to use as the column names, given as an int or a list of ints; the data starts after that row. The default, 'infer', uses the first row. With header=None, no row is treated as a header and the columns are labelled 0, 1, 2, and so on.
  • usecols: Reads only the selected columns from the CSV file.
  • nrows: Number of rows to read from the file.
  • index_col: Column(s) to use as the row labels. If None, pandas normally assigns a default integer index (0, 1, 2, ...).
  • skiprows: Line numbers to skip (0-indexed), or the number of lines to skip at the start of the file.
  • engine: Parser engine to use: 'c', 'python', or 'pyarrow'.
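To make the header parameter concrete, here is a minimal sketch. It assumes a small inline string in place of a real file (read through io.StringIO), and it uses the names parameter, which is part of read_csv() but not listed in the syntax above. The column labels are illustrative only.

```python
import io
import pandas as pd

# Hypothetical inline data standing in for a CSV file that has no header row.
raw = "Shelby,Male,1945-10-26\nPhillip,Female,1910-03-24\n"

# header=None: no row is used for column names, so columns are labelled 0, 1, 2.
df_no_header = pd.read_csv(io.StringIO(raw), header=None)
print(list(df_no_header.columns))

# names= supplies the labels explicitly (these names are made up for illustration).
df_named = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=["first_name", "sex", "date_of_birth"],
)
print(df_named)
```

Without header=None, the first record ("Shelby,Male,1945-10-26") would be consumed as the column names.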

Read CSV File using Pandas read_csv

Before using this function, we must import the Pandas library. We then load the CSV file with read_csv().

Python3




# Import pandas
import pandas as pd
 
# reading csv file
df = pd.read_csv("people.csv")
print(df.head())


Output:

   First Name  Last Name     Sex                       Email  Date of birth           Job Title
0      Shelby    Terrell    Male        elijah57@example.net     1945-10-26     Games developer
1     Phillip    Summers  Female       bethany14@example.com     1910-03-24      Phytotherapist
2    Kristine     Travis    Male       bthompson@example.com     1992-07-02           Homeopath
3     Yesenia   Martinez    Male   kaitlinkaiser@example.com     2017-08-03   Market researcher
4        Lori       Todd    Male  buchananmanuel@example.net     1938-12-01  Veterinary surgeon

Using sep in read_csv()

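In this example, the fields in the CSV file are separated by a mix of special characters (colon, comma, space, | and _) to show how the sep parameter works. The separator is given as a regular expression, so engine='python' is passed explicitly, since the default C engine does not support regular-expression separators. Each matching character is treated as its own delimiter, so a comma followed by a space produces an empty field, which is why the output contains "Unnamed" columns and NaN values.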

Python3




# Contents of sample.csv:
# totalbill_tip, sex:smoker, day_time, size
# 16.99, 1.01:Female|No, Sun, Dinner, 2
# 10.34, 1.66, Male, No|Sun:Dinner, 3
# 21.01:3.5_Male, No:Sun, Dinner, 3
# 23.68, 3.31, Male|No, Sun_Dinner, 2
# 24.59:3.61, Female_No, Sun, Dinner, 4
# 25.29, 4.71|Male, No:Sun, Dinner, 4
 
# Importing pandas library
import pandas as pd
 
# Load the data of csv
df = pd.read_csv('sample.csv',
                 sep='[:, |_]',
                 engine='python')
 
# Print the Dataframe
print(df)


Output:

       totalbill   tip  Unnamed: 2   sex  smoker  Unnamed: 5     day    time  Unnamed: 8  size
16.99        NaN  1.01      Female    No     NaN         Sun     NaN  Dinner         NaN     2
10.34        NaN  1.66         NaN  Male     NaN          No     Sun  Dinner         NaN     3
21.01       3.50  Male         NaN    No     Sun         NaN  Dinner     NaN         3.0  None
23.68        NaN  3.31         NaN  Male      No         NaN     Sun  Dinner         NaN     2
24.59       3.61   NaN      Female    No     NaN         Sun     NaN  Dinner         NaN     2
25.29        NaN  4.71        Male   NaN      No         Sun     NaN  Dinner         NaN     4
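For comparison, a cleaner case may help. The sketch below uses hypothetical inline data (not the sample.csv above) to contrast a single-character separator, which works with the default engine, with a small regular-expression separator that needs engine='python'.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited data, read from a string instead of a file.
raw = "total_bill;tip;sex\n16.99;1.01;Female\n10.34;1.66;Male\n"

# A single-character separator works with the default C engine.
df_semicolon = pd.read_csv(io.StringIO(raw), sep=";")
print(df_semicolon)

# Here two different delimiters appear, so sep is a regular expression
# (a character class matching ';' or '|') and the python engine is requested.
raw_mixed = "total_bill;tip|sex\n16.99;1.01|Female\n10.34;1.66|Male\n"
df_regex = pd.read_csv(io.StringIO(raw_mixed), sep="[;|]", engine="python")
print(df_regex)
```

Both calls should yield the same three columns, because every delimiter in the data separates exactly two fields and no empty fields are created.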

Using usecols in read_csv()

Here, we load only three columns, ["First Name", "Sex", "Email"]. Passing header=0 tells pandas to take the column names from the first row of the file, which is also the default behavior.

Python3




df = pd.read_csv('people.csv',
        header=0,
        usecols=["First Name", "Sex", "Email"])
# printing dataframe
print(df.head())


Output:

   First Name     Sex                       Email
0      Shelby    Male        elijah57@example.net
1     Phillip  Female       bethany14@example.com
2    Kristine    Male       bthompson@example.com
3     Yesenia    Male   kaitlinkaiser@example.com
4        Lori    Male  buchananmanuel@example.net
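usecols also accepts column positions or a callable in place of labels. The following is a small sketch that assumes a two-row inline excerpt shaped like people.csv rather than the real file.

```python
import io
import pandas as pd

# Inline excerpt mirroring the layout of people.csv (assumed for illustration).
raw = (
    "First Name,Last Name,Sex,Email\n"
    "Shelby,Terrell,Male,elijah57@example.net\n"
    "Phillip,Summers,Female,bethany14@example.com\n"
)

# Select columns by zero-based position instead of by label.
df_by_position = pd.read_csv(io.StringIO(raw), usecols=[0, 2])
print(df_by_position)

# Select columns with a callable that is evaluated on each column name.
df_by_callable = pd.read_csv(
    io.StringIO(raw),
    usecols=lambda name: name.endswith("Name"),
)
print(df_by_callable)
```

The callable form is convenient when the columns of interest share a naming pattern and the exact labels are not known in advance.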

Using index_col in read_csv()

Here, index_col receives a list, so "Sex" becomes the first index level and "Job Title" the second. The result is a DataFrame with a MultiIndex in place of the default integer index.

Python3




df = pd.read_csv('people.csv',
        header=0,
        index_col=["Sex", "Job Title"],
        usecols=["Sex", "Job Title", "Email"])
 
print(df.head())


Output:

                                                Email
Sex    Job Title
Male   Games developer           elijah57@example.net
Female Phytotherapist           bethany14@example.com
Male   Homeopath                bthompson@example.com
       Market researcher    kaitlinkaiser@example.com
       Veterinary surgeon  buchananmanuel@example.net
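Once several columns form the index, rows can be looked up by those labels. Here is a minimal sketch, again assuming an inline excerpt in place of people.csv.

```python
import io
import pandas as pd

# Inline excerpt with the three columns used above (assumed for illustration).
raw = (
    "Sex,Email,Job Title\n"
    "Male,elijah57@example.net,Games developer\n"
    "Female,bethany14@example.com,Phytotherapist\n"
    "Male,bthompson@example.com,Homeopath\n"
)

df = pd.read_csv(io.StringIO(raw), index_col=["Sex", "Job Title"])

# The index now has two named levels.
print(list(df.index.names))

# Look up a single value by its full (Sex, Job Title) label.
email = df.loc[("Male", "Homeopath"), "Email"]
print(email)

# Select every row whose first index level is "Male".
males = df.loc["Male"]
print(males)
```

Selecting on the first level alone returns the matching rows indexed by the remaining level, "Job Title".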

Using nrows in read_csv()

Here, we read only 3 rows using the nrows parameter.

Python3




df = pd.read_csv('people.csv',
        header=0,
        index_col=["Sex", "Job Title"],
        usecols=["Sex", "Job Title", "Email"],
        nrows=3)
 
print(df)


Output:

                                        Email
Sex    Job Title
Male   Games developer   elijah57@example.net
Female Phytotherapist   bethany14@example.com
Male   Homeopath        bthompson@example.com
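Note that nrows counts data rows, not file lines. The short sketch below uses hypothetical inline data to show that the header line is not part of the count.

```python
import io
import pandas as pd

# Hypothetical data: one header line followed by four records.
raw = "id,value\n1,a\n2,b\n3,c\n4,d\n"

# nrows=2 reads the header plus the first two records.
preview = pd.read_csv(io.StringIO(raw), nrows=2)
print(preview)
print(len(preview))
```

This makes nrows a cheap way to peek at the first few records of a large file before loading all of it.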

Using skiprows in read_csv()

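The skiprows parameter skips the specified lines of the file while reading. A list is read as 0-based line numbers, counting the header as line 0; an integer skips that many lines from the top of the file. The example below prints the dataset before and after skipping rows.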

Python3




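df = pd.read_csv("people.csv")
print("Previous Dataset: ")
print(df)

# skiprows uses 0-based file line numbers, and line 0 is the header,
# so file lines 2 and 6 hold the records shown at index 1 and 5
df = pd.read_csv("people.csv", skiprows=[2, 6])
print("Dataset After skipping rows: ")
print(df)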


Output:

Previous Dataset:
   First Name  Last Name     Sex                       Email  Date of birth           Job Title
0      Shelby    Terrell    Male        elijah57@example.net     1945-10-26     Games developer
1     Phillip    Summers  Female       bethany14@example.com     1910-03-24      Phytotherapist
2    Kristine     Travis    Male       bthompson@example.com     1992-07-02           Homeopath
3     Yesenia   Martinez    Male   kaitlinkaiser@example.com     2017-08-03   Market researcher
4        Lori       Todd    Male  buchananmanuel@example.net     1938-12-01  Veterinary surgeon
5        Erin        Day    Male         tconner@example.org     2015-10-28  Management officer
6   Katherine       Buck  Female     conniecowan@example.com     1989-01-22             Analyst
7     Ricardo     Hinton    Male     wyattbishop@example.com     1924-03-26      Hydrogeologist
Dataset After skipping rows:
   First Name  Last Name     Sex                       Email  Date of birth           Job Title
0      Shelby    Terrell    Male        elijah57@example.net     1945-10-26     Games developer
1    Kristine     Travis    Male       bthompson@example.com     1992-07-02           Homeopath
2     Yesenia   Martinez    Male   kaitlinkaiser@example.com     2017-08-03   Market researcher
3        Lori       Todd    Male  buchananmanuel@example.net     1938-12-01  Veterinary surgeon
4   Katherine       Buck  Female     conniecowan@example.com     1989-01-22             Analyst
5     Ricardo     Hinton    Male     wyattbishop@example.com     1924-03-26      Hydrogeologist
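Because skiprows counts file lines rather than DataFrame index values, a tiny self-contained check can be useful. The sketch below assumes hypothetical inline data and also shows the callable form of skiprows.

```python
import io
import pandas as pd

# Hypothetical data: file line 0 is the header, lines 1-4 are records A-D.
raw = "name,score\nA,1\nB,2\nC,3\nD,4\n"

# A list skips those 0-based file lines: line 1 is record A, line 3 is record C.
by_list = pd.read_csv(io.StringIO(raw), skiprows=[1, 3])
print(by_list["name"].tolist())  # ['B', 'D']

# A callable receives each line number and returns True for lines to skip.
# Here every even-numbered line after the header (lines 2 and 4) is skipped.
by_callable = pd.read_csv(
    io.StringIO(raw),
    skiprows=lambda line_no: line_no > 0 and line_no % 2 == 0,
)
print(by_callable["name"].tolist())  # ['A', 'C']
```

In both cases the header on line 0 is kept, so the column names survive.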
