How to read a CSV file to a Dataframe with custom delimiter in Pandas?

28 July 2024

Python is well suited to data analysis, largely because of its ecosystem of data-centric packages. pandas is one of them, and it makes importing and analyzing data much easier.
Here, we discuss how to load a CSV file into a DataFrame. This is done with the pandas.read_csv() function, so the pandas library has to be imported first.

Syntax: pd.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, escapechar=None, comment=None, encoding=None, dialect=None, tupleize_cols=None, error_bad_lines=True, warn_bad_lines=True, skipfooter=0, doublequote=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
This signature comes from an older pandas release. Several of these parameters (for example squeeze, prefix, mangle_dupe_cols, error_bad_lines and warn_bad_lines) were removed in pandas 2.0, so check the documentation for the version you have installed.

Some useful parameters are given below:

Parameter	Use
filepath_or_buffer	Path or URL of the file, or a file-like object
sep	Separator between fields; the default is ',' (as in comma-separated values)
index_col	Column(s) to use as the row index instead of the default 0, 1, 2, 3, ...
header	Row number(s) (int or list of int) to use as the column names
usecols	Subset of columns (for example a list of column names) to load into the DataFrame
squeeze	If True and only one column is parsed, return a pandas Series (removed in pandas 2.0)
skiprows	Line numbers to skip (list) or number of lines to skip (int) at the start of the file
skipfooter	Number of lines to skip at the bottom of the file (requires engine='python')
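
To make the table above concrete, here is a minimal sketch that combines skiprows, skipfooter, usecols and index_col in one call. The in-memory text, the column names (id, name, score, city) and the values are made up for illustration; io.StringIO simply stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical file content: one junk line at the top, a header row,
# three data rows, and one footer line at the bottom.
raw = (
    "exported by some tool\n"
    "id,name,score,city\n"
    "1,Asha,81.5,Pune\n"
    "2,Ben,74.0,Leeds\n"
    "3,Chen,90.25,Osaka\n"
    "end of file\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    skiprows=1,                       # skip the junk line before the header
    skipfooter=1,                     # drop the trailing footer line
    engine="python",                  # skipfooter is not supported by the C engine
    usecols=["id", "name", "score"],  # load only these columns
    index_col="id",                   # use the id column as the row index
)

print(df)
```

The header parameter is left at its default, so the first line that is not skipped supplies the column names.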

This function uses a comma (',') as the default delimiter, but we can also pass a custom delimiter or a regular expression as the separator.
Example 1: Using the read_csv() function with the default separator, i.e. a comma (,)

Python3

# Importing pandas library
import pandas as pd
 
# Load the data of example1.csv
# into a DataFrame df
df = pd.read_csv('example1.csv')

# Print the DataFrame
print(df)

Example 2: Using the read_csv() function with '_' as a custom delimiter.

Python3

# Importing pandas library
import pandas as pd
 
# Load the data of example2.csv
# with '_' as custom delimiter
# into a DataFrame df
df = pd.read_csv('example2.csv',
                 sep='_',
                 engine='python')

# Print the DataFrame
print(df)

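Note: A single-character separator such as '_' or '\t' works with the default C engine, so engine='python' is optional in Examples 2 and 3. It is needed only when the separator is longer than one character or is a regular expression (other than '\s+'). Without it, pandas falls back to the Python engine and emits a ParserWarning.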
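
The following sketch shows when that warning appears. It reads the same made-up text twice with a two-character separator ('::'), once without naming an engine and once with engine='python', and records any warnings. The sample data and variable names are assumptions for illustration only.

```python
import io
import warnings

import pandas as pd

# Made-up sample that uses a two-character separator.
data = "a::b::c\n1::2::3\n4::5::6\n"

# Without engine='python', pandas is expected to fall back to the
# Python engine and emit a ParserWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df_fallback = pd.read_csv(io.StringIO(data), sep="::")

saw_parser_warning = any(
    issubclass(w.category, pd.errors.ParserWarning) for w in caught
)
print("ParserWarning without engine:", saw_parser_warning)

# Naming the engine explicitly should give the same DataFrame
# without the ParserWarning.
with warnings.catch_warnings(record=True) as caught_explicit:
    warnings.simplefilter("always")
    df_explicit = pd.read_csv(io.StringIO(data), sep="::", engine="python")

explicit_parser_warning = any(
    issubclass(w.category, pd.errors.ParserWarning) for w in caught_explicit
)
print("ParserWarning with engine='python':", explicit_parser_warning)
```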

Example 3: Using the read_csv() function with a tab as a custom delimiter.

Python3

# Importing pandas library
import pandas as pd
 
# Load the data of example3.csv
# with tab as custom delimiter
# into a DataFrame df
df = pd.read_csv('example3.csv',
                 sep='\t',
                 engine='python')

# Print the DataFrame
print(df)

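
Because the example files are not bundled with this article, the sketch below builds a small sample in memory and reads it once with '_' and once with a tab as the separator. The rows, the column names and the make_text helper are hypothetical. No engine argument is passed, which is fine for single-character separators.

```python
import io

import pandas as pd

# Made-up rows; the first row is the header.
rows = [
    ["name", "age", "city"],
    ["Asha", "31", "Pune"],
    ["Ben", "27", "Leeds"],
]


def make_text(delim):
    """Join the sample rows into delimited text using the given delimiter."""
    return "\n".join(delim.join(row) for row in rows) + "\n"


# Same data, two different single-character delimiters.
df_underscore = pd.read_csv(io.StringIO(make_text("_")), sep="_")
df_tab = pd.read_csv(io.StringIO(make_text("\t")), sep="\t")

print(df_tab)
print("Same result for both delimiters:", df_underscore.equals(df_tab))
```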

Example 4: Using the read_csv() function with a regular expression as a custom delimiter.
Let's suppose we have a CSV file with multiple types of delimiters, as shown below.

totalbill_tip, sex:smoker, day_time, size
16.99, 1.01:Female|No, Sun, Dinner, 2
10.34, 1.66, Male, No|Sun:Dinner, 3
21.01:3.5_Male, No:Sun, Dinner, 3
23.68, 3.31, Male|No, Sun_Dinner, 2
24.59:3.61, Female_No, Sun, Dinner, 4
25.29, 4.71|Male, No:Sun, Dinner, 4

To load such a file into a DataFrame, we use a regular expression as the separator.

Python3

# Importing pandas library
import pandas as pd
 
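# Load the data of example4.csv,
# using a regular expression as
# the custom delimiter, into a
# DataFrame df. The pattern matches
# any one of : , | _ together with
# any whitespace around it.
df = pd.read_csv('example4.csv',
                 sep=r'\s*[:,|_]\s*',
                 engine='python')

# Print the DataFrame
print(df)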

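
As a self-contained check, the sketch below feeds the sample lines shown above to read_csv() from memory instead of from example4.csv. The pattern used here, \s*[:,|_]\s*, is one possible choice: it treats any one of ':', ',', '|' or '_' as a delimiter and also absorbs the spaces that follow the commas in the sample, so no empty columns are produced.

```python
import io

import pandas as pd

# The sample lines from the article, held in memory.
sample = (
    "totalbill_tip, sex:smoker, day_time, size\n"
    "16.99, 1.01:Female|No, Sun, Dinner, 2\n"
    "10.34, 1.66, Male, No|Sun:Dinner, 3\n"
    "21.01:3.5_Male, No:Sun, Dinner, 3\n"
    "23.68, 3.31, Male|No, Sun_Dinner, 2\n"
    "24.59:3.61, Female_No, Sun, Dinner, 4\n"
    "25.29, 4.71|Male, No:Sun, Dinner, 4\n"
)

# Any one of : , | _ is a delimiter; surrounding whitespace is
# consumed as part of the match.
df = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s*[:,|_]\s*",
    engine="python",  # regular-expression separators need the Python engine
)

print(df.columns.tolist())
print(df)
```

Note that the header line is split by the same pattern, so 'totalbill_tip' becomes the two column names 'totalbill' and 'tip'.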
