CSV stands for Comma-Separated Values. CSV files are plain text files that store tabular data, typically with one record per line and commas (,) as delimiters between fields. Files in this format conventionally use the .csv extension. It is a popular format for storing structured data and is widely used for datasets in machine learning and statistical modeling. Python's standard library includes the csv module for reading and writing CSV files. The examples in this article use a file named data.csv whose first row holds the column names Column1, Column2, and Column3.
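A small data.csv for trying out the methods in this article can be created with a short script. The sketch below is only an illustration: the header names match the output shown in this article, while the data rows and the write_sample_csv helper are made up for the example.

```python
import csv

# Hypothetical sample rows; only the header names come from this article's output.
rows = [
    ["Column1", "Column2", "Column3"],
    ["1", "2", "3"],
    ["4", "5", "6"],
]


def write_sample_csv(path="data.csv"):
    """Write the sample rows to a CSV file (illustrative helper)."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="") as csv_file:
        csv.writer(csv_file).writerows(rows)


write_sample_csv()

# Read the raw text back to see what the file looks like on disk
with open("data.csv", newline="") as csv_file:
    sample_text = csv_file.read()

print(sample_text)
```

Each line of the file is one row, and the first line is the header that the methods below extract.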
This article covers different ways to get the column names from a CSV file using Python. The following approaches can be used:
- Using Python’s csv library to read the CSV file line by line and printing the header row as the names of the columns
- Reading the CSV file as a dictionary using DictReader and then printing out the keys of the dictionary
- Converting the CSV file to a data frame using the pandas library and reading its columns
Method 1:
Using this approach, we read the CSV file with Python's csv library and output the first row, which holds the column names.
Python3
# importing the csv library
import csv

# opening the csv file by specifying
# the location
# with the variable name as csv_file
with open('data.csv') as csv_file:

    # creating an object of csv reader
    # with the delimiter as ,
    csv_reader = csv.reader(csv_file, delimiter=',')

    # list to store the names of columns
    list_of_column_names = []

    # loop to iterate through the rows of csv
    for row in csv_reader:

        # adding the first row
        list_of_column_names.append(row)

        # breaking the loop after the
        # first iteration itself
        break

# printing the result
print("List of column names : ", list_of_column_names[0])
Output:
List of column names : ['Column1', 'Column2', 'Column3']
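The loop-and-break pattern in Method 1 can also be written with the built-in next(), which pulls a single row from the reader. The following is a minimal sketch, not part of the original method: the read_header helper is a hypothetical name, and io.StringIO stands in for an opened data.csv so that the snippet runs on its own.

```python
import csv
import io


def read_header(csv_file):
    """Return the first row of an open CSV file, or None if it is empty."""
    csv_reader = csv.reader(csv_file, delimiter=",")
    # next() consumes only the first row; the None default avoids
    # a StopIteration error when the input has no rows at all
    return next(csv_reader, None)


# Stand-in for open('data.csv', newline=''); the data row is made up.
sample = io.StringIO("Column1,Column2,Column3\r\n1,2,3\r\n")
print("List of column names : ", read_header(sample))
```

With a real file, the same function can be called inside a with open('data.csv', newline='') as csv_file: block.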
Method 2:
In the second approach, we use the DictReader class of the csv library, which reads each row of the CSV file as a dictionary keyed by the column names. Calling the keys() method on a row then gives the column names. Note that this needs at least one data row: on a file that contains only the header, list(csv_reader)[0] raises an IndexError.
Steps:
- Open the CSV file and pass it to DictReader.
- Convert the reader into a list of rows.
- Convert the first row of the list to a dictionary.
- Call the keys() method of the dictionary and convert the result into a list.
- Display the list.
Python3
# importing the csv library
import csv

# opening the csv file
with open('data.csv') as csv_file:

    # reading the csv file using DictReader
    csv_reader = csv.DictReader(csv_file)

    # converting the file to dictionary
    # by first converting to list
    # and then converting the list to dict
    dict_from_csv = dict(list(csv_reader)[0])

    # making a list from the keys of the dict
    list_of_column_names = list(dict_from_csv.keys())

    # displaying the list of column names
    print("List of column names : ", list_of_column_names)
Output:
List of column names : ['Column1', 'Column2', 'Column3']
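DictReader also exposes the header directly through its fieldnames attribute, so the column names can be read without loading every row into a list. The sketch below is an alternative to the steps in Method 2 rather than the same technique: the column_names helper is a hypothetical name, and io.StringIO stands in for an opened data.csv.

```python
import csv
import io


def column_names(csv_file):
    """Return the header of an open CSV file using DictReader.fieldnames."""
    csv_reader = csv.DictReader(csv_file)
    # fieldnames reads just the header row on first access;
    # it is None when the file is completely empty
    return list(csv_reader.fieldnames or [])


# A header-only input, which the list(csv_reader)[0] approach cannot handle
header_only = io.StringIO("Column1,Column2,Column3\r\n")
print("List of column names : ", column_names(header_only))
```

This variant reads only the first line of the input, which can matter for large files.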
Method 3:
In this approach, we read the CSV file into a data frame using the pandas library. Then we access the columns attribute of the data frame, which holds the column names.
Python3
# importing the pandas library
import pandas as pd

# reading the csv file using read_csv
# storing the data frame in variable called df
df = pd.read_csv('data.csv')

# creating a list of column names by
# calling the .columns
list_of_column_names = list(df.columns)

# displaying the list of column names
print('List of column names : ', list_of_column_names)
Output:
List of column names : ['Column1', 'Column2', 'Column3']
The data frame looks as follows:
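As a rough illustration, the sketch below builds a data frame from in-memory CSV text and prints it. It assumes pandas is installed. The data rows are made up, and only the column names come from this article. It also shows that passing nrows=0 to read_csv parses the header without loading any data rows, which may help when only the column names are needed from a large file.

```python
import io

import pandas as pd

# Made-up sample content; io.StringIO stands in for the path 'data.csv'
csv_text = "Column1,Column2,Column3\n1,2,3\n4,5,6\n"

# Full read: every row is loaded into the data frame
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Header-only read: nrows=0 yields an empty data frame that still has the columns
header_only = pd.read_csv(io.StringIO(csv_text), nrows=0)
print("List of column names : ", list(header_only.columns))
```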