Get values of all rows in a particular column in openpyxl – Python

27 July 2024

0

In this article, we will explore how to get the values of all rows in a particular column in a spreadsheet using openpyxl in Python. We will start by discussing the basics of openpyxl and how to install and import it. Then, we will walk through for example, how to extract the values of a particular column from a spreadsheet and store them in a list.

To install openpyxl using pip, open a terminal window or command prompt and enter the following command:

pip install openpyxl

Raw Spreadsheet Data

For example, suppose you have a spreadsheet with the following data:

Name	Age	Gender
John	30	Male
Jane	25	Female
Bob	35	Male
Alice	28	Female

Extracting the Values of a Particular Column

Now that you know how to read a spreadsheet and access the rows and cells in a sheet, you can extract the values of a particular column from the sheet. To do this, you can simply iterate over the rows in the sheet and check the value of the cell in the column you are interested in. You can then store the values in a list or other data structure for further processing.

Example 1

To extract the values of the “Name” column, you could use the following code:

Python3

# Import openpyxl 
import openpyxl 
  
# Open the spreadsheet 
workbook = openpyxl.load_workbook("data.xlsx") 
  
# Get the first sheet 
sheet = workbook.worksheets[0] 
  
# Create a list to store the values 
names = [] 
  
# Iterate over the rows in the sheet 
for row in sheet: 
    # Get the value of the first cell 
    # in the row (the "Name" cell) 
    name = row[0].value 
    # Add the value to the list 
    names.append(name) 
  
# Print the list of names 
print(names) 

Output:

['Name', 'John', 'Jane', 'Bob', 'Alice']

Example 2

Note that the first row in the sheet contains the column names, so it is included in the list. If you want to exclude the column names from the list, you can add a check to skip the first row:

Python3

# Import openpyxl 
import openpyxl 
  
# Open the spreadsheet 
workbook = openpyxl.load_workbook("data.xlsx") 
  
# Get the first sheet 
sheet = workbook.worksheets[0] 
  
# Create a list to store the values 
names = [] 
  
# Iterate through rows 
for i, row in enumerate(sheet): 
    # Skip the first row (the row with the column names) 
    if i == 0: 
        continue
    # Get the value of the first cell in the row 
    name = row[0].value 
    # Add the value to the list 
    names.append(name) 
  
# Print the list of names 
print(names) 

Output:

['John', 'Jane', 'Bob', 'Alice']

Extracting the Values of a Particular Column Name

In the previous example, we extracted the values of a particular column by specifying the index of the column. However, in some cases, it may be more convenient to extract the values of a column by its name, rather than its index. To do this, you can use the iter_cols() method of the Sheet object to iterate over the columns in the sheet, and check the value of the first cell in each column to determine the column name.

Example 1

To extract the values of the “Name” column by its name, you could use the following code:

Python3

# Import openpyxl 
import openpyxl 
  
# Open the spreadsheet 
workbook = openpyxl.load_workbook("data.xlsx") 
  
# Get the first sheet 
sheet = workbook.worksheets[0] 
  
# Create a list to store the values 
names = [] 
  
# Iterate through columns 
for column in sheet.iter_cols(): 
    # Get the value of the first cell in the 
    # column (the cell with the column name) 
    column_name = column[0].value 
    # Check if the column is the "Name" column 
    if column_name == "Age": 
        # Iterate over the cells in the column 
        for cell in column: 
            # Add the value of the cell to the list 
            names.append(cell.value) 
  
# Print the list of names 
print(names)

Output:

['Age', 30.0, 25.0, 35.0, 28.0]

Example 2

To exclude the column name from the list, you can add a check to skip the first cell:

Python3

# Import openpyxl 
import openpyxl 
  
# Open the spreadsheet 
workbook = openpyxl.load_workbook("data.xlsx") 
  
# Get the first sheet 
sheet = workbook.sheets[0] 
  
# Create a list to store the values 
names = [] 
  
# Iterate over the columns in the sheet 
for column in sheet.iter_cols(): 
    # Get the value of the first cell in the column  
    # (the cell with the column name) 
    column_name = column[0].value 
    # Check if the column is the "Name" column 
    if column_name == "Age": 
        # Iterate over the cells in the column 
        for i, cell in enumerate(column): 
            # Skip the first cell (the cell with the column name) 
            if i == 0: 
                continue
            # Add the value of the cell to the list 
            names.append(cell.value) 
  
# Print the list of names 
print(names) 

Output:

[30.0, 25.0, 35.0, 28.0]

Get values of all rows in a particular column in openpyxl – Python

Raw Spreadsheet Data

Extracting the Values of a Particular Column

Python3

Python3

Extracting the Values of a Particular Column Name

Python3

Python3

Java Program for Longest Common Subsequence

Maximum height of Tree when any Node can be considered as Root

Print Fibonacci sequence using 2 variables

LEAVE A REPLY Cancel reply

Most Popular

One UI 7: Everything you need to know

Review: The Ulefone Armor Mini 20T Pro makes other rugged phones seem flimsy

Best midrange Android phones in 2024

I tried a Xiaomi mid-ranger for the first time in years, and I’m glad the Pixel 8a exists in the US

Recent Comments

EDITOR PICKS

One UI 7: Everything you need to know

Review: The Ulefone Armor Mini 20T Pro makes other rugged phones seem flimsy

Best midrange Android phones in 2024

POPULAR POSTS

One UI 7: Everything you need to know

Review: The Ulefone Armor Mini 20T Pro makes other rugged phones seem flimsy

Best midrange Android phones in 2024

POPULAR CATEGORY

ABOUT US

FOLLOW US