Saturday, September 6, 2025
HomeLanguagesHow to select the rows of a dataframe using the indices of...

How to select the rows of a dataframe using the indices of another dataframe?

Prerequisites: 

Using Pandas module it is possible to select rows from a data frame using indices from another data frame. This article discusses that in detail. It is advised to implement all the codes in jupyter notebook for easy implementation. 

Approach:

  • Import module
  • Create first data frame. In the example given below choice(), randint() and random() all belonging to random module are used to generate a data frame.

1) choice() – choice() is an inbuilt function in Python programming language that returns a random item from a list, tuple, or string.

Syntax: random.choice(sequence)

Parameters: Sequence is a mandatory parameter that can be a list, tuple, or string.

Returns:  The choice() returns a random item.

2) randint()- This function is used to generate random numbers

Syntax : randint(start, end)

Parameters :

(start, end) : Both of them must be integer type values.

Returns :

A random integer in range [start, end] including the end points.

3) random()- Used to generate floating numbers between 0 and 1.

  • Create another data frame using the random() function and randomly selecting the rows of the first dataset.
  • Now we will use dataframe.loc[] function to select the row values of the first data frame using the indexes of the second data frame. Pandas DataFrame.loc[] attribute access a group of rows and columns by label(s) or a boolean array in the given DataFrame.

Syntax: DataFrame.loc

Parameter : None

Returns : Scalar, Series, DataFrame

  • Display selected rows

Implementation using the above concept is given below:

Program:

Python3




# Importing Required Libraries
import pandas as pd
import random
 
# Creating data for main dataframe
col1 = [random.randint(1, 9) for i in range(15)]
col2 = [random.random() for i in range(15)]
col3 = [random.choice(['Geeks', 'of', 'Computer', 'Science'])
        for i in range(15)]
col4 = [random.randint(1, 9) for i in range(15)]
col5 = [random.randint(1, 9) for i in range(15)]
 
# Defining Column name for main dataframe
data_generated = {
    'value1': col1,
    'value2': col2,
    'value3': col3,
    'value4': col4,
    'value5': col5
}
 
# Creating the dataframe using DataFrame() function
print("First data frame")
dataframe = pd.DataFrame(data_generated)
display(dataframe)
 
# Creating a second dataframe that will be the subset of main dataframe
print("Second data frame")
dataframe_second = dataframe[['value1', 'value2', 'value3']].sample(n=4)
display(dataframe_second)
 
# Rows of a dataframe using the indices of another dataframe
print("selecting rows of first dataframe using second dataframe")
display(dataframe.loc[dataframe_second.index])


Output:

Previous article
Next article
RELATED ARTICLES

Most Popular

Dominic
32270 POSTS0 COMMENTS
Milvus
82 POSTS0 COMMENTS
Nango Kala
6639 POSTS0 COMMENTS
Nicole Veronica
11804 POSTS0 COMMENTS
Nokonwaba Nkukhwana
11869 POSTS0 COMMENTS
Shaida Kate Naidoo
6753 POSTS0 COMMENTS
Ted Musemwa
7029 POSTS0 COMMENTS
Thapelo Manthata
6705 POSTS0 COMMENTS
Umr Jansen
6721 POSTS0 COMMENTS