Python – Random Sample Training and Test Data from dictionary

26 July 2024


When working with machine learning algorithms, we often need to split a dataset randomly into training and testing data. This is a very common task in machine learning workflows. This article discusses an approach to do this for data stored in a dictionary, without using external libraries.

Method : Using keys() + the random module + computations

This problem can be solved by combining the above. We extract the keys with keys(), pick a random subset of them for the test set using the random module, and assign the remaining keys to the training set. The number of test keys is computed from the requested percentage, and the two dictionaries are then built from the selected keys.

Python3

# Python3 code to demonstrate working of
# Random Sample Training and Test Data
# Using keys() + random.sample() + computations
import random

# initializing dictionary
test_dict = {'gfg' : 4, 'is' : 12, 'best' : 6, 'for' : 7, 'Lazyroar' : 10}

# printing original dictionary
print("The original dictionary is : " + str(test_dict))

# initializing ratio (percentages; training is whatever is left over)
test = 40
training = 100 - test

# Random Sample Training and Test Data
# Using keys() + random.sample() + computations
key_list = list(test_dict.keys())

# integer arithmetic avoids floating-point rounding surprises
test_key_count = len(key_list) * test // 100

# random.sample() picks distinct keys, so the test set always has
# exactly test_key_count entries (random.choice() could repeat a key)
test_keys = random.sample(key_list, test_key_count)
train_keys = [ele for ele in key_list if ele not in test_keys]

testing_dict = {key: test_dict[key] for key in test_keys}
training_dict = {key: test_dict[key] for key in train_keys}

# printing result
print("The testing dictionary is : " + str(testing_dict))
print("The training dictionary is : " + str(training_dict))

Output (one possible run; the split is random):

The original dictionary is : {'gfg': 4, 'is': 12, 'best': 6, 'for': 7, 'Lazyroar': 10}
The testing dictionary is : {'is': 12, 'for': 7}
The training dictionary is : {'gfg': 4, 'best': 6, 'Lazyroar': 10}
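
Since the result differs between runs, checking that the two dictionaries form a valid split can be more useful than comparing against a fixed output. A minimal sketch, assuming a hypothetical helper named `is_valid_split` (not part of the original code):

```python
def is_valid_split(original, testing, training):
    """Return True if testing and training partition the original dictionary."""
    # no key may appear in both parts
    disjoint = not (testing.keys() & training.keys())
    # together the parts must reproduce every original key/value pair
    complete = {**testing, **training} == original
    return disjoint and complete


original = {'gfg': 4, 'is': 12, 'best': 6, 'for': 7, 'Lazyroar': 10}
testing = {'is': 12, 'for': 7}
training = {'gfg': 4, 'best': 6, 'Lazyroar': 10}
print(is_valid_split(original, testing, training))  # True
```

The check fails if a key is missing from both parts or shows up in both, which are the two ways a random split can go wrong.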

Time Complexity: O(n*n) in the worst case, where n is the number of keys in test_dict, since each key is checked against the list of test keys
Auxiliary Space: O(n), where n is the number of keys in test_dict, for the key list and the two resulting dictionaries
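
The quadratic term comes from checking every key against a list of test keys. As a rough sketch of an alternative, the keys can be shuffled once and then sliced, which takes linear time. The function name `split_dict` and its `test_percent` and `seed` parameters are illustrative assumptions, not part of the original code.

```python
import random


def split_dict(data, test_percent, seed=None):
    """Split data into (testing_dict, training_dict) by shuffling keys once."""
    # hypothetical helper; seed is optional and only makes runs repeatable
    rng = random.Random(seed)
    keys = list(data)
    rng.shuffle(keys)

    # the first test_count shuffled keys form the test set, the rest train
    test_count = len(keys) * test_percent // 100
    testing = {key: data[key] for key in keys[:test_count]}
    training = {key: data[key] for key in keys[test_count:]}
    return testing, training


test_dict = {'gfg': 4, 'is': 12, 'best': 6, 'for': 7, 'Lazyroar': 10}
testing_dict, training_dict = split_dict(test_dict, 40, seed=1)
print("The testing dictionary is : " + str(testing_dict))
print("The training dictionary is : " + str(training_dict))
```

Passing the same seed reproduces the same split, which helps when results need to be compared across runs. Leaving it as None gives a different split each time.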
