Saturday, September 21, 2024

Create PySpark dataframe from dictionary

In this article, we discuss how to create a PySpark dataframe from a dictionary using the spark.createDataFrame() method. The method takes the data as its first argument and, optionally, a schema (for example, a list of column names) as its second. In the examples here, the data is a list of dictionaries, and the column names are taken from the dictionary keys.
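The two-argument form described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names dicts_to_rows and build_dataframe are hypothetical, and the sketch assumes every dictionary uses the same keys. The dictionaries are first converted to tuples in a fixed order so that the list of column names passed as the second argument lines up with the values.

```python
# Sample data in the same shape as the examples in this article.
data = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'}]

# Desired column order (an assumption made for this sketch).
columns = ['student_id', 'name', 'address']


def dicts_to_rows(records, columns):
    """Hypothetical helper: turn each dictionary into a tuple ordered by `columns`."""
    return [tuple(record.get(col) for col in columns) for record in records]


def build_dataframe(records, columns):
    """Hypothetical helper: create a dataframe from tuples plus a list of column names."""
    # Imported here so that dicts_to_rows can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName('sparkdf').getOrCreate()
    rows = dicts_to_rows(records, columns)
    # First argument: the data; second argument: the column names.
    return spark.createDataFrame(rows, columns)


# Pure-Python step: the rows that would be handed to createDataFrame().
rows = dicts_to_rows(data, columns)
```

With a Spark installation available, build_dataframe(data, columns).show() would display the dataframe with the columns in the order given by columns.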

Example 1: Python code to create a dictionary of student address details and convert it to a dataframe

Python3




# importing module
import pyspark
  
# importing sparksession from 
# pyspark.sql module
from pyspark.sql import SparkSession
  
# creating sparksession and giving 
# an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
  
# student data as a list containing one dictionary
data = [{'student_id': 12, 'name': 'sravan',
         'address': 'kakumanu'}]
  
# creating a dataframe
dataframe = spark.createDataFrame(data)
  
# show data frame
dataframe.show()


Output:

Example 2: Create three dictionaries and pass them to createDataFrame() as a list

Python3




# importing module
import pyspark
  
# importing sparksession from 
# pyspark.sql module
from pyspark.sql import SparkSession
  
# creating sparksession and giving 
# an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
  
# student data as a list of three dictionaries
data = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'},
        {'student_id': 14, 'name': 'jyothika', 'address': 'tenali'},
        {'student_id': 11, 'name': 'deepika', 'address': 'repalle'}]
  
# creating a dataframe
dataframe = spark.createDataFrame(data)
  
# show data frame
dataframe.show()


Output:

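Both examples let Spark infer the column types from the dictionary values. When the types should be fixed up front, a schema can be passed as the second argument instead. The sketch below is one possible approach under stated assumptions: the helper names to_ddl and build_typed_dataframe are hypothetical, the type names chosen for each field are assumptions about the data, and the schema is written as a DDL-style string, which createDataFrame() accepts in recent PySpark versions.

```python
# Same three dictionaries as in Example 2.
data = [{'student_id': 12, 'name': 'sravan', 'address': 'kakumanu'},
        {'student_id': 14, 'name': 'jyothika', 'address': 'tenali'},
        {'student_id': 11, 'name': 'deepika', 'address': 'repalle'}]

# (column name, Spark SQL type) pairs; the types are assumptions for this sketch.
fields = [('student_id', 'bigint'), ('name', 'string'), ('address', 'string')]


def to_ddl(fields):
    """Hypothetical helper: join (name, type) pairs into a DDL-style schema string."""
    return ', '.join(f'{name} {dtype}' for name, dtype in fields)


def build_typed_dataframe(records, schema):
    """Hypothetical helper: create a dataframe from dictionaries with an explicit schema."""
    # Imported here so that to_ddl can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName('sparkdf').getOrCreate()
    # The schema fixes both the column order and the column types.
    return spark.createDataFrame(records, schema=schema)


# Pure-Python step: the schema string that would be passed to createDataFrame().
schema = to_ddl(fields)
```

With Spark available, build_typed_dataframe(data, schema).printSchema() would show the declared types rather than inferred ones.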